Fahad Ali Khan

My First Open Source Contribution

Adding a Token Usage Feature to Tailor4Job

As part of my DPS909 - Topics in Open Source Development course, I had the opportunity to contribute to another student's open-source project for the first time. This experience was both exciting and challenging, as I had never worked on someone else's codebase before. In this blog post, I'll walk through my contribution process, what I learned along the way, and my thoughts on collaborating in an open-source environment.


The Project: Tailor4Job

The project I contributed to is called Tailor4Job, a command-line tool designed to help candidates tailor their resumes and cover letters to specific job descriptions. The tool uses large language models (LLMs) to analyze resumes and cover letters and provide feedback on how they align with job descriptions. My task was to add a feature that displays token usage information.

The Feature: Adding Token Usage Info

When using language models, it’s important to understand how many tokens are being sent in the prompt and how many are returned in the response. This helps manage costs and ensures that you don’t exceed the model's token limit.

My task was to add a command-line flag --token-usage (or -t) that would display this token information. Specifically, the tool needed to output:

  • The number of tokens in the prompt.
  • The number of tokens in the completion (response).
  • The total tokens used (sum of prompt and completion tokens).

Implementation Process

1. Forking the Repository and Setting Up My Environment

The first step was to fork the project repository on GitHub and clone it locally. I created a new branch (issue-5) to work on this feature and avoid making changes directly to the main branch.

git checkout -b issue-5

After setting up my environment and installing the necessary dependencies, I started reviewing the codebase to get familiar with how the project was organized.

2. Understanding the Existing Code

The first challenge was understanding how the project processed files and interacted with the LLM API (Groq in this case). I focused on how input files were read, how the LLM API was called, and where the tool output its results.

I found that the response from the LLM API already included token usage information, but it wasn’t being used. My job was to extract this data and display it when the user provided the --token-usage flag.

3. Adding the --token-usage Flag

I added the --token-usage flag using Python's click library, which was already being used to handle command-line arguments. The flag was simple to implement, but I needed to ensure that the token information was printed to stderr to separate it from the regular output.

Here’s the code snippet for adding the --token-usage flag:

@click.option('--token-usage', '-t', is_flag=True, help='Show token usage information.')
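For context, here's a minimal, self-contained sketch of how such a flag wires into a click command. The command body and messages here are illustrative stand-ins, not Tailor4Job's actual code:

```python
import click

@click.command()
@click.option('--token-usage', '-t', is_flag=True,
              help='Show token usage information.')
def main(token_usage):
    # Regular results go to stdout.
    click.echo("Processing completed successfully")
    # Diagnostics like token counts go to stderr, keeping stdout clean.
    if token_usage:
        click.echo("Token usage reporting enabled", err=True)
```

With a guarded `main()` call at the bottom of the script, this runs as `python main.py --token-usage` (or `-t`), and click generates the `--help` entry automatically from the option's `help` text.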

4. Extracting Token Usage Information

The next step was to extract the token usage data from the LLM response. The response object had a usage attribute that included prompt_tokens, completion_tokens, and total_tokens. Here’s how I accessed and printed that information:

if token_usage and hasattr(response, 'usage'):
    # CompletionUsage exposes counts as attributes, not dictionary keys
    usage = response.usage
    click.echo(f"Prompt Tokens: {usage.prompt_tokens}", err=True)
    click.echo(f"Completion Tokens: {usage.completion_tokens}", err=True)
    click.echo(f"Total Tokens: {usage.total_tokens}", err=True)

This code ensures that the token usage details are only displayed when the --token-usage flag is set.

5. Testing the Feature

After adding the feature, I tested it with various input files. Here’s an example of the command I used to test the tool:

python main.py --model llama3-8b-8192 --output tailored_resume.docx --analysis-mode detailed --token-usage GENERAL_RESUME.docx General_Cover_Letter.docx job_description.txt

The output looked like this:

Processing completed successfully
Prompt Tokens: 2150
Completion Tokens: 497
Total Tokens: 2647
Output saved to tailored_resume.docx

Everything worked as expected! The tool correctly displayed the token usage and saved the tailored resume.


Challenges I Faced

The main challenge I faced was understanding the structure of the LLM API response. At first, I tried to access the token data like a dictionary, but I quickly ran into an error ('CompletionUsage' object is not subscriptable). To solve this, I printed the entire response to inspect how the usage attribute was structured and adjusted my code accordingly.
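The error can be reproduced in miniature with a mock object (a `SimpleNamespace` here; the real object is Groq's `CompletionUsage`), since both carry plain attributes and no `__getitem__`:

```python
from types import SimpleNamespace

# Mock of the usage object: plain attributes, no dictionary interface.
usage = SimpleNamespace(prompt_tokens=2150, completion_tokens=497,
                        total_tokens=2647)

try:
    usage['prompt_tokens']      # dictionary-style access: what I tried first
except TypeError:
    pass                        # "... object is not subscriptable"

prompt = usage.prompt_tokens    # attribute access works
```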

Another challenge was ensuring that the token usage information was printed to stderr rather than the standard output (stdout). This required some learning, as I hadn’t worked much with error streams before.
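The point of the stdout/stderr split can be sketched with a tiny helper (the function and messages here are hypothetical, just to show the two streams): because the token counts go to stderr, a user can redirect the tool's real output to a file and still see the diagnostics in the terminal.

```python
import sys

def emit(result_line, token_line):
    print(result_line)                    # stdout: part of the tool's output
    print(token_line, file=sys.stderr)    # stderr: diagnostic only

# Redirecting stdout (python main.py ... > out.txt) captures only the
# first line; the token counts remain visible on screen.
emit("Output saved to tailored_resume.docx", "Total Tokens: 2647")
```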


Learning Outcomes

This project taught me a lot about working with external APIs and integrating new features into existing codebases. Specifically, I learned:

  1. Working with External APIs: I gained a deeper understanding of how to interact with LLM APIs and how they structure their responses, which often include more data than just the generated text.
  2. Handling Errors and Debugging: Debugging API responses and handling errors was a valuable learning experience. I now feel more confident in troubleshooting similar issues in the future.
  3. Collaborating in Open Source: This was my first time contributing to someone else’s project, and I learned how important it is to maintain coding style consistency and not break the original functionality. It’s a skill I’ll continue to develop as I contribute more to open-source projects.

Collaboration

The collaboration process was smooth. I reached out to the project owner to inform them of the feature I was adding. They were helpful and provided feedback when needed. Working on someone else’s project was a unique experience—it felt different from working on my own projects because I had to respect the existing structure and style.


Future Plans for Tailor4Job

While the --token-usage feature was useful, I see more potential for Tailor4Job. One idea I had while working on the project is to improve the tool’s user interface. Right now, it’s a command-line tool, but adding a simple web interface could make it more accessible to a broader audience.

Another idea is to allow more customization in the analysis. For example, the user could specify which sections of the resume or cover letter to focus on, or the tool could provide more granular feedback on specific skills mentioned in the job description.


Final Thoughts

Contributing to an open-source project for the first time was a rewarding experience. I gained valuable technical skills and a better understanding of how open-source projects work. It was satisfying to see my changes improving someone else’s project, and I look forward to contributing more in the future.

If you’re new to open source, I highly recommend giving it a try—it’s a fantastic way to learn and collaborate with other developers.
