This week I added tests to my project codeshift.
uday-rana / codeshift
A command-line tool that translates source code files into a chosen programming language.
codeshift
Codeshift is a command-line tool to translate and transform source code files between programming languages.
Features
- Select output language to convert source code into
- Support for multiple input files
- Output results to a file or stream directly to stdout
- Customize model and provider selection for optimal performance
- Supports leading AI providers
Requirements
- Node.js 20.17.0 or newer
- An API key from any of the following providers:
- OpenAI
- OpenRouter
- Groq
- any other AI provider compatible with OpenAI's chat completions API endpoint
Installation
- Clone the repository with Git:
  git clone https://github.com/uday-rana/codeshift.git
  Alternatively, download the repository as a .zip from the GitHub page and extract it.
- In the repository's root directory (where package.json is located), run npm install:
  cd codeshift/
  npm install
- To be able to run the program without prefixing node, run npm install -g . or npm link within the project directory:
  npm install -g .
…
To write the tests, I chose Jest: it's the most popular testing framework for JavaScript and a mature technology, which means there is plenty of great documentation, lots of examples, and a large ecosystem around it.
An alternative I considered was Vitest, because the last time I used Jest I had to figure out how to deal with TypeScript and ES modules. My friend Vinh wanted to set up Jest on his project too; his project uses ES module syntax, so he ran into trouble since Jest's support for it is still experimental. But since my project uses CommonJS syntax and Jest is still much more widely used, I decided to stick with it.
Setting up Jest
I installed Jest with npm:
npm i -D jest
I configured Jest globals in ESLint:

// ...
import globals from "globals";

export default [
  // ...
  {
    languageOptions: {
      globals: {
        ...globals.node,
        ...globals.jest,
      },
    },
  },
  // ...
];
I installed @types/jest for VS Code IntelliSense:

npm i -D @types/jest
I created a file called jest.config.js and set the verbose option to true. This makes it so that Jest reports on each individual test during the run.

/** @type {import('jest').Config} */
const config = {
  verbose: true,
};

module.exports = config;
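If the test script in package.json isn't already pointing at Jest, wiring it up lets the whole suite run with npm test (only the scripts section is shown here; the rest of the file is omitted):

{
  "scripts": {
    "test": "jest"
  }
}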
Testing LLM functionality
My program uses the OpenAI client to interface with various LLM providers, including OpenRouter, Groq, and OpenAI itself. To test this functionality, we were encouraged to use an HTTP mocking library like Nock, but it made more sense to me to mock the OpenAI client in Jest using jest.mock. In order to do this, I had to move the initialization for the OpenAI client to a separate file so that the instance could be imported and mocked in tests.
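Roughly, it ends up looking like the sketch below. The file names, paths, environment variable names, and the getCompletion function are illustrative stand-ins rather than the actual code in codeshift:

// openai-client.js: a single shared client instance that tests can mock
const OpenAI = require("openai");

module.exports = new OpenAI({
  baseURL: process.env.BASE_URL,
  apiKey: process.env.API_KEY,
});

// getCompletion.js: builds the prompt and requests a conversion (simplified sketch)
const openai = require("./openai-client");

module.exports = async function getCompletion(sourceCode, targetLanguage) {
  return openai.chat.completions.create({
    model: process.env.MODEL || "placeholder-default-model",
    messages: [
      { role: "system", content: `Convert the user's code to ${targetLanguage}.` },
      { role: "user", content: sourceCode },
    ],
  });
};

// getCompletion.test.js: swap the real client for a mock via a module factory
jest.mock("./openai-client", () => ({
  chat: { completions: { create: jest.fn() } },
}));
const openai = require("./openai-client");
const getCompletion = require("./getCompletion");

test("Should request a chat completion for the prompt", async () => {
  openai.chat.completions.create.mockResolvedValue({
    choices: [{ message: { content: "print('hello')" } }],
  });

  const completion = await getCompletion("console.log('hello')", "python");

  expect(openai.chat.completions.create).toHaveBeenCalledTimes(1);
  expect(completion.choices[0].message.content).toBe("print('hello')");
});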
Learning from tests
While writing my tests, I ended up learning a few things about how my own code worked. For example, I was passing multi-line template literals as my prompt to the LLM, and when testing the prompt building function, I learned that all of the indentation and newlines were being passed to the LLM. After a bit of research I found that newlines can be escaped with a \ like in a Unix shell, but as for the indentation, there's not much of a choice except to remove all indentation from the literal.
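Here's a quick illustration of both behaviors (the prompt text is made up, not the actual codeshift prompt):

// Every newline and leading space inside a template literal ends up in the string
const indented = `
    Convert the following code.
    Preserve its behavior.
`;
console.log(JSON.stringify(indented));
// "\n    Convert the following code.\n    Preserve its behavior.\n"

// A trailing backslash acts as a line continuation, so no "\n" is inserted,
// but any indentation on the next line would still be kept
const continued = `Convert the following code. \
Preserve its behavior.`;
console.log(JSON.stringify(continued));
// "Convert the following code. Preserve its behavior."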
I was using node:fs.stat() to check whether a config file existed before parsing it with node:fs.readFile(), and it turned out this was redundant because both functions throw the same error if the file doesn't exist.
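The before-and-after looks roughly like this, using the promise-based fs API (the function names and the fall-back-to-defaults behavior are illustrative):

const fs = require("node:fs/promises");

// Before: stat() only to check existence, then readFile() right after
async function loadConfigRedundant(configPath) {
  await fs.stat(configPath); // rejects with ENOENT if the file is missing...
  return fs.readFile(configPath, "utf8"); // ...and so does readFile()
}

// After: just read the file and handle the single ENOENT in one place
async function loadConfig(configPath) {
  try {
    return await fs.readFile(configPath, "utf8");
  } catch (err) {
    if (err.code === "ENOENT") return null; // no config file; caller uses defaults
    throw err;
  }
}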
I have a module that selects a default model based on a provider base URL specified in an environment variable. While writing tests for it, I was confused by the logic I had written. After thinking about it from the perspective of possible test scenarios, I was able to simplify the logic a fair bit, which also made it much easier to understand.
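The simplified shape is roughly the following. The base URL checks and model names are placeholders, not the actual defaults codeshift ships with:

// chooseDefaultModel: pick a sensible default model for the configured provider
function chooseDefaultModel(baseURL = process.env.BASE_URL || "") {
  if (baseURL.includes("openrouter")) return "placeholder-openrouter-model";
  if (baseURL.includes("groq")) return "placeholder-groq-model";
  return "placeholder-openai-model"; // anything else is treated as OpenAI-compatible
}

module.exports = chooseDefaultModel;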
Also, when trying to set values on process.env before each test in order to test the model selection module, I noticed that values on process.env that were set to undefined or null would evaluate as truthy. I wasn't sure why at first, but I got around it by deleting the values before each test.
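As far as I can tell, the reason is that Node coerces anything assigned to process.env into a string, so undefined becomes the truthy string "undefined". A small illustration (BASE_URL is just an example variable name):

test("Values assigned to process.env are stringified", () => {
  process.env.BASE_URL = undefined;
  expect(process.env.BASE_URL).toBe("undefined"); // a truthy, non-empty string
  expect(Boolean(process.env.BASE_URL)).toBe(true);

  // Deleting the key actually unsets it
  delete process.env.BASE_URL;
  expect(process.env.BASE_URL).toBeUndefined();
});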
I was worried that testing streamed responses from the LLM would be difficult, so without even attempting it, I made non-streamed responses the default and added a flag to request a streamed response. Testing streamed responses turned out to be fine, though: for most tests I could use plain arrays in place of streams, and to simulate a failure while reading the stream I used an async generator function.
test("Should throw if error occurs reading response stream", async () => {
const errorCompletion = (async function* () {
yield new Error("Stream error");
})();
const exitSpy = jest.spyOn(process, "exit").mockImplementation();
await writeOutput(errorCompletion, "output.txt", true, true);
expect(exitSpy).toHaveBeenCalledWith(23);
exitSpy.mockRestore();
});
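For the happy-path tests, the trick is simply that for await...of accepts any iterable, so an array of chunk-shaped objects can stand in for the stream. A minimal illustration (the chunk shape mirrors OpenAI's streaming deltas, and the content is made up):

test("Can consume an array as if it were a response stream", async () => {
  const fakeStream = [
    { choices: [{ delta: { content: "Hello, " } }] },
    { choices: [{ delta: { content: "world" } }] },
  ];

  let output = "";
  // for await...of iterates synchronous iterables like arrays just fine
  for await (const chunk of fakeStream) {
    output += chunk.choices[0]?.delta?.content ?? "";
  }

  expect(output).toBe("Hello, world");
});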
Writing tests for the function that handles writing output also helped me catch another edge case I'd missed while hastily making streamed responses optional: handling token usage for non-streamed responses.
if (streamResponse) {
  await processCompletionStream(
    completion,
    outputFilePath,
    tokenUsageRequested,
    tokenUsage,
  );
} else {
  // Forgot this part until I realized while writing my tests!
  const {
    prompt_tokens = 0,
    completion_tokens = 0,
    total_tokens = 0,
  } = completion?.usage || {};
  tokenUsage = { prompt_tokens, completion_tokens, total_tokens };

  if (outputFilePath) {
    await fs.writeFile(outputFilePath, completion.choices[0].message.content);
  } else {
    process.stdout.write(completion.choices[0].message.content);
  }
}
Conclusion
Even though I've done testing before in other courses and was already familiar with Jest, I'd never read the docs thoroughly or used the mock functionality (working with servers in the past, I'd used superagent), so I learned a lot working on these tests. I think mocking and setup/teardown are incredibly useful features to have when writing tests.
I find testing invaluable for making sure no regressions sneak into the codebase, especially when working on a large project or in a team, and it can save tons of time. For my own projects, I like to practice test-driven development, and I intend to continue doing so in the future.
That's it for this post. Thanks for reading!