Uday R

Refactoring codeshift

This week I spent some time refactoring my project codeshift. I've been meaning to do this for a while, but I held back from working on it on my own time so that I'd still have something to do when it came up for an assignment.

uday-rana / codeshift

A command-line tool that translates source code files into a chosen programming language.

codeshift

Codeshift is a command-line tool to translate and transform source code files between programming languages.

[Image: codeshift tool demo, translating an Express.js server to Rust]

Features

  • Select output language to convert source code into
  • Support for multiple input files
  • Output results to a file or stream directly to stdout
  • Customize model and provider selection for optimal performance
  • Supports leading AI providers

Requirements

  • Node.js 20.17.0 or later
  • An API key from any of the following providers:
    • OpenAI
    • OpenRouter
    • Groq
    • any other AI provider compatible with OpenAI's chat completions API endpoint

Installation

  • Run npm install -g @uday-rana/codeshift.

  • Run npx codeshift. This will generate a .codeshift.config.toml file in your current directory.

  • In .codeshift.config.toml, set the base URL for your preferred provider and add your API key. It should look something like this:

    # .codeshift.config.toml
    [settings]
    baseUrl="https://openrouter.ai/api/v1"
    apiKey="sk-or-v1-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    outputFile="xxxxxxxxxxx"
    model=""
    …

The first thing I wanted to do was split my gigantic index.js file into smaller modules. I'd been wanting to do this for some time because it was getting hard to find the different sections of code in a single source file. It took a while because I had to test my program after splitting off each section, but once I was done I felt like a huge weight had been lifted off my shoulders - my program was so much easier to understand. I can't overstate how much of a difference this made. Splitting the program into separate logical components lets you find and work on the part of the logic you need without having to worry about the rest of the program.

I also took the opportunity to clean up the logic for assigning a default model, and added a default provider for when the user doesn't provide a base URL. (I just realized that would only work if they have an API key for that specific provider, so I think I'm going to roll that change back.)
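Here's a minimal sketch of what that kind of default handling could look like. The function name, the `DEFAULTS` object, and the placeholder model name are all hypothetical and not taken from codeshift; the base URL is just the one from the example config above. It also shows the catch mentioned above: falling back to a default base URL does nothing for the API key, which still has to belong to that provider.

```javascript
// Hypothetical defaults for illustration; codeshift's real values may differ.
const DEFAULTS = {
  baseUrl: "https://openrouter.ai/api/v1",
  model: "example/default-model", // placeholder, not a real model name
};

// Return a copy of the settings with empty or missing values filled in.
function resolveSettings(settings = {}) {
  // Treat empty strings (such as model="" in the TOML file) as unset.
  const pick = (value, fallback) => (value ? value : fallback);

  return {
    ...settings,
    baseUrl: pick(settings.baseUrl, DEFAULTS.baseUrl),
    model: pick(settings.model, DEFAULTS.model),
    // Note: apiKey is passed through untouched. A default baseUrl only helps
    // if the user's key actually belongs to that provider.
  };
}

// Example: the user left model empty and did not set a base URL.
const resolved = resolveSettings({ apiKey: "sk-or-v1-xxxx", model: "" });
console.log(resolved.baseUrl, resolved.model);
```

Keeping the fallback logic in one small function means there is a single place to change (or remove) a default later.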

Another big chunk of my effort went towards the completion output logic - this part of the project had a lot of duplicate code. I had separate loops for different conditions because I thought checking a condition inside the loop would be inefficient, but that made the code a mess to read and maintain. A big part of efficient code is human efficiency - how easy it is to work with. A few extra CPU cycles spent on conditional checks in a loop won't slow a computer down much, but having to work with 4 separate for loops that all do nearly the same thing will definitely slow down a human. I decided I'd rather prioritize maintainability, so I coalesced them into a single loop and extracted it into a function.

Before:

// Ugly code warning!
      let completion;
      try {
        // Send request to AI provider
        completion = await getAIChatStream(prompt, model);
      } catch (error) {
        console.error(`error getting response from provider: ${error}`);
        process.exit(22);
      }
      let promptTokensUsed = 0;
      let completionTokensUsed = 0;
      let totalTokensUsed = 0;
      try {
        // Write to either output file or stdout
        if (outputFilePath) {
          let response = "";
          // Read response stream chunk by chunk
          for await (const chunk of completion) {
            // Concatenate chunk to response
            response += chunk.choices[0]?.delta?.content || "";
            if (chunk?.usage) {
              promptTokensUsed = chunk.usage.prompt_tokens;
              completionTokensUsed = chunk.usage.completion_tokens;
              totalTokensUsed = chunk.usage.total_tokens;
            }
            if (chunk?.x_groq?.usage) {
              promptTokensUsed = chunk.x_groq.usage.prompt_tokens;
              completionTokensUsed = chunk.x_groq.usage.completion_tokens;
              totalTokensUsed = chunk.x_groq.usage.total_tokens;
            }
          }
          fs.writeFile(outputFilePath, `${response}`);
        } else {
          // Read response stream chunk by chunk
          for await (const chunk of completion) {
            // Write chunk to stdout
            process.stdout.write(chunk.choices[0]?.delta?.content || "");
            if (chunk?.usage) {
              promptTokensUsed = chunk.usage.prompt_tokens;
              completionTokensUsed = chunk.usage.completion_tokens;
              totalTokensUsed = chunk.usage.total_tokens;
            }
            if (chunk?.x_groq?.usage) {
              promptTokensUsed = chunk.x_groq.usage.prompt_tokens;
              completionTokensUsed = chunk.x_groq.usage.completion_tokens;
              totalTokensUsed = chunk.x_groq.usage.total_tokens;
            }
          }
          process.stdout.write("\n");
        }
      } catch (error) {
        console.error(`error reading response stream: ${error}`);
        process.exit(23);
      }
      if (tokenUsageRequested) {
        if (
          promptTokensUsed == 0 &&
          completionTokensUsed == 0 &&
          totalTokensUsed == 0
        ) {
          console.error(`\n No Token Usage returned by model.`);
        }
        console.error(
          `\nToken Usage Report:\n`,
          `Prompt tokens: ${promptTokensUsed}\n`,
          `Completion tokens: ${completionTokensUsed}\n`,
          `Total tokens: ${totalTokensUsed}`
        );
      }
    });

After:

// Recently learned this is called dependency injection!
  const writeFunction = outputFilePath
    ? async (completionChunk) =>
        await fs.appendFile(
          outputFilePath,
          completionChunk.choices[0]?.delta?.content || "",
        )
    : (completionChunk) => {
        process.stdout.write(completionChunk.choices[0]?.delta?.content || "");
      };

  try {
    for await (const chunk of completion) {
      await writeFunction(chunk);

      if (tokenUsageRequested) {
        const usage = chunk?.x_groq?.usage ?? chunk?.usage;

        if (usage) {
          tokenUsage.prompt_tokens += usage.prompt_tokens || 0;
          tokenUsage.completion_tokens += usage.completion_tokens || 0;
          tokenUsage.total_tokens += usage.total_tokens || 0;
        }
      }
    }

    if (outputFilePath) {
      await fs.appendFile(outputFilePath, "\n");
    } else {
      process.stdout.write("\n");
    }
  } catch (error) {
    console.error(`error reading response stream: ${error}`);
    process.exit(23);
  }
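One nice side effect of passing the write function in is that the loop can be exercised without a real provider or a real file. The sketch below is only an illustration of that idea, not codeshift's actual code: `streamCompletion`, `fakeCompletion`, and `demo` are hypothetical names, and the fake stream just imitates the chunk shape used in the snippets above.

```javascript
// Hypothetical helper: consumes a stream of chunks and hands each piece of
// text to whatever writer was injected (file, stdout, or a test double).
async function streamCompletion(completion, writeText) {
  const tokenUsage = { prompt_tokens: 0, completion_tokens: 0, total_tokens: 0 };

  for await (const chunk of completion) {
    await writeText(chunk.choices?.[0]?.delta?.content || "");

    // Usage may arrive under x_groq.usage or under usage, as in the code above.
    const usage = chunk?.x_groq?.usage ?? chunk?.usage;
    if (usage) {
      tokenUsage.prompt_tokens += usage.prompt_tokens || 0;
      tokenUsage.completion_tokens += usage.completion_tokens || 0;
      tokenUsage.total_tokens += usage.total_tokens || 0;
    }
  }

  return tokenUsage;
}

// Fake stream standing in for the provider response (assumed chunk shape).
async function* fakeCompletion() {
  yield { choices: [{ delta: { content: "fn main" } }] };
  yield { choices: [{ delta: { content: "() {}" } }] };
  yield {
    choices: [],
    usage: { prompt_tokens: 10, completion_tokens: 4, total_tokens: 14 },
  };
}

// Inject a writer that collects text in memory instead of touching disk.
async function demo() {
  const written = [];
  const usage = await streamCompletion(fakeCompletion(), (text) => {
    written.push(text);
  });
  return { output: written.join(""), usage };
}

demo().then(({ output, usage }) => console.log(output, usage));
```

In the real program the injected writer would be the file-appending or stdout version; in a test it can be anything that accepts a string.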

At one point during refactoring I broke my program by moving the program variable definition into another file and importing it into my start file. I didn't look into it too much, but the error said program.action() (the method used to run the program) was undefined, so I assume I made a mistake when exporting. Either way, it wasn't a lot of logic, so I was fine leaving it in the start file.

After refactoring my code I was asked to squash all of my commits together. I've been squashing commits for a little while now, so I knew what to expect, what to do, and especially what not to do. It went pretty smoothly - I squashed my commits, rebased my refactoring branch on main, and merged it into main, which led to a clean fast-forward merge (and a giant commit message).

[Image: the giant commit message]

I think having this level of control over the Git history is awesome. It lets you clean up your commits and makes the history so much easier to understand. And what's incredible about Git is that even if you royally screw up, it acts as a safety net: once your work is committed you almost never lose it, so you can play around with rebasing and squashing and get used to how they work without having to worry.