Jonas Brømsø

Posted on Mar 4

Baseline a parallel Git Universe/Concept for Easy Searching

#git #codebase #search #navigation

This is a concept I've been thinking about for a while. It's a way to make it easier to search for things across Git repositories. The idea is to create a parallel universe of the Git repository where the files are stored in a way that makes it easier to search for things.

The goal is to have a kind of baseline.

The term baseline means:

1 a minimum or starting point used for comparisons
2 a minimum level or value on a scale
3 an imaginary line or standard by which things are measured or compared
4 a set of data used as a basis for comparison or for calculation

So I decided to call this my Git repository baseline.

The Problem

My working repositories have a lot of unfinished business. They are in a variety of states: I am working in a branch, or the files are not even staged, and this results in conflicts and problems.

A Solution

A solution is to keep an entirely separate directory structure for GitHub repositories, which are in a more stable state.

For my GitHub account I have a primary directory structure for coding (actually I have several, but leave that aside for now):

github-jonasbn/

Next to this I have:

github-jonasbn-baseline/

So I do all of my coding work in the repositories cloned into github-jonasbn/, and as stated, these can be in various states.

In github-jonasbn-baseline/ I only have the main (or master) branches, and I do not do any work there. I have this little script to iterate over all of them and pull the latest changes.

#!/bin/bash

# Iterate over all subdirectories of the current directory
# and pull the latest changes from the remote repository
for dir in */; do
  # Skip the unexpanded pattern when there are no subdirectories
  [ -d "$dir" ] || continue
  echo "Directory: $dir"
  cd "$dir" || exit 1
  echo "Pulling latest changes from remote repository"
  git pull
  cd ..
done

The solution is far from perfect and it is a work in progress, but for now it has been working quite well.

I use it at work too, where most of what I do with code is navigating and reading it, not so much coding, since I work as a product manager.

With tools like ag, ripgrep, ack and grep, searching the entire code base becomes a breeze.

Yes, there are challenges, and I do not see changes that are not merged to the main branches, but if I am really looking for something like that, I will dig into the relevant repositories in another tree.

Searching across all of these repositories makes it easy for me to spot if we (or I) have:

repeated patterns, good or bad
certain hard-coded values like IP addresses etc.
configurations in need of change, updates to dependencies etc.
references across repositories, like the client and server sides of API endpoints
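
As an illustration of the kind of search listed above, here is a minimal sketch that uses plain grep to look for strings that resemble IPv4 addresses across every repository below a directory. The function name find_ips is made up for this example, and the pattern is deliberately loose, so it will also match things like four-part version numbers. It assumes a grep that supports --exclude-dir (GNU and BSD grep both do).

```shell
#!/bin/bash

# find_ips is a hypothetical helper name, not part of the baseline script.
# Recursively print file:line:match for anything that looks like an
# IPv4 address, skipping the .git directories.
find_ips() {
  local root="${1:-.}"
  grep -rnoE --exclude-dir=.git '([0-9]{1,3}\.){3}[0-9]{1,3}' "$root"
}
```

Run from inside the baseline directory, a call would be as simple as: find_ips .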

This has proven to be a very simple strategy for giving me an overview using the terminal instead of the browser. You can use it for Bitbucket, GitHub, GitLab or whatever solution you use, and you can mix all of these, since it is just repositories.

I am looking into the --mirror and --bare options for git clone to see if these should be used, and --bare looks like a good candidate.
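
One thing to keep in mind with --bare: a bare clone has no checked-out working tree, so file-based tools like ripgrep or ag have nothing to read. Below is a minimal sketch, assuming the search would instead go through git grep against HEAD. The function name search_bare and the layout (one bare clone per subdirectory, each named something.git) are assumptions for the example, not something I have settled on.

```shell
#!/bin/bash

# search_bare is a hypothetical helper: run `git grep` against HEAD in
# every bare clone (*.git) found directly under the given directory.
search_bare() {
  local pattern="$1" root="${2:-.}" repo hit
  for repo in "$root"/*.git/; do
    # Skip the unexpanded pattern when there are no bare clones
    [ -d "$repo" ] || continue
    git --git-dir="$repo" grep -n -e "$pattern" HEAD 2>/dev/null |
      while IFS= read -r hit; do
        # Prefix each hit with the repository it came from
        printf '%s: %s\n' "${repo%/}" "$hit"
      done
  done
}
```

A call could look like: search_bare '/api/v1' . and each hit is printed as repository, then HEAD:path:line:content.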

Additional Problems

The discovery of new repositories is the challenge: currently all are added by hand. This is not a scalable solution and it makes it easy to miss something relevant in new repositories.

I actually just missed a new repository on GitHub, which was relevant due to an integration with a repository for another component under version control in Bitbucket.

Since I use this primarily for Bitbucket, but also for GitHub, I would very much like to build a solution that works for both. And then of course I would need to take GitLab into account as well.

I have looked at the documentation for all of them, and it seems as if my little Bash script can be rewritten and extended with capabilities for repository discovery.
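
As a rough sketch of what such discovery could look like for GitHub only: the first function asks the GitHub REST API for a user's public repositories, and the second clones whatever is not yet present in the baseline directory. Both function names are made up for this example, it assumes curl and jq are installed, and it only fetches the first page of up to 100 repositories. Bitbucket and GitLab have different APIs and are not covered here.

```shell
#!/bin/bash

# list_github_repos (hypothetical name) prints "name clone_url" for each
# public repository of a user. Assumes curl and jq are available; only
# the first page (up to 100 repositories) is fetched.
list_github_repos() {
  local user="$1"
  curl -fsSL "https://api.github.com/users/${user}/repos?per_page=100" |
    jq -r '.[] | "\(.name) \(.clone_url)"'
}

# clone_missing (hypothetical name) reads "name url" lines on stdin and
# clones every repository that is not yet a directory under the baseline.
clone_missing() {
  local baseline="$1" name url
  while read -r name url; do
    [ -n "$name" ] || continue
    if [ -d "$baseline/$name" ]; then
      echo "Already present: $name"
    else
      echo "Cloning: $name"
      # Keep git from consuming the list we are reading on stdin
      git clone --quiet "$url" "$baseline/$name" </dev/null
    fi
  done
}
```

The two would be combined as: list_github_repos jonasbn | clone_missing github-jonasbn-baseline. Keeping the listing separate from the cloning means another source of "name url" lines could be piped into the same clone step.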

Unpopular Opinion

If I were ever to participate in a podcast from the Changelog family as a guest, this would be my unpopular opinion.

Spelling correctly is important

Searching code bases while taking misspellings into account is cumbersome. This problem is not unique to this repository baseline approach, but it bites me once in a while when searching the code base for certain terms, names or strings.
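
When a misspelling is already known, one workaround is to search for both variants in one pass. This is only a small sketch; the function name find_spellings is made up for the example, and it does nothing for misspellings you have not thought of.

```shell
#!/bin/bash

# find_spellings is a hypothetical helper: search case-insensitively for
# a correct spelling and one known misspelling at the same time.
find_spellings() {
  local correct="$1" misspelled="$2" root="${3:-.}"
  grep -rniF --exclude-dir=.git -e "$correct" -e "$misspelled" "$root"
}
```

For example: find_spellings receive recieve .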

Conclusion

Until now the solution, in all its simplicity, has worked like a charm, and I have gotten into the habit of running the script while I am fetching my coffee at work in the morning. In the beginning I was watching it run, since it was very stimulating and mesmerizing to see all of the changes scrolling away on the screen.

I will be working on extending the script, so I have set up a repository for the work as it progresses. For now it only holds the basic Bash script, which was the initial proof of concept.

Do check out: GitHub: jonasbn/baseline

Moving forward, and with the API integration, I will be changing from Bash to Go or something along those lines.
