This is a repost from the Software Mastery newsletter. If you like what you see, consider subscribing to get emails delivered right to your inbox!
Welcome to the sixth issue of the Software Mastery newsletter.
In this issue, I want to share how I approach learning a new codebase.
Learning a new codebase can be daunting for anyone, especially when the codebase is large and has been around for years. This is true whether you’re a university student starting your first internship or a seasoned industry veteran.
Whenever I have to work in an unfamiliar codebase, I do the following:
Learn about the codebase at a high level.
Identify relevant entry points.
Trace through the code using an IDE.
Make changes.
Learn About The Codebase At A High Level
When learning a new codebase, it’s a good idea to ask a colleague for a quick whiteboarding session and read any existing documentation.
The objective here is to develop an understanding of the big picture. Usually, this means understanding the codebase in terms of boxes and arrows.
Diagrams enable you to start building a mental model of how a system works at a high level.
This understanding helps you identify which parts of a system are relevant to your task and which ones you can learn later.
Identify Relevant Entry Points
To deepen your understanding of a codebase, you have to read code. It isn’t always obvious where to start, however, especially when the codebase is massive.
The key here is to identify entry points relevant to what you’re trying to do. An entry point is the beginning of a code path. For simple programs written in languages like Java, the main method is an example of an entry point.
For backend services, the easiest way to identify entry points is to figure out which framework is being used. Frameworks enforce a structure on a codebase, which can give you some clues about where to start your exploration.
Here are a few example frameworks and what they can tell you about entry points:
Spring Boot: For RESTful services, all classes annotated with @RestController are HTTP request handlers.
Django: The urlpatterns variable defined in the root URL configuration contains the mapping between URL patterns and views.
gRPC: There should be a .proto file somewhere with a named service containing the RPC methods clients can invoke.
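As a rough sketch, the clues above can be turned into a small search script. Everything here is an assumption made for illustration: the function name find_entry_points, the marker table, and the regular expressions are hypothetical, and the matching is a plain-text heuristic rather than real parsing, so a comment that mentions @RestController would also count as a hit.

```python
import re
from pathlib import Path

# Hypothetical marker table (an assumption, not an exhaustive list):
# file suffix -> (framework label, regex hinting at an entry point).
ENTRY_POINT_MARKERS = {
    ".java": ("Spring Boot", re.compile(r"@RestController\b")),
    ".py": ("Django", re.compile(r"^[ \t]*urlpatterns\s*\+?=", re.MULTILINE)),
    ".proto": ("gRPC", re.compile(r"^[ \t]*service\s+\w+", re.MULTILINE)),
}


def find_entry_points(root):
    """Return sorted (framework, relative path, line number) tuples for each marker hit."""
    root = Path(root)
    hits = []
    for path in root.rglob("*"):
        marker = ENTRY_POINT_MARKERS.get(path.suffix)
        if marker is None or not path.is_file():
            continue
        framework, pattern = marker
        text = path.read_text(encoding="utf-8", errors="ignore")
        for match in pattern.finditer(text):
            # Convert the match offset into a 1-based line number.
            line_number = text.count("\n", 0, match.start()) + 1
            hits.append((framework, path.relative_to(root).as_posix(), line_number))
    return sorted(hits)
```

Calling find_entry_points on the root of a checkout returns a list of places worth opening first. It only narrows the search; you still have to read the handlers it points to.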
Trace Through The Code Using An IDE
At this step, the objective is to gain a deeper understanding of what happens in a particular scenario.
One efficient way to navigate through a code path is to use an IDE. Common IDE shortcuts like “jump to definition” allow you to follow a code path until its conclusion by drilling down the call stack.
To organize your thoughts, consider taking notes. For example, for a full-stack e-commerce platform, your notes might look like this:
A user clicks on the “Purchase” button.
A POST request is sent to the purchase service’s /v1/purchases endpoint.
Since the purchase service is a Spring Boot application, the entry point is the PurchaseController.
The purchase controller validates the request body, submits the order to the Orders database table, and returns a 200 OK HTTP response.
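Notes like these can also be written down as a small stand-in for the code path, which makes gaps in your understanding easier to spot. The sketch below is not the purchase service's code (that is a Spring Boot application in this example). It is a hypothetical Python model of the three steps in the last note. The function names, the item_id and quantity fields, the Orders columns, and the 400 response for invalid input are all assumptions added for illustration.

```python
import json
import sqlite3


def create_orders_table(conn):
    # Hypothetical schema: the notes only name an "Orders" table, not its columns.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS Orders ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "item_id TEXT NOT NULL, "
        "quantity INTEGER NOT NULL)"
    )


def handle_post_purchase(conn, raw_body):
    """Stand-in for the handler behind POST /v1/purchases; returns (status, body)."""
    # Step 1: validate the request body (the rejection path is an assumption).
    try:
        body = json.loads(raw_body)
    except json.JSONDecodeError:
        return 400, {"error": "body is not valid JSON"}
    if not isinstance(body, dict):
        return 400, {"error": "body must be a JSON object"}
    item_id = body.get("item_id")
    quantity = body.get("quantity")
    if not isinstance(item_id, str) or not item_id:
        return 400, {"error": "item_id must be a non-empty string"}
    if isinstance(quantity, bool) or not isinstance(quantity, int) or quantity < 1:
        return 400, {"error": "quantity must be a positive integer"}

    # Step 2: submit the order to the Orders table.
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT INTO Orders (item_id, quantity) VALUES (?, ?)",
            (item_id, quantity),
        )

    # Step 3: return a 200 OK response.
    return 200, {"order_id": cursor.lastrowid}
```

Writing the stand-in forces questions the notes gloss over, such as what happens when validation fails. Those questions are good prompts for the next round of tracing.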
Make Changes
Now that you’ve figured out the relevant entry points and traced through the code, you should have a rough idea of what’s going on in the code paths you need to modify.
At this point, you need to start getting your hands dirty. As you start implementing, you’ll begin to unearth the things you don’t know, which will help guide your learning.
Codebase knowledge comes from the hard-earned experience of making changes, not just reading. The more changes you make, the more you’ll understand.
Your Turn!
I hope this issue gave you some ideas on navigating unfamiliar codebases.
Remember that getting up to speed with large codebases is a skill that takes a lifetime to master. Don’t get discouraged if you’re having a hard time!
Do you have any tips on getting familiar with codebases? Reply to this email or comment below to let me know!
Wishing you the best,
Sammy