The Problem: Too Many Stale Branches
At my company, we manage multiple microservices, each with its own repository. Over time, our repositories accumulated hundreds of stale branches, in some cases more than 800. Many of these were created over a span of 6+ years and never deleted. This led to several issues:
- Jenkins build auto-start times skyrocketed to ~20 minutes due to multi-branch scanning.
- GitHub API request limits became a concern as our build processes constantly queried branch metadata.
- Codebase clutter made it difficult to navigate active branches.
We needed a safe and automated way to identify and delete these branches without disrupting active development. Enter GitHub CLI (gh) + Bash!
The Solution: Automating Stale Branch Cleanup
To solve this problem, I wrote a Bash script leveraging the GitHub CLI to:
- Fetch all branches in a repository.
- Check their last commit date.
- Log stale branches to a CSV file (for review).
- (Optional) Automatically delete branches older than a specified date.
The Script:
#!/bin/bash

# Ensure gh CLI is installed and authenticated
if ! command -v gh &> /dev/null; then
  echo "GitHub CLI (gh) is not installed. Please install it first."
  exit 1
fi
if ! gh auth status &> /dev/null; then
  echo "GitHub CLI (gh) is not authenticated. Run 'gh auth login' first."
  exit 1
fi

# Get repository name (default: current repo)
if [ -z "$1" ]; then
  REPO=$(gh repo view --json nameWithOwner -q .nameWithOwner)
else
  REPO=$1
fi

OUTPUT_FILE="stale_branches.csv"
echo "branch_name,last_commit_date" > "$OUTPUT_FILE"

echo "Fetching branches for $REPO..."
branches=$(gh api "repos/$REPO/branches" --paginate --jq '.[].name')

# Branches whose last commit is older than this date count as stale
target_date="2024-01-01T00:00:00Z"
count=0
processed=0

echo "Analyzing branches..."
# Unquoted expansion is safe here: Git branch names cannot contain
# whitespace or glob characters
for branch in $branches; do
  ((processed++))
  echo -ne "Progress: $processed branches processed\r"
  last_commit_date=$(gh api "repos/$REPO/commits/$branch" --jq '.commit.committer.date' 2>/dev/null || echo "")
  # ISO 8601 UTC timestamps sort lexicographically, so a string comparison is enough
  if [ -n "$last_commit_date" ] && [ "$last_commit_date" \< "$target_date" ]; then
    echo "$branch,$last_commit_date" >> "$OUTPUT_FILE"
    ((count++))
  fi
done

echo -e "\nFound $count stale branches (last commit before $target_date)"
echo "Results written to $OUTPUT_FILE"
# Deleting stale branches (opt-in: run with DELETE_BRANCHES=true once the CSV
# has been reviewed; the variable name is an illustrative choice)
if [ "$count" -gt 0 ] && [ "${DELETE_BRANCHES:-false}" = "true" ]; then
  echo "Deleting stale branches..."
  deleted=0
  failed=0
  while IFS=, read -r branch_name last_commit; do
    if [ "$branch_name" != "branch_name" ]; then # Skip header
      if gh api -X DELETE "repos/$REPO/git/refs/heads/$branch_name" >/dev/null 2>&1; then
        ((deleted++))
        echo "Deleted: $branch_name"
      else
        ((failed++))
        echo "Failed to delete: $branch_name"
      fi
    fi
  done < "$OUTPUT_FILE"
  echo "Successfully deleted: $deleted branches"
  [ "$failed" -gt 0 ] && echo "Failed to delete: $failed branches"
fi
echo "Total branches processed: $processed"
How It Works:
- Fetches all branches in the given repository.
- Checks the last commit date for each branch.
- Logs stale branches (last commit before 2024-01-01) to a CSV file.
- Deletes stale branches automatically (if enabled).
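The staleness check in the steps above relies on a plain string comparison of timestamps. Here is a minimal sketch of why that works, assuming every date is an ISO 8601 UTC timestamp of the same shape (as in the script's cutoff). The helper name `is_stale` and the sample dates are made up for illustration.

```shell
#!/bin/bash
# is_stale DATE CUTOFF: succeeds when DATE sorts strictly before CUTOFF.
# Assumes both arguments look like YYYY-MM-DDTHH:MM:SSZ, so that
# string order and chronological order agree.
is_stale() {
  [[ "$1" < "$2" ]]
}

cutoff="2024-01-01T00:00:00Z"
for d in "2019-06-30T12:00:00Z" "2023-12-31T23:59:59Z" "2024-03-15T08:30:00Z"; do
  if is_stale "$d" "$cutoff"; then
    echo "$d stale"
  else
    echo "$d active"
  fi
done
# Prints:
# 2019-06-30T12:00:00Z stale
# 2023-12-31T23:59:59Z stale
# 2024-03-15T08:30:00Z active
```

The comparison would stop being reliable if timestamps carried mixed timezone offsets, so it is worth confirming the format before trusting it.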
By running this script, we cleaned up hundreds of old branches, reducing our Jenkins build start time significantly and staying within GitHub’s API limits.
Lessons Learned & Best Practices
1️⃣ Automate but Review First
Before enabling automatic deletion, review the CSV file. This prevents accidental deletion of branches that might still be needed.
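To make that review quicker, something like the following could summarize the CSV before anything is deleted. It is only a sketch: the function names `review_stale` and `count_by_year` are hypothetical, and it assumes the two-column `branch_name,last_commit_date` layout the script writes, with no commas in branch names.

```shell
#!/bin/bash
# review_stale FILE: print the CSV rows (header skipped), oldest commit first.
review_stale() {
  tail -n +2 "$1" | LC_ALL=C sort -t, -k2,2
}

# count_by_year FILE: print "YEAR COUNT" for the stale branches in FILE.
count_by_year() {
  tail -n +2 "$1" \
    | awk -F, '{ n[substr($2, 1, 4)]++ } END { for (y in n) print y, n[y] }' \
    | LC_ALL=C sort
}

# Example usage, once the script has produced its report:
#   review_stale stale_branches.csv | head -n 20
#   count_by_year stale_branches.csv
```

A per-year count makes it easy to spot whether the cutoff date is catching more recent work than intended.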
2️⃣ Protect Important Branches
Make sure to exclude main, develop, or any other critical branches if necessary.
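One way to do this is a small guard function that the script consults before logging or deleting a branch. The sketch below is an assumption, not part of the original script: the branch names and the `release/*` and `hotfix/*` patterns are examples to adapt to your own conventions.

```shell
#!/bin/bash
# Exact branch names that must never be deleted (example values).
PROTECTED_BRANCHES=("main" "master" "develop")
# Glob patterns for long-lived branches (hypothetical naming scheme).
PROTECTED_PATTERNS=("release/*" "hotfix/*")

# is_protected BRANCH: succeeds when BRANCH matches a protected name or pattern.
is_protected() {
  local branch="$1" name pattern
  for name in "${PROTECTED_BRANCHES[@]}"; do
    [[ "$branch" == "$name" ]] && return 0
  done
  for pattern in "${PROTECTED_PATTERNS[@]}"; do
    # Right-hand side left unquoted so it is matched as a glob pattern.
    [[ "$branch" == $pattern ]] && return 0
  done
  return 1
}

# Inside the analysis loop, a protected branch could be skipped with:
#   is_protected "$branch" && continue
```

GitHub branch protection rules are a useful second line of defense, since a protected branch should reject the delete request even if the script asks for it.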
3️⃣ Schedule Regular Cleanup
Stale branches will accumulate again. Automate this script via a scheduled GitHub Action or Jenkins job to run quarterly or semi-annually.
4️⃣ Use GitHub API Efficiently
To avoid hitting GitHub API rate limits, paginate API calls and consider caching results if working with many repositories.
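As one possible way to cut the request count, the branch names and their last commit dates can be fetched together through the GraphQL API, which needs roughly one request per 100 branches instead of one per branch. This is a sketch rather than the approach used above: the function names are hypothetical, and it assumes `committedDate` comes back as a UTC timestamp ending in `Z`.

```shell
#!/bin/bash
# fetch_branch_dates OWNER NAME: print "branch,committedDate" for every branch.
# gh follows pageInfo.endCursor automatically when --paginate is set.
fetch_branch_dates() {
  gh api graphql --paginate \
    -f owner="$1" -f name="$2" \
    -f query='
      query($owner: String!, $name: String!, $endCursor: String) {
        repository(owner: $owner, name: $name) {
          refs(refPrefix: "refs/heads/", first: 100, after: $endCursor) {
            nodes { name target { ... on Commit { committedDate } } }
            pageInfo { hasNextPage endCursor }
          }
        }
      }' \
    --jq '.data.repository.refs.nodes[] | "\(.name),\(.target.committedDate)"'
}

# filter_stale CUTOFF: read "branch,date" lines on stdin and keep those
# whose date sorts before CUTOFF (string comparison is forced with "").
filter_stale() {
  awk -F, -v cutoff="$1" '($2 "") < (cutoff "")'
}

# Example usage (owner and repository name are placeholders):
#   fetch_branch_dates my-org my-service | filter_stale "2024-01-01T00:00:00Z"
```

Note that the GraphQL API has its own rate limit, so the remaining budget is still worth checking when scanning many repositories.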
Conclusion
This simple script helped us streamline our GitHub repositories, improve CI/CD performance, and prevent API overuse. If your team struggles with stale branches, give this a try!
👉 What are your strategies for handling stale branches? Let me know in the comments!