Working with large repositories can be challenging, especially when you only need a specific directory. Instead of wasting time and storage on a full clone, here's a guide to efficiently cloning sub-directories using git sparse-checkout.
1. Clone the Repository with Sparse Checkout
Use the --depth 1 flag to clone only the latest commit, --filter=blob:none to avoid downloading file contents initially, and --sparse to check out only the files in the repository root:
git clone --depth 1 --filter=blob:none https://github.com/danuw/azure-docs.git --sparse
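This produced the following output. Note how little was downloaded (a 2.56 MiB pack without file contents, then 307 KiB for the files in the repository root) when the full repo is gigabytes heavy: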
Cloning into 'azure-docs'...
remote: Enumerating objects: 8438, done.
remote: Counting objects: 100% (8438/8438), done.
remote: Compressing objects: 100% (7673/7673), done.
remote: Total 8438 (delta 51), reused 4682 (delta 25), pack-reused 0 (from 0)
Receiving objects: 100% (8438/8438), 2.56 MiB | 16.60 MiB/s, done.
Resolving deltas: 100% (51/51), done.
remote: Enumerating objects: 85, done.
remote: Counting objects: 100% (85/85), done.
remote: Compressing objects: 100% (81/81), done.
remote: Total 85 (delta 52), reused 16 (delta 4), pack-reused 0 (from 0)
Receiving objects: 100% (85/85), 307.16 KiB | 5.04 MiB/s, done.
Resolving deltas: 100% (52/52), done.
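To confirm that the clone really is partial, shallow, and sparse, a small check along these lines may help. This is only a sketch: describe_clone is a hypothetical helper name (not a Git command), and it assumes the remote is called origin, which is the default for a fresh clone.

```shell
#!/bin/sh
# Hypothetical helper: summarize how a clone was set up.
# Usage: describe_clone <path-to-repo>
describe_clone() {
    repo=$1
    # Set to "blob:none" when the clone was made with --filter=blob:none
    echo "filter:  $(git -C "$repo" config --get remote.origin.partialclonefilter)"
    # "true" when the history was truncated with --depth
    echo "shallow: $(git -C "$repo" rev-parse --is-shallow-repository)"
    # "true" when sparse checkout is enabled (for example via --sparse)
    echo "sparse:  $(git -C "$repo" config --get core.sparseCheckout)"
    # Object count and on-disk size of the local object store
    git -C "$repo" count-objects -vH
}
```

Running describe_clone azure-docs from the parent directory right after the clone prints the three settings followed by the object store size, which is a quick way to see how much actually landed on disk.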
2. Navigate to the Cloned Repository
Change your directory to the cloned repository:
cd azure-docs
3. Initialize Sparse Checkout Mode
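Enable sparse checkout in cone mode, which matches whole directories rather than individual file patterns and so simplifies selecting them. On recent Git versions cone mode is already the default and init is documented as deprecated in favor of set, so this step may be redundant there: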
git sparse-checkout init --cone
4. Set the Directory to be Checked Out
Specify the directory you want to check out from the repository. In this example, we are checking out the articles/iot-operations directory:
git sparse-checkout set articles/iot-operations
Note: If you encounter an error, check that your Git version supports sparse checkout, that the specified directory exists in the repository, and that the syntax suits your local system (Windows, macOS, or Linux).
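The first two checks in the note can be scripted. The sketch below assumes the git sparse-checkout command first shipped in Git 2.25; the function names supports_sparse_checkout and dir_in_head are hypothetical helpers, not part of Git.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the Git version is 2.25 or newer.
# Takes an optional version string; otherwise parses `git --version`,
# whose output looks like "git version 2.39.2 ...".
supports_sparse_checkout() {
    version=${1:-$(git --version | awk '{print $3}')}
    major=${version%%.*}   # text before the first dot
    rest=${version#*.}     # text after the first dot
    minor=${rest%%.*}      # second version component
    [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 25 ]; }
}

# Hypothetical helper: succeed if <dir> exists as a directory in HEAD.
# Usage: dir_in_head <path-to-repo> <dir>   (no trailing slash on <dir>)
dir_in_head() {
    # ls-tree reads tree objects, so it works even when the
    # directory has not been checked out yet.
    [ -n "$(git -C "$1" ls-tree -d HEAD -- "$2")" ]
}
```

For example, from inside the clone, supports_sparse_checkout || echo "Git is too old" and dir_in_head . articles/iot-operations || echo "no such directory" cover both cases before running git sparse-checkout set.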
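After these steps, only the files within the articles/iot-operations directory (plus the files in the repository root, which cone mode always includes) will be checked out into your local repository, minimizing the data you download and store. The set command fetches the missing file contents and prints output similar to this: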
remote: Enumerating objects: 189, done.
remote: Counting objects: 100% (189/189), done.
remote: Compressing objects: 100% (182/182), done.
remote: Total 189 (delta 10), reused 106 (delta 7), pack-reused 0 (from 0)
Receiving objects: 100% (189/189), 10.96 MiB | 21.30 MiB/s, done.
Resolving deltas: 100% (10/10), done.
Updating files: 100% (190/190), done.
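The four steps above can be wrapped into one reusable function. This is a minimal sketch, not an official recipe: sparse_clone is a hypothetical name, and the destination defaults to the repository name taken from the URL.

```shell
#!/bin/sh
# Hypothetical helper: clone only one sub-directory of a repository.
# Usage: sparse_clone <url> <subdir> [dest]
sparse_clone() {
    url=$1
    subdir=$2
    dest=${3:-$(basename "$url" .git)}

    # Step 1: latest commit only, no file contents, root files only
    git clone --depth 1 --filter=blob:none --sparse "$url" "$dest" || return 1
    # Step 3: make sure cone mode is in use (harmless if already enabled)
    git -C "$dest" sparse-checkout init --cone || return 1
    # Step 4: fetch and check out the requested directory
    git -C "$dest" sparse-checkout set "$subdir" || return 1
    # Show which directories are now included
    git -C "$dest" sparse-checkout list
}
```

With that defined, sparse_clone https://github.com/danuw/azure-docs.git articles/iot-operations reproduces the walkthrough in a single call. Using git -C avoids changing the caller's working directory, so step 2 (cd) is not needed inside the function.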
Wrap up
This method is particularly useful when storage space is limited or when working with a slow connection. Whether you're cloning a specific lab version or just a part of a broader documentation set (e.g., for LLMs or RAG purposes), this guide offers a simple and effective solution.
Additional Notes
- Originally published as a GitHub Gist.
- For those following Microsoft documentation, this tip may soon be available directly across labs and tutorials, as seen in this example and its PR (https://github.com/dotnet/AspNetCore.Docs/pull/34239).