If you are using Git submodules in your project, you might want to keep them up to date so that their latest version is always available. Unfortunately, by design, Git keeps track of submodules using a specific commit SHA (e.g. eb41d76), and that reference must be committed to your repository.
And it seems there is no way yet to always use the latest submodule commit without updating its reference with a command like git submodule update --remote... 😐
It is not really convenient when you want to automate your workflow and have to manually check out the code just to update the submodule SHA, right? 🤔
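Concretely, the manual routine looks something like this (a quick sketch, with path/to/submodule as a placeholder):

# Fetch and check out the latest commit of each submodule
git submodule update --remote
# Record the new submodule SHA in the parent repository
git add path/to/submodule
git commit -m "Update submodule to its latest version"
git push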
Imagine if you could simply check out your project containing submodules with the guarantee that they are at their latest version... 😌
I am aware that sometimes you do not want a submodule to be able to introduce a breaking change, but if you are 99.99% sure that it cannot break your project, wouldn't it be nice to forget about updating their references to use the latest changes? 🤭
It turned out to be not as simple as I would have liked, but I managed to make it work. 😇
Solutions
To simplify my explanation, let's say we have two projects, projectA and projectB, and projectB has projectA as one of its submodules. 😬
#1 - Update the submodules from within
Do you remember the .git:push job I wrote about in my post How to push to a Git repository from a GitLab CI pipeline? Well, I have found another use case for it! 😅
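A simplified sketch of that job could look like the following, assuming the hidden .git:push job from that post takes care of committing and pushing whatever the script changes (the job name and rule are illustrative, not my exact configuration):

update:submodules:
  extends: .git:push
  rules:
    # Run only when the scheduled pipeline sets UPDATE_SUBMODULE
    - if: '$UPDATE_SUBMODULE'
  script:
    # Point every submodule to the latest commit of its tracked branch;
    # .git:push is assumed to commit and push the updated references
    - git submodule update --init --remote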
Since I did not want all the other jobs to run when this job was running, I added the following code to the other jobs:
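For example, a rule along these lines (a simplified sketch, not my exact configuration) keeps a regular job out of the scheduled update pipeline:

build:
  script:
    - echo "Regular job, skipped during scheduled submodule updates"
  rules:
    # Do not run this job when the pipeline only updates the submodules
    - if: '$UPDATE_SUBMODULE'
      when: never
    - when: on_success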
By using a pipeline schedule with the $UPDATE_SUBMODULE variable set, this solution worked: my submodules were updated at the interval I wanted.
Unfortunately, this solution requires that the projectB schedule runs some time after an update to projectA; otherwise, the latest commit of projectA has to wait for the next scheduled run to be taken into account...
Since I did not want to introduce any delay, and I did not want to depend on the condition that my projectA pipeline completes in less than X minutes, I kept looking for a better solution. 🔎
#2 - Trigger the pipeline of projectB from projectA
If we want projectB to know when projectA has been updated, one solution is for projectA to notify projectB.
Fortunately, GitLab offers a specific keyword to launch a downstream pipeline: trigger. It is pretty straightforward, because you just need to configure two more keywords:
- project: the path of the second project
- branch: the branch of the second project to update
like in the example below.
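Here is a sketch of what such a job could look like in the projectA pipeline (the project path and branch are illustrative):

trigger:projectB:
  rules:
    # Notify projectB only when the default branch of projectA moves forward
    - if: '$CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH'
  variables:
    # Forwarded to the downstream pipeline (see below)
    UPDATE_SUBMODULE: 'true'
  trigger:
    project: my-group/projectB
    branch: main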
Since I do not want to trigger the entire downstream pipeline, I can also pass variables from one pipeline to the other, like the UPDATE_SUBMODULE variable used in the previous solution.
There is also another keyword, strategy, which, when set to depends, links the two pipelines together. This can be useful if you want to know from the upstream repository whether or not the downstream pipeline was successful.
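Building on the sketch above (still with illustrative values):

trigger:projectB:
  trigger:
    project: my-group/projectB
    branch: main
    # Mirror the status of the downstream pipeline in this job
    strategy: depends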
#3 - Use the GitLab API to trigger the downstream pipeline
I cannot remember what I did wrong, but the previous solution did not work for me as I thought it would. I eventually managed to fix it, but only after finding another solution that uses a specific GitLab API endpoint:
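The sketch below illustrates the idea with the pipeline triggers API and the predefined CI job token (the job name and the DOWNSTREAM_PROJECT_ID value are illustrative):

trigger:projectB:
  image: curlimages/curl:7.76.1
  variables:
    # The projectB ID (illustrative value)
    DOWNSTREAM_PROJECT_ID: '12345678'
  script:
    # Start a pipeline on the default branch of projectB and pass the
    # UPDATE_SUBMODULE variable so that only the update job runs there
    - curl
      --request POST
      --form "token=${CI_JOB_TOKEN}"
      --form "ref=${CI_DEFAULT_BRANCH}"
      --form "variables[UPDATE_SUBMODULE]=true"
      "${CI_SERVER_URL}/api/v4/projects/${DOWNSTREAM_PROJECT_ID}/trigger/pipeline"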
It is a little more verbose than the previous example, but you only need to provide the ID of the project to update (DOWNSTREAM_PROJECT_ID); the other variables are predefined by GitLab CI.
#4 - Use the GitLab API to update a submodule directly
While searching on the Internet, I came across another GitLab API endpoint that did exactly what I wanted, without the need to adapt two pipelines: the Repository submodules API!
As I finally chose this approach to update my submodules, I improved the example from the documentation to meet my criteria, so be careful: the example below may look complicated.
For a much simpler example, I suggest you read the official documentation on the subject. 🛡️
trigger:
  image: curlimages/curl:7.76.1
  script:
    # Put the private-token with a colon into a variable to escape the "Linefeed-Limbo" (https://stackoverflow.com/a/51187502)
    - PRIVATE_TOKEN="PRIVATE-TOKEN:"
    # Get the current datetime (in local time with the TZ variable)
    - CURRENT_DATE="$(date +'%F %T')"
    # Append it to the commit message (and make sure there are no newline characters)
    - MESSAGE="${COMMIT_MESSAGE:-$CI_COMMIT_MESSAGE} (${CURRENT_DATE})"
    - MESSAGE=$(echo "${MESSAGE}" | tr -d '\n')
    # Use the Repository submodules API to update the submodule reference in the repository
    - curl
      --data "branch=${BRANCH:-$CI_DEFAULT_BRANCH}&commit_sha=${COMMIT_SHA:-$CI_COMMIT_SHA}&commit_message=${MESSAGE}"
      --header "${PRIVATE_TOKEN} ${GITLAB_TOKEN}"
      --request PUT
      "${CI_SERVER_URL}/api/v4/projects/${PROJECT_ID}/repository/submodules/${SUBMODULE_PATH}"
  stage: trigger
  rules:
    - changes:
        - ghost/
        - images/
  variables:
    PROJECT_ID: '20554554'
    SUBMODULE_PATH: 'ghost'
    TZ: ':America/Toronto'
Now, I am sure you want an explanation of the above code, right? 😅
First, the heart of the solution is the curl command, where I call the GitLab endpoint. Let's break down all of the variables used here:
- SUBMODULE_PATH: the name of the submodule, exactly as it was committed into the projectB repository
- PROJECT_ID: the projectB ID
- CI_SERVER_URL: the URL of the GitLab instance
- GITLAB_TOKEN: a personal GitLab token with write permissions on the API
- MESSAGE: the commit message I want (more explanations below)
There are also quite a few lines just to set the MESSAGE variable, because I want to add the current date to it (in my local time zone). If COMMIT_MESSAGE is not provided, it defaults to CI_COMMIT_MESSAGE, which is the latest commit message of the projectA repository.
Conclusion
You now have four ways to update a Git submodule in a GitLab CI pipeline! Which one will you choose? 😉