This blog is going to cover running CloudQuery in CircleCI as part of your continuous integration or continuous delivery pipeline.
We all know how awesome a tool CloudQuery is: with a bit of configuration you very quickly get access to the `cloudquery fetch` command to get information on your cloud assets, and then you can start running `cloudquery policy run <policy>` against that data to get amazing guidance on your cloud infrastructure. But what about automating the process, or making it accessible to the rest of your team? You could use our Helm chart to deploy a dedicated and persistent version of CloudQuery, or, if you want to experiment, you could add this to your CI/CD processes. I personally really like using CircleCI and have been an active user since about 2017, so for this example we’ll be creating a CircleCI workflow (and template) to achieve this.
## What is CI/CD?
CI, or continuous integration, in its simplest form is the means of continually adding and integrating code into a shared body of code. In Git terms, this is the creation of changes in a branch, followed by the testing, and then merging into the main branch.
CD, or continuous delivery, is the methodology by which you can deploy your codebase at any time. This means making sure that the codebase is tested, and has an automated release process, rather than manually grabbing the code and loading it to a location.
## How to add CloudQuery to CircleCI
### Prerequisites
Make sure you’ve followed one of our Getting Started guides so that you have a valid and functional `config.hcl` for CloudQuery to use.
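For reference, a `config.hcl` fetching AWS resources into Postgres looked roughly like the sketch below. Treat the attribute names, the empty `source`, and the `version` value as illustrative placeholders; the Getting Started guide has the authoritative schema for your CloudQuery version:

```hcl
cloudquery {
  connection {
    # Matches the CQ_DSN used in the CircleCI job later in this post
    dsn = "postgresql://postgres:pass@localhost:5432/postgres?sslmode=disable"
  }
  provider "aws" {
    source  = ""
    version = "latest"
  }
}

provider "aws" {
  # Fetch everything; narrowing this list speeds up CI runs considerably
  resources = ["*"]
}
```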
### Getting started with CircleCI
If you’ve never used CircleCI before, rather than walk through all the steps here I’d recommend reading through the Your First Green Build guide in the CircleCI documentation. This guide will take you through the creation of your `.circleci/config.yml` and what the core parts are.
Once you’ve got that started, here is how to add CloudQuery to your CircleCI flow.
The first step is to add to the list of `jobs`; each job is a distinct task and process to complete. Here is the job to fetch data into a Postgres instance:
```yaml
# Define a job to be invoked later in a workflow.
# See: https://circleci.com/docs/2.0/configuration-reference/#jobs
jobs:
  build:
    # Specify the execution environment. You can specify an image from Dockerhub
    # or use one of our Convenience Images from CircleCI's Developer Hub.
    # See: https://circleci.com/docs/2.0/configuration-reference/#docker-machine-macos-windows-executor
    docker:
      - image: cimg/base:stable
        environment:
          CQ_DSN: postgresql://postgres:pass@localhost/postgres?sslmode=disable
      - image: postgres:11
        environment:
          POSTGRES_USER: postgres
          POSTGRES_PASSWORD: pass
          POSTGRES_DB: postgres
    # Add steps to the job
    # See: https://circleci.com/docs/2.0/configuration-reference/#steps
    steps:
      - checkout
      - run:
          name: "Install CloudQuery"
          command: curl -L https://github.com/cloudquery/cloudquery/releases/latest/download/cloudquery_linux_x86_64 -o cloudquery && chmod a+x cloudquery
      # Wait for Postgres to be ready before proceeding
      - run:
          name: "Waiting for Postgres to be ready"
          command: dockerize -wait tcp://localhost:5432 -timeout 1m
      - run:
          name: "Fetch AWS Resources"
          command: ./cloudquery fetch
```
When you have a new job, you need to set the container that will execute the task and any supporting containers. In this case, we have `cimg/base:stable`, a basic Linux image, as our primary container to execute tasks, with a supporting `postgres:11` image.
We also have to provide the `CQ_DSN` variable, following this ENV substitution guide, to point to the Postgres instance within the CircleCI job.
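Assuming the substitution syntax described in that guide, the connection block in `config.hcl` can then reference the variable instead of a hardcoded DSN. This is a sketch; check the guide for the exact syntax your CloudQuery version supports:

```hcl
cloudquery {
  connection {
    # CQ_DSN is injected by the CircleCI job's environment
    dsn = "${CQ_DSN}"
  }
}
```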
Next, we have the `steps`; these are the commands executed on the primary container to add, configure, and perform tasks. The first step is usually `checkout`, which clones your codebase from your git repository to run against.
The following step is `Install CloudQuery`; this follows the Download and Install process of adding CloudQuery to a Linux instance and making it executable.
```yaml
- run:
    name: "Install CloudQuery"
    command: curl -L https://github.com/cloudquery/cloudquery/releases/latest/download/cloudquery_linux_x86_64 -o cloudquery && chmod a+x cloudquery
```
Due to the way CircleCI parallelises its containers, it takes a moment for the Postgres image to be ready, so we have a wait step.
```yaml
- run:
    name: "Waiting for Postgres to be ready"
    command: dockerize -wait tcp://localhost:5432 -timeout 1m
```
This step is using `dockerize` to make sure the Postgres image is available to use before executing any commands against it.
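If your executor image doesn’t ship with `dockerize`, the same wait can be sketched as a small POSIX shell retry loop. The `wait_for` helper and the `pg_isready` example below are illustrative, not part of the official setup:

```shell
#!/bin/sh
# Retry a command once per second until it succeeds or a timeout elapses.
# Usage: wait_for <timeout-seconds> <command> [args...]
wait_for() {
  timeout=$1
  shift
  elapsed=0
  until "$@" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "wait_for: timed out after ${timeout}s" >&2
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 0
}

# Roughly equivalent to the dockerize step above:
# wait_for 60 pg_isready -h localhost -p 5432
```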
And finally, we run CloudQuery itself.
```yaml
- run:
    name: "Fetch Resources"
    command: ./cloudquery fetch
```
Using `cloudquery fetch`, the container either uses the `.cq` information, if committed to the repository, or downloads fresh versions of the configured providers, and then grabs the data from your provider to use.
### Configuring CircleCI
Now for this to work, you’ll need to add the access keys etc. in CircleCI. It’s usually best to do this on a per-project basis. Within CircleCI, go to Projects and select the project in question, then navigate to `Project Settings`, select `Environment Variables`, and add your ENV variables (for example, `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`) as key-value pairs.
To make this work, you now need to add a `workflows` reference. This is how you chain jobs together to accomplish tasks:
```yaml
# Invoke jobs via workflows
# See: https://circleci.com/docs/2.0/configuration-reference/#workflows
workflows:
  fetch-workflow:
    jobs:
      - build
```
This workflow, `fetch-workflow`, will now execute on every branch pushed to your git repository. But once the `fetch` is complete the data is lost, as the Postgres image is destroyed after use, which isn’t much use for sharing the information gathered from the provider.
## Adding a policy check
Once you have added a policy from the CloudQuery Hub or written your own custom policy, you’ll want to run it nightly or after deployment. To do this, we can quickly extend the example above:
```yaml
jobs:
  build:
    […]
  policy:
    # Specify the execution environment. You can specify an image from Dockerhub
    # or use one of our Convenience Images from CircleCI's Developer Hub.
    # See: https://circleci.com/docs/2.0/configuration-reference/#docker-machine-macos-windows-executor
    docker:
      - image: cimg/base:stable
        environment:
          CQ_DSN: postgresql://postgres:pass@localhost/postgres?sslmode=disable
      - image: postgres:11
        environment:
          POSTGRES_USER: postgres
          POSTGRES_PASSWORD: pass
          POSTGRES_DB: postgres
    # Add steps to the job
    # See: https://circleci.com/docs/2.0/configuration-reference/#steps
    steps:
      - checkout
      - run:
          name: "Install CloudQuery"
          command: curl -L https://github.com/cloudquery/cloudquery/releases/latest/download/cloudquery_linux_x86_64 -o cloudquery && chmod a+x cloudquery
      - run:
          name: "Waiting for Postgres to be ready"
          command: dockerize -wait tcp://localhost:5432 -timeout 1m
      - run:
          name: "Fetch Resources"
          command: ./cloudquery fetch
      - run:
          name: "Check AWS Policy"
          command: ./cloudquery policy run aws --output-dir ./
      - store_artifacts:
          path: ./aws.json
          destination: aws-policy
```
Our new `policy` job is the same as the previous `build` job, except it has two additional steps. The first is the `Check AWS Policy` step, which runs the entire AWS policy set against the gathered data.
```yaml
- run:
    name: "Check AWS Policy"
    command: ./cloudquery policy run aws --output-dir ./
```
And because of the `--output-dir` flag, it creates an `aws.json` file with the results of the policy checks.
The final step stores the `aws.json` as an artifact so that after the execution of the check, and the subsequent destruction of the image, you can still review the results of the policy check:
```yaml
- store_artifacts:
    path: ./aws.json
    destination: aws-policy
```
Earlier, we mentioned running this as a nightly task; CircleCI workflows have a `triggers` key for this:
```yaml
# Invoke jobs via workflows
# See: https://circleci.com/docs/2.0/configuration-reference/#workflows
workflows:
  fetch-workflow:
    […]
  policy-check:
    triggers:
      - schedule:
          cron: "0 0 * * *"
          filters:
            branches:
              only:
                - main
    jobs:
      - build
      - policy
```
Once you add the `schedule` to the `triggers`, you can follow the `crontab` standard to decide when it’s executed, along with any other `filters` you may want to limit the execution by. This is a legacy method of automating within CircleCI; for a more up-to-date approach that requires a little more configuration, I’d use Scheduled Pipelines.
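As a variation on the schedule above, the five standard cron fields let you tighten the window; for example, a weekday-only run at 06:00 UTC against main might look like this (the workflow name and `policy` job are reused from the earlier examples):

```yaml
workflows:
  policy-check:
    triggers:
      - schedule:
          # minute hour day-of-month month day-of-week (times are UTC)
          cron: "0 6 * * 1-5"
          filters:
            branches:
              only:
                - main
    jobs:
      - policy
```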
## Summary
This blog should have covered all the aspects of using CloudQuery with CircleCI to automate your policy checks and make the results shareable with your teammates. If you’d like to discuss other ways to use CloudQuery in CI, you can join our Discord and talk to us about your needs. And if we’ve missed anything, we’d happily follow up this blog with more details, or even cover other CI/CD providers.