This blog is going to cover running CloudQuery in CircleCI as part of your continuous integration or continuous delivery pipeline.
We all know how awesome a tool CloudQuery is: with a bit of configuration you very quickly get access to the `cloudquery fetch` command to get information on your cloud assets, and then you can start running `cloudquery policy run <policy>` against that data to get amazing guidance on your cloud infrastructure. But what about automating the process, or making it accessible to the rest of your team? You could use our Helm chart to deploy a dedicated and persistent version of CloudQuery, or, if you want to experiment, you could add this to your CI/CD processes. I personally really like using CircleCI and have been an active user since about 2017, so for this example we’ll be creating a CircleCI workflow (and template) to achieve this.
## What is CI/CD?
CI, or continuous integration, in its simplest form is the means of continually adding and integrating code into a shared body of code. In Git terms, this is the creation of changes in a branch, followed by the testing, and then merging into the main branch.
CD, or continuous delivery, is the methodology by which you can deploy your codebase at any time. This means making sure that the codebase is tested, and has an automated release process, rather than manually grabbing the code and loading it to a location.
## How to add CloudQuery to CircleCI
### Prerequisites
Make sure you’ve followed one of our Getting Started guides so that you have a valid and functional `config.hcl` for CloudQuery to use.
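For reference, a `config.hcl` fetching AWS resources into Postgres looked roughly like the sketch below. Treat the attribute names, the empty `source`, and the `version` value as illustrative placeholders; the Getting Started guide has the authoritative schema for your CloudQuery version:

```hcl
cloudquery {
  connection {
    # Matches the CQ_DSN used in the CircleCI job later in this post
    dsn = "postgresql://postgres:pass@localhost:5432/postgres?sslmode=disable"
  }
  provider "aws" {
    source  = ""
    version = "latest"
  }
}

provider "aws" {
  # Fetch everything; narrowing this list speeds up CI runs considerably
  resources = ["*"]
}
```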
### Getting started with CircleCI
If you’ve never used CircleCI before, rather than walk through all the steps here I’d recommend reading through the Your First Green Build guide in the CircleCI documentation. This guide will take you through the creation of your `.circleci/config.yml` and what the core parts are.
Once you’ve got that started, here is how to add CloudQuery to your CircleCI flow.
The first step is to add to the list of `jobs`; each job is a distinct task and process to complete. Here is the job to fetch data into a Postgres instance:
```yaml
# Define a job to be invoked later in a workflow.
# See: https://circleci.com/docs/2.0/configuration-reference/#jobs
jobs:
  build:
    # Specify the execution environment. You can specify an image from Dockerhub
    # or use one of our Convenience Images from CircleCI's Developer Hub.
    # See: https://circleci.com/docs/2.0/configuration-reference/#docker-machine-macos-windows-executor
    docker:
      - image: cimg/base:stable
        environment:
          CQ_DSN: postgresql://postgres:pass@localhost/postgres?sslmode=disable
      - image: postgres:11
        environment:
          POSTGRES_USER: postgres
          POSTGRES_PASSWORD: pass
          POSTGRES_DB: postgres
    # Add steps to the job
    # See: https://circleci.com/docs/2.0/configuration-reference/#steps
    steps:
      - checkout
      - run:
          name: "Install CloudQuery"
          command: curl -L https://github.com/cloudquery/cloudquery/releases/latest/download/cloudquery_linux_x86_64 -o cloudquery && chmod a+x cloudquery
      # Wait for Postgres to be ready before proceeding
      - run:
          name: "Waiting for Postgres to be ready"
          command: dockerize -wait tcp://localhost:5432 -timeout 1m
      - run:
          name: "Fetch AWS Resources"
          command: ./cloudquery fetch
```
When you have a new job, you need to set the container that will execute the task and any supporting containers. In this case, we have `cimg/base:stable`, a basic Linux image, as our primary container to execute tasks, with a supporting `postgres:11` image.
We also have to provide the `CQ_DSN` variable, following this ENV substitution guide, to point to the Postgres instance within the CircleCI job.
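Assuming the substitution syntax described in that guide, the connection block in `config.hcl` can then reference the variable instead of a hardcoded DSN. This is a sketch; check the guide for the exact syntax your CloudQuery version supports:

```hcl
cloudquery {
  connection {
    # CQ_DSN is injected by the CircleCI job's environment
    dsn = "${CQ_DSN}"
  }
}
```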
Next, we have the `steps`; these are the commands executed on the primary container to add, configure, and perform tasks. The first step is usually `checkout`, which clones your codebase from your git repository to run against.
The following step is `Install CloudQuery`; this follows the Download and Install process of adding CloudQuery to a Linux instance and making it executable.
```yaml
- run:
    name: "Install CloudQuery"
    command: curl -L https://github.com/cloudquery/cloudquery/releases/latest/download/cloudquery_linux_x86_64 -o cloudquery && chmod a+x cloudquery
```
Due to the way CircleCI parallelises its containers, it takes a moment for the Postgres image to be ready, so we have a wait step.
```yaml
- run:
    name: "Waiting for Postgres to be ready"
    command: dockerize -wait tcp://localhost:5432 -timeout 1m
```
This step is using `dockerize` to make sure the Postgres image is available to use before executing any commands against it.
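If your executor image doesn’t ship with `dockerize`, the same wait can be sketched as a small POSIX shell retry loop. The `wait_for` helper and the `pg_isready` example below are illustrative, not part of the official setup:

```shell
#!/bin/sh
# Retry a command once per second until it succeeds or a timeout elapses.
# Usage: wait_for <timeout-seconds> <command> [args...]
wait_for() {
  timeout=$1
  shift
  elapsed=0
  until "$@" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "wait_for: timed out after ${timeout}s" >&2
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 0
}

# Roughly equivalent to the dockerize step above:
# wait_for 60 pg_isready -h localhost -p 5432
```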
And finally, we run CloudQuery itself.
```yaml
- run:
    name: "Fetch Resources"
    command: ./cloudquery fetch
```
Using `cloudquery fetch`, the container either uses the `.cq` information, if committed to the repository, or downloads fresh versions of the configured providers, and then grabs the data from your provider to use.
### Configuring CircleCI
Now for this to work, you’ll need to add the access keys etc. in CircleCI. It’s usually best to do this on a per-project basis. Within CircleCI, go to Projects and select the project in question, then navigate to `Project Settings`, select `Environment Variables`, and add your ENV variables (for example, `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`) as key-value pairs.
To make this work, you now need to add a `workflows` reference. This is how you chain jobs together to accomplish tasks:
```yaml
# Invoke jobs via workflows
# See: https://circleci.com/docs/2.0/configuration-reference/#workflows
workflows:
  fetch-workflow:
    jobs:
      - build
```
This workflow, `fetch-workflow`, will now execute on every branch pushed to your git repository. But once the `fetch` is complete the data is lost, as the Postgres image is destroyed after use, which isn’t much use for sharing the information gathered from the provider.
## Adding a policy check
Once you have added a policy from the CloudQuery Hub or written your own custom policy, you’ll want to run it nightly or after deployment. To do this, we can quickly extend the example above:
```yaml
jobs:
  build:
    […]
  policy:
    # Specify the execution environment. You can specify an image from Dockerhub
    # or use one of our Convenience Images from CircleCI's Developer Hub.
    # See: https://circleci.com/docs/2.0/configuration-reference/#docker-machine-macos-windows-executor
    docker:
      - image: cimg/base:stable
        environment:
          CQ_DSN: postgresql://postgres:pass@localhost/postgres?sslmode=disable
      - image: postgres:11
        environment:
          POSTGRES_USER: postgres
          POSTGRES_PASSWORD: pass
          POSTGRES_DB: postgres
    # Add steps to the job
    # See: https://circleci.com/docs/2.0/configuration-reference/#steps
    steps:
      - checkout
      - run:
          name: "Install CloudQuery"
          command: curl -L https://github.com/cloudquery/cloudquery/releases/latest/download/cloudquery_linux_x86_64 -o cloudquery && chmod a+x cloudquery
      - run:
          name: "Waiting for Postgres to be ready"
          command: dockerize -wait tcp://localhost:5432 -timeout 1m
      - run:
          name: "Fetch Resources"
          command: ./cloudquery fetch
      - run:
          name: "Check AWS Policy"
          command: ./cloudquery policy run aws --output-dir ./
      - store_artifacts:
          path: ./aws.json
          destination: aws-policy
```
Our new `policy` job is the same as the previous `build` job, except it has two additional steps. The first is the `Check AWS Policy` step, which runs the entire AWS policy set against the gathered data.
```yaml
- run:
    name: "Check AWS Policy"
    command: ./cloudquery policy run aws --output-dir ./
```
And because of the `--output-dir` flag, it creates an `aws.json` file with the results of the policy checks.
The final step stores the `aws.json` as an artifact so that after the execution of the check, and the subsequent destruction of the image, you can still review the results of the policy check:
```yaml
- store_artifacts:
    path: ./aws.json
    destination: aws-policy
```
Earlier, we mentioned running this as a nightly task; CircleCI workflows have a `triggers` key for this:
```yaml
# Invoke jobs via workflows
# See: https://circleci.com/docs/2.0/configuration-reference/#workflows
workflows:
  fetch-workflow:
    […]
  policy-check:
    triggers:
      - schedule:
          cron: "0 0 * * *"
          filters:
            branches:
              only:
                - main
    jobs:
      - build
      - policy
```
Once you add the `schedule` to the `triggers`, you can follow the `crontab` standard to decide when it’s executed, along with any other `filters` you may want to limit the execution by. This is a legacy method of automating within CircleCI; for a more up-to-date approach that requires a little more configuration, I’d use Scheduled Pipelines.
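As a variation on the schedule above, the five standard cron fields let you tighten the window; for example, a weekday-only run at 06:00 UTC against main might look like this (the workflow name and `policy` job are reused from the earlier examples):

```yaml
workflows:
  policy-check:
    triggers:
      - schedule:
          # minute hour day-of-month month day-of-week (times are UTC)
          cron: "0 6 * * 1-5"
          filters:
            branches:
              only:
                - main
    jobs:
      - policy
```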
## Summary
This blog should have covered all the aspects of using CloudQuery with CircleCI to automate your policy checks and make the results shareable with your teammates. If you’d like to discuss other ways to use CloudQuery in CI, you can join our Discord and talk to us about your needs. And if we’ve missed anything, we’d happily follow up this blog with more details, or even cover other CI/CD providers.