Terraform is great until you have to deal with state. Large state simply does not scale: the more your infrastructure grows, the more state there is to manage, connect, and otherwise understand.
Atmos is a tool for this and so much more. This article builds on my prior Terraform-to-OpenTofu encrypted local state example to introduce atmos for deploying my multi-state project. This time I'll switch over to the tofu-encrypted-atmos branch.
What is Atmos?
Atmos is an opinionated infrastructure deployment tool from the great minds at CloudPosse. This team has a long history of releasing incredible Terraform modules and of deep involvement in the DevOps community. Erik has been running a weekly podcast/open office hours covering all things DevOps for years now, and I highly recommend people check it out.
Anyway, the point is that the makers of this tool are really, really good at flinging Terraform around and have strong opinions on how to automate it. Atmos is the culmination of experience gained by being in the trenches with infrastructure automation. It might be comparable to Terragrunt, Terraform stacks (public beta), Cisco's stacks, or even CDKTF, which also has a stack concept.
Stack Origins: One might say that AWS CloudFormation is the origin story for treating infrastructure deployments as a 'stack'. It spawned the AWS CDK, which in turn produced the libraries used to generate some of the other CDKs that build upon the stack concept.
Understanding
As always, there is a short learning curve to grok the atmos view of the world. I'll distill it down as best I can (please read their docs for a deeper dive). Let's start by agreeing on vocabulary, as it will shortcut understanding of the atmos project structure layout. Here are some Terraform terms and their atmos equivalents.
| Terraform | Atmos | Description |
|---|---|---|
| root module | component | State lives here |
| multiple root modules | stack | Multiple components grouped together |
| group of available root modules | component library | Just a bunch of components in a logical group |
| child module | child module | Same as they ever were |
TIP: A stack is effectively an environment.
Atmos makes clever use of Terraform workspaces for each component defined in a stack. This is pretty efficient and entirely seamless. For example, with a dev workspace containing the localhost, cluster1, and cluster2 components and a prod workspace containing the baremetal and cluster1 components, we'd have one state element per component per workspace: 5 total state targets when completely deployed.
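The workspace-to-state mapping just described can be sketched as:

```text
dev stack  ── workspace "dev"
  ├── localhost  → state element (dev/localhost)
  ├── cluster1   → state element (dev/cluster1)
  └── cluster2   → state element (dev/cluster2)

prod stack ── workspace "prod"
  ├── baremetal  → state element (prod/baremetal)
  └── cluster1   → state element (prod/cluster1)
```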
NOTE: The component library concept allows for an additional vector to parameterize your deployments and enables dependency mapping between disparate Terraform states.
Adopting Atmos
In order to accommodate atmos I had to allow some previously git-ignored paths and trust that my OpenTofu encrypted state process was working properly.
NOTE: I effectively relinquished control of the local state file locations to atmos. I did end up adding a quick validation task to test that the state is encrypted for each deployment:
task test:state
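The validation task itself can stay simple. Here's a hedged sketch of what a `test:state` target might look like in a Taskfile.yml; the `"encryption"` key check is an assumption about how OpenTofu marks encrypted state, so adjust the marker to match your own encryption config:

```yaml
# Taskfile.yml (sketch): fail if any local state file is not encrypted.
# Assumes encrypted OpenTofu state contains a top-level "encryption" key;
# verify the marker against your own encrypted state output.
version: "3"
tasks:
  test:state:
    desc: Verify every local state file is encrypted
    cmds:
      - |
        for f in $(find components -name 'terraform.tfstate'); do
          grep -q '"encryption"' "$f" || { echo "UNENCRYPTED: $f"; exit 1; }
        done
```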
In my root modules (aka 'components') I had to remove all traces of the local backend configuration, as atmos overwrites it otherwise (causing an endless loop of backend state migration approval prompts). The atmos documentation covers a slew of different state management schemes that allow deeply customized workflows tailored to an organization's structure.
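For reference, the kind of block that had to go from each component looks like this (a sketch; the separate OpenTofu encryption configuration from the earlier article is untouched):

```hcl
# Removed from each component: an explicit local backend like this is
# overwritten by atmos, triggering a backend migration prompt on every run.
terraform {
  backend "local" {
    path = "terraform.tfstate"
  }
}
```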
I also changed the base folders to comply with the atmos way of doing things and ended up with a basic project structure like this:
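Reconstructed from the file paths used in this article, the layout looks roughly like this:

```text
.
├── atmos.yaml
├── components/
│   └── terraform/
│       ├── localhost/
│       ├── cluster1/
│       └── cluster2/
├── stacks/
│   ├── catalog/
│   │   └── localhost.yaml
│   ├── deploy/
│   │   └── dev.yaml
│   └── workflows/
│       └── localhost.yaml
└── secrets/
    └── dev/
```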
I made almost no real changes to my modules or my base Terraform, but I did move them, which required some validation. I also had to create the YAML scaffolding. This included the component library definition:
```yaml
# stacks/catalog/localhost.yaml
components:
  terraform:
    localhost:
      metadata:
        component: "localhost"
    cluster1:
      metadata:
        component: cluster1
      settings:
        depends_on:
          1:
            component: "localhost"
    cluster2:
      metadata:
        component: cluster2
      settings:
        depends_on:
          1:
            component: "localhost"
```
The stack definition:
```yaml
# yaml-language-server: $schema=https://atmos.tools/schemas/atmos/atmos-manifest/1.0/atmos-manifest.json
# stacks/deploy/dev.yaml
vars:
  stage: dev
import:
  - catalog/localhost
components:
  terraform:
    localhost:
      vars:
        env: "dev"
        clusters: ["cluster1", "cluster2"]
        secrets_path: "../../../secrets/dev"
    cluster1:
      vars:
        cluster_name: "cluster1"
        key_path: "../../../secrets/dev/"
        kubeconfig: "../../../secrets/dev/cluster1_config"
    cluster2:
      vars:
        cluster_name: "cluster2"
        key_path: "../../../secrets/dev/"
        kubeconfig: "../../../secrets/dev/cluster2_config"
```
And also the workflow to run them all as a set of additional atmos tasks.
```yaml
# stacks/workflows/localhost.yaml
name: localhost
description: Bring up and configure a few kind clusters
workflows:
  up:
    description: |
      Bring up the local environment
    steps:
      - command: terraform apply localhost -auto-approve
      - command: terraform apply cluster1 -auto-approve
      - command: terraform apply cluster2 -auto-approve
  down:
    description: |
      Tear it all down
    steps:
      - command: terraform destroy cluster2 -auto-approve
      - command: terraform destroy cluster1 -auto-approve
      - command: terraform destroy localhost -auto-approve
```
And finally the atmos.yaml definition that points the binary to tofu (installed via mise):
```yaml
# atmos.yaml
base_path: ""
components:
  terraform:
    base_path: "components/terraform"
    command: "tofu"
    apply_auto_approve: true
    deploy_run_init: true
    init_run_reconfigure: true
    auto_generate_backend_file: false
stacks:
  base_path: "stacks"
  included_paths:
    - "deploy/*.yaml"
  excluded_paths:
    - "**/_defaults.yaml"
  name_pattern: "{stage}"
workflows:
  base_path: stacks/workflows
logs:
  file: "/dev/stderr"
  level: Info
```
With this in place and a few custom Taskfile.yml tasks, the entire infrastructure can be brought up or down, with secure locally encrypted state, via 2 commands.
NOTE: I use the Taskfile for kicking things off, but all workflows and tasks can be run via the atmos CLI directly.
```shell
task up
task down
```
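The two task targets are thin wrappers. A hypothetical Taskfile.yml sketch, assuming the atmos workflows from the `localhost` workflow file defined earlier:

```yaml
# Taskfile.yml (sketch): wrap the atmos workflows in task targets
version: "3"
tasks:
  up:
    desc: Bring up the whole local environment
    cmds:
      - atmos workflow up -f localhost
  down:
    desc: Tear it all down
    cmds:
      - atmos workflow down -f localhost
```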
Impressions
Overall I'm appreciating how this tool succinctly wraps Terraform state operations into manageable and highly customizable deployments.
Pros
- There is an interactive TUI app that will delight you when you get your first atmos configuration working properly (if you struggled to get it working, the dopamine hit when the TUI pops up is incredible...).
- There is OPA policy validation as well as JSON Schema validation included. I love me some Rego!
- Just about every aspect can be configured via declarative configuration.
- The tool's authors are heavily involved with the community, making regular improvements and significant feature additions.
- You can add custom commands and workflows to replace your task-runner tool.
Cons
- At first, the proposed file structure for Atmos might feel unfamiliar, but that's only because so many projects lack a consistently enforced, programmatic structure. While any new system comes with a learning curve, these conventions ultimately reduce cognitive load and create efficiencies, especially in a team environment. Stick with it, and it'll 'click' before you know it!
- It felt quite difficult to get my existing deployment working with atmos. Then again, I got it all working in an afternoon, so 'difficult' may be relative here.
- Almost every aspect of a deployment can be customized, but it is often not readily apparent just where to change things.
- Additional workflows will need to be created per stack that you are deploying, to automate the deployment of all the components within.
- Just about every aspect can be configured via a slew of YAML.
Interesting
- I'm only mildly uncomfortable with the fact that when atmos runs it generates local .tfvar files in semi-deeply buried locations within the component folders. I added these to the .gitignore list, as I'm not certain they need to be there (and they do get recreated automatically).
- I dig that atmos supports vendoring (akin to Carvel's vendir app or the air-gap deployment tool zarf). But this is a manual affair of defining your vendored content.
- There are strong tie-ins with one of my other favorite declarative manifest deployment tools, helmfile.
I really like this tool and see myself using it to bootstrap some other local infrastructure via Terraform. How about you? What tools are you using to manage your terraform deployments?