Automating Datadog with Terraform

Datadog is an awesome SaaS monitoring platform. We have 100+ developers leveraging the platform to collect their metrics, create dashboards and send alerts.

Datadog

As with anything, if you don’t maintain and clean your tools, after a while things can become a little messy. Dashboards get named wildly different things with no standards. Alerts aren’t deleted for decommissioned services. Team names change and alerts are suddenly pointing to the wrong Slack channel.

Something has to be done to improve the situation. Setting up some standards or rules around Datadog usage can help, but there is a fine line to walk between freedom and standardization. Be too strict or harsh on people and they no longer find the tool nice to use, instead thinking of it as a pain in the ass.

Take too much freedom away, and you get “shadow IT” situations with people using their own tools or going their own way.

With this in mind, I decided on a few goals:

Deciding on Terraform#

There were a few options around for managing Datadog from code:

I ended up deciding to go with Terraform mainly due to these two reasons:

  1. Being able to create timeboards using the same Terraform DSL / process.
  2. Terraform is also far more widely supported, which helps from a “googling of problems” perspective (Ansible is too).

Terraform

Terraform with Datadog Basics#

The Datadog Blog recently published a post called Managing Datadog with Terraform.

It covers the basics and gives you an introduction to Terraform. Once you have read it, head back over to this post for some more in-depth usage.

Repository Structure & Separation of Concerns#

With the workflow in mind, I set up the following repository structure:

Terminal window
├── mssql                  # directory for the application
│   ├── main.tf            # contains the Datadog monitors for the application
│   ├── terraform.tfvars   # variables to pass into the configuration (these are passed to the inputs)
│   └── vars.tf            # inputs for the terraform configuration
└── teamcity
    ├── main.tf
    ├── terraform.tfvars
    └── vars.tf

With this structure, you would run the terraform commands from inside the application’s directory:

Terminal window
cd mssql
terraform init    # downloads the Datadog provider plugin (needed on Terraform 0.10 and later)
terraform get
terraform plan
terraform apply

The terraform.tfstate file will be stored in the application’s directory, which means each application will have its own state file.

The reason for this is “separation of concerns” or reducing your “blast radius”. If you have 100 apps and someone makes a mistake, you don’t want Terraform to nuke the rest of the 100 apps and screw up their configuration or state.
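To make the per-application state explicit, here is a minimal sketch of a backend block that could sit in each application directory. The local backend is already Terraform's default, so this only spells out the behaviour described above; the file name backend.tf is an assumption for illustration and not part of the original repository.

```hcl
# mssql/backend.tf (hypothetical file name)
terraform {
  backend "local" {
    # Path is relative to the directory Terraform is run from (mssql/),
    # so every application keeps its own terraform.tfstate.
    path = "terraform.tfstate"
  }
}
```

Because the teamcity directory would carry its own copy, a mistake in one application's configuration cannot touch another application's state.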

Defining Applications and Teams#

Now that we have our repository structure, let’s zoom into a specific application, for example mssql.

terraform.tfvars#

The terraform.tfvars file is the standard file name for Terraform variable values. We will use these variables throughout the rest of our configuration.

mssql/terraform.tfvars
application_name = "Microsoft SQL Server" # The application name
application_owner = "Database Administrators" # Team that owns the application
notify = ["@pagerduty-mssql"] # Array of destinations that alerts go to

vars.tf#

The vars.tf file is a conventional file name for Terraform input variable declarations. This is where we define which variables can be passed into our main.tf to create the resources.

mssql/vars.tf
variable application_name {}
variable application_owner {}
variable datadog_api_key {}
variable datadog_app_key {}
variable notify {
  type = "list"
}

When you run Terraform, it will automatically find the terraform.tfvars file and use all the variables it knows about.

Terraform will then prompt you to input variables that it isn’t aware of. You can also set Terraform variables using environment variables or pass them in at the command line. More details on variables in Terraform can be found here.
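As a rough sketch of how to avoid those prompts, a variable declaration can carry a description and a default. The snippet below is an alternative form of two of the declarations in vars.tf, not an addition to them; the default values are simply the ones from terraform.tfvars and are used here for illustration.

```hcl
# Alternative declarations for mssql/vars.tf (illustrative sketch)
variable "application_owner" {
  description = "Team that owns the application"
  default     = "Database Administrators"
}

variable "notify" {
  description = "Destinations that alerts are sent to"
  default     = ["@pagerduty-mssql"]
}
```

A value set in terraform.tfvars, in a TF_VAR_ environment variable, or with -var on the command line still overrides the default.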

main.tf#

The main.tf is where the actual Terraform resources go.

This file will contain:

You can find the Datadog Terraform Provider documentation here.

Here is the full file:

mssql/main.tf
##################
# PROVIDER       #
##################
provider "datadog" {
  api_key = "${var.datadog_api_key}"
  app_key = "${var.datadog_app_key}"
}

##################
# DATADOG CHECKS #
##################
resource "datadog_monitor" "common_free_disk" {
  name         = "${var.application_owner} - ${var.application_name} - Common Disk Usage"
  type         = "metric alert"
  message      = "${var.application_name} disk usage on {{device.name}} ({{host.name}}) is high. Notify: ${join(" ", var.notify)}"
  query        = "avg(last_1h):system.disk.in_use{role:mssql-common} by {device,host} > 0.75"
  include_tags = true
}

resource "datadog_monitor" "datawarehouse_free_disk" {
  name         = "${var.application_owner} - ${var.application_name} - Data Warehouse Disk Usage"
  type         = "metric alert"
  message      = "${var.application_name} disk usage on {{device.name}} ({{host.name}}) is high. Notify: ${join(" ", var.notify)}"
  query        = "avg(last_1h):system.disk.in_use{role:mssql-datawarehouse} by {device,host} > 0.95"
  include_tags = true
}
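The join(" ", var.notify) call in the monitor messages turns the notify list into a single space-separated string of @-handles. A small sketch to see the result on its own is an output block like the one below; the output name notify_line is hypothetical and not part of the original configuration.

```hcl
# Hypothetical output, only to show what join() produces
output "notify_line" {
  value = "${join(" ", var.notify)}"
}
```

With notify = ["@pagerduty-mssql"] this yields "@pagerduty-mssql". If a second destination were added, for example a made-up "@slack-dba", the value would become "@pagerduty-mssql @slack-dba", which is the form Datadog expects at the end of a monitor message.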

The few main concepts for the main.tf file:

Running Terraform#

Now that we have our files set up, we can run Terraform.

Terminal window
cd mssql
terraform plan -var datadog_api_key="xxxxx" -var datadog_app_key="xxxxx"

Terraform will now tell you what actions will be taken against Datadog.

Terraform Plan

If you are happy with what it is going to do:

Terminal window
terraform apply -var datadog_api_key="xxxxx" -var datadog_app_key="xxxxx"

Terraform Apply

With that, you should now have your monitors created in Datadog.

Datadog Monitor

Setting Environment Variables for Terraform#

If you don’t want to pass the Datadog variables in each time, you can set the following environment variables:

Terminal window
# Linux
export TF_VAR_datadog_api_key=xxxxxx
export TF_VAR_datadog_app_key=xxxxxx

Terminal window
# Windows - Current Session
$env:TF_VAR_datadog_api_key = "xxxxxx"
$env:TF_VAR_datadog_app_key = "xxxxxx"
# Windows - Permanently
[Environment]::SetEnvironmentVariable("TF_VAR_datadog_api_key", "xxxxxx", "User")
[Environment]::SetEnvironmentVariable("TF_VAR_datadog_app_key", "xxxxxx", "User")

Types of Datadog Monitors#

Datadog provides many types of possible monitors you can create including host, metric, process etc.

Packer Branch Build

Creating monitors for all of them via Terraform requires knowing the query behind the monitor. These queries match up with the Datadog Monitor API.

Here are a few examples:
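As a hedged sketch of how other monitor types can look in Terraform, here are a process monitor and a host-style service check. The query strings should follow the formats described in the Datadog Monitor API, but check them against that documentation before applying. The process name sqlservr, the resource names, and the reuse of the role:mssql-common tag are assumptions for illustration.

```hcl
# Process monitor: alert when no matching process is running (sketch)
resource "datadog_monitor" "mssql_process" {
  name         = "${var.application_owner} - ${var.application_name} - Process Running"
  type         = "process alert"
  message      = "${var.application_name} process is not running. Notify: ${join(" ", var.notify)}"
  query        = "processes('sqlservr').over('role:mssql-common').rollup('count').last('5m') < 1"
  include_tags = true
}

# Host monitor: a service check on the Datadog agent heartbeat (sketch)
resource "datadog_monitor" "mssql_host_up" {
  name         = "${var.application_owner} - ${var.application_name} - Host Up"
  type         = "service check"
  message      = "${var.application_name} host has stopped reporting. Notify: ${join(" ", var.notify)}"
  query        = "\"datadog.agent.up\".over(\"role:mssql-common\").by(\"host\").last(2).count_by_status()"
  include_tags = true
}
```

Note that the service check query contains double quotes, so they have to be escaped inside the Terraform string.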

Wrapping Up#

Terraform is an awesome way to automate your infrastructure and services from code. Using Terraform to provision Datadog makes it easy to standardize, re-use and update your monitors.

The most important part of using Terraform is the upfront planning. This entails splitting resources into logical groups so the blast radius is small if something does explode.

I created a datadog-terraform-example repository with the code from this blog to get you started.

Good luck automating your Datadog!