How to Build a CSPM with Grafana and CloudQuery

Cloud adoption and usage of cloud security tooling have exploded over the last few years. With that comes a need for managing the security posture of cloud computing usage. Grafana and CloudQuery are partnering to showcase how you can build an extensible Cloud Security Posture Management (CSPM) solution to assist with securing cloud infrastructure.

In this tutorial, we will guide you through building a CSPM solution using Docker Compose for a local development environment. You’ll use CloudQuery, PostgreSQL, dbt, and Grafana to create an integrated system that simplifies security compliance and monitoring. By following along, you'll learn how to set up each component and build customizable dashboards, and you'll see how the pieces fit together so that you can take the same design from a local environment to production.

The Architecture of a CSPM #

You can think of a CSPM as having the following three components:

ELT Layer #

The ELT (Extract, Load, Transform) layer is a crucial component of a CSPM architecture. It extracts data from various cloud resources, loads it into a centralized storage system, and transforms it into a structured format for analysis. This layer turns raw data into actionable insights, enabling effective monitoring and compliance. Because it can handle large volumes of data, the ELT layer supports timely security assessments and policy enforcement within the CSPM framework. In this setup, CloudQuery extracts the data from platform APIs and loads it into the data warehouse. To do this, CloudQuery uses source plugins to interact with each service’s API and destination plugins to store the data in a database.

Queries and Insights #

Storing the data in a data warehouse based on a solution like Postgres enables users to build complex queries and derive insights from their data. For example, industry standards such as CIS and PCI-DSS define rules and best practices that can be used to indicate the security posture of cloud infrastructure. Using these, you can generate findings and actionable insights on cloud infrastructure that identify risks and vulnerabilities. To make life easier, CloudQuery offers prebuilt transformations using dbt.

Analytics, Alerting, and Visualization #

The third component of a CSPM is the visualization and presentation of the infrastructure data and the queries and actionable insights. This can be useful to prioritize remediation work, gain an overall understanding of security posture across cloud infrastructure, and even help with reporting and compliance. With Grafana’s easy-to-build dashboards, alerting, and visualization features, it is a logical choice here. The fastest way to adopt Grafana is through Grafana Cloud, which includes a scalable managed backend for metrics, logs, and traces. CloudQuery features prebuilt dashboards for Grafana.

Building the CSPM #

Now let’s get into building an integrated CSPM solution using Docker Compose, CloudQuery, PostgreSQL, dbt, and Grafana.

To keep things simple, we’re going to build a local development environment using Docker Compose. However, in a production environment, we recommend that you use the available cloud offerings for each component (CloudQuery Platform, any managed Postgres service, dbt Cloud, and Grafana Cloud) to reduce the operational workload involved in hosting them yourself.

Note: If you have any questions or encounter an issue when following along with this post, the best place to get help is to join the CloudQuery Community.

Getting Started With CloudQuery #

To get started with CloudQuery, you will need to sign up for a CloudQuery Platform account. Once you have a CloudQuery account, you’ll need to go to Team Settings and then API Keys to generate a key. This key will enable your CloudQuery instance to download plugins and fetch the licenses for any premium plugins (as needed).

Setting up CloudQuery, Postgres, and Grafana with Docker Compose #

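Now that you have that in place, let’s create a docker_compose.yml file. Docker Compose looks for compose.yaml or docker-compose.yml by default, so with this file name you need to pass -f docker_compose.yml to the docker compose commands used below (or name the file docker-compose.yml instead).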

Note: You can find final versions of all of these configuration files at the end of this post.

The first thing you need to define is the services:

services:
  cloudquery:
    image: ghcr.io/cloudquery/cloudquery:latest
    environment:
      CLOUDQUERY_API_KEY: YOUR_API_KEY_GOES_HERE
      AWS_ACCESS_KEY_ID: YOUR_ACCESS_KEY_ID
      AWS_SECRET_ACCESS_KEY: YOUR_SECRET_ACCESS_KEY
      AWS_SESSION_TOKEN: YOUR_SESSION_TOKEN
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_HOST: db
      POSTGRES_DB: cspm
    command:
      - 'sync'
      - '/cloudquery_config.yml'
    configs:
      - cloudquery_config.yml
  db:
    image: postgres
    restart: unless-stopped
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: cspm
    ports:
      - '5432:5432'
    volumes:
      - db:/var/lib/postgresql/data
  dbt:
    image: ghcr.io/dbt-labs/dbt-postgres:1.7.2
    depends_on:
      - db
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_HOST: db
      POSTGRES_DB: cspm
    command:
      - 'run'
  grafana:
    image: grafana/grafana
    restart: unless-stopped
    depends_on:
      - db
    ports:
      - 3000:3000
    volumes:
      - grafana:/var/lib/grafana
configs:
  cloudquery_config.yml:
    file: ./config.yml
volumes:
  db:
    driver: local
  grafana:
    driver: local

This Docker Compose file defines the CloudQuery job, the Postgres database instance, an instance of dbt, and an instance of Grafana. Make sure you replace the CLOUDQUERY_API_KEY placeholder with the API key you generated earlier.

For this example, you’ll be using the AWS plugin, for which you’ll need:

AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_SESSION_TOKEN

These can all be retrieved by following the AWS documentation.
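
If you would rather not paste secrets into the Compose file, one option is to let Docker Compose substitute them from your shell environment or from a .env file stored next to the Compose file. The fragment below is a minimal sketch of that approach for the cloudquery service only; the variable names on the right-hand side are an assumption (they simply mirror the keys above), and the rest of the service definition stays as shown earlier.

```yaml
# Sketch: read credentials from the environment instead of hard-coding them.
# Compose replaces ${VAR} with the value from your shell or from a .env file.
services:
  cloudquery:
    image: ghcr.io/cloudquery/cloudquery:latest
    environment:
      # ${VAR:?message} makes Compose stop with the message if VAR is unset or empty
      CLOUDQUERY_API_KEY: ${CLOUDQUERY_API_KEY:?set CLOUDQUERY_API_KEY first}
      AWS_ACCESS_KEY_ID: ${AWS_ACCESS_KEY_ID}
      AWS_SECRET_ACCESS_KEY: ${AWS_SECRET_ACCESS_KEY}
      AWS_SESSION_TOKEN: ${AWS_SESSION_TOKEN}
```

This keeps the Compose file safe to commit, as long as the .env file itself stays out of version control.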

Note: If you want us to cover GCP or Azure CSPMs, let us know on our Community.

Next, look at the configs declaration at the bottom of the Docker Compose file, which points CloudQuery at its configuration:

configs:
  cloudquery_config.yml:
    file: ./config.yml

This tells Docker Compose to supply the configuration file (that you’ll create in the next step) to the CloudQuery container.

The final piece of the Docker Compose file is the volumes declaration, which lets you keep your Grafana state and Postgres database even if the containers are stopped:

volumes:
  db:
    driver: local
  grafana:
    driver: local

Building the CloudQuery Config #

Now that you have the Docker Compose file ready, it’s time to write the CloudQuery configuration file. You can pick up the basic configuration for your chosen cloud platform from the CloudQuery Hub. For this tutorial, you’re using the CloudQuery AWS plugin. In the contents menu on the left-hand side of the plugin page, you’ll see Configuration. Clicking it takes you down to the basic example. Copy that into a new file called config.yml.
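
For orientation, a source section based on the basic example has roughly the shape below. This is a sketch assembled from the final config.yml at the end of this post, not a copy of the current Hub example, so treat the version number as a placeholder and check the Hub for the latest release.

```yaml
# Sketch of a minimal CloudQuery source section for the AWS plugin.
kind: source
spec:
  name: aws
  path: cloudquery/aws
  registry: cloudquery
  version: 'v26.4.0' # version used in this post; the Hub may list a newer one
  tables:
    - 'aws_ec2_instances' # the default table from the basic example
  destinations: ['postgresql'] # must match the name of the destination section
```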

To build a full-fledged CSPM, you will need more tables than the default aws_ec2_instances for your compliance dashboards, so replace the tables line with the following:

tables:
  - 'aws_apigateway_rest_api_stages'
  - 'aws_apigatewayv2_api_stages'
  - 'aws_apigatewayv2_api_routes'
  - 'aws_autoscaling_groups'
  - 'aws_codebuild_projects'
  - 'aws_config_configuration_recorders'
  - 'aws_cloudwatch_alarms'
  - 'aws_cloudtrail_trail_event_selectors'
  - 'aws_cloudwatchlogs_metric_filters'
  - 'aws_cloudfront_distributions'
  - 'aws_iam_accounts'
  - 'aws_iam_credential_reports'
  - 'aws_iam_password_policies'
  - 'aws_iam_users'
  - 'aws_ec2_network_acls'
  - 'aws_ec2_security_groups'
  - 'aws_efs_access_points'
  - 'aws_elasticbeanstalk_environments'
  - 'aws_elbv1_load_balancers'
  - 'aws_elbv2_load_balancers'
  - 'aws_rds_clusters'
  - 'aws_sns_subscriptions'
  - 'aws_s3_accounts'

Next, you’ll need a destination plugin, so head back to the CloudQuery Hub, click Explore, and then Destinations. For this example, you’ll be using PostgreSQL, so find that using the search or by scrolling down the list. However, you can sync your AWS data to any other destination, and if your database isn’t there, you can build your own custom plugin! At the bottom of the configuration file, add a new line that contains --- and paste in the example configuration for the Postgres plugin. The end of the file should look something like this:

# enable_api_level_tracing: false
---
kind: destination
spec:
  name: 'postgresql'
  path: 'cloudquery/postgresql'
  registry: 'cloudquery'
  version: 'v8.0.8'
  spec:
    connection_string: 'postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@${POSTGRES_HOST}:5432/${POSTGRES_DB}?sslmode=disable'

And with that, the CloudQuery Config is ready. Now from your terminal, you can run docker compose up. This will start the Postgres and Grafana instances and will run the CloudQuery job until it is complete, but the dbt instance will fail.

Prebuilt Transformations (Queries and Frameworks) #

The dbt container failed because you haven’t given it any configuration yet. So let’s fix that next.

To make this easier, CloudQuery offers many data transformations, including security and compliance frameworks such as PCI-DSS, CIS, and Foundational Security Best Practices, as dbt projects. To start, go to the Transformations section of CloudQuery Hub and select AWS Compliance. Download the pack and extract it to your project folder. Next, you’ll need to add a volumes declaration to the dbt element of your Docker Compose file. In this, you’ll tell Docker where to mount the dbt project folder. It should look something like this:

volumes:
  - type: bind
    source: ./cloudquery_transformation_aws-compliance-free_vX.X.X
    target: /usr/app

Note: If you’re copying this sample directly into your Docker Compose file, make sure you set the version number to match the one you’ve downloaded.

Now, you need to provide dbt with a profile. dbt uses profiles to define how to connect to the relevant databases for the project. To do this, add a configs declaration below the volumes one. It should look something like this:

configs:
  - source: dbt-profiles.yml
    target: /root/.dbt/profiles.yml

Finally, you need to define the dbt-profiles.yml file itself:

config:
  send_anonymous_usage_stats: False
  use_colors: True

aws_compliance:
  target: postgres
  outputs:
    postgres:
      type: postgres
      host: "{{ env_var('POSTGRES_HOST') }}"
      user: "{{ env_var('POSTGRES_USER') }}"
      pass: "{{ env_var('POSTGRES_PASSWORD') }}"
      port: 5432
      dbname: "{{ env_var('POSTGRES_DB') }}"
      schema: public
      threads: 4
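
The configs entry on the dbt service refers to a config named dbt-profiles.yml, and Docker Compose expects every config used by a service to be declared in the top-level configs block as well. A minimal sketch of that block, assuming you saved the profile as dbt-profiles.yml next to the Compose file:

```yaml
# Top-level configs block of the Compose file, covering both services.
configs:
  cloudquery_config.yml:
    file: ./config.yml
  dbt-profiles.yml:
    file: ./dbt-profiles.yml # assumed location: same folder as the Compose file
```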

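Run docker compose up again to launch Postgres and Grafana and to (re)run CloudQuery and the dbt transformations. Note that docker compose up dbt on its own would only start the dbt container and the database it depends on.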
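
The dbt models read the tables that CloudQuery writes, so dbt should run only after the sync has finished, and the sync should start only once Postgres accepts connections. The Compose file in this post does not enforce that order. One way to do so, shown here as a sketch rather than as part of the original setup, is a health check on the database plus depends_on conditions. The interval, timeout, and retry values are arbitrary assumptions.

```yaml
# Sketch: extra keys to merge into the existing service definitions.
services:
  db:
    healthcheck:
      # pg_isready ships with the postgres image and reports whether the server accepts connections
      test: ['CMD-SHELL', 'pg_isready -U postgres -d cspm']
      interval: 5s
      timeout: 5s
      retries: 10
  cloudquery:
    depends_on:
      db:
        condition: service_healthy # start the sync only once Postgres is ready
  dbt:
    depends_on:
      cloudquery:
        condition: service_completed_successfully # run dbt after the sync exits cleanly
```

With these conditions in place, starting the dbt service also pulls in the database and the CloudQuery sync in the right order.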

Load Grafana Dashboard #

Now for the dashboard that pulls information dynamically from our PostgreSQL database. Fortunately, CloudQuery provides a range of pre-built dashboards in CloudQuery Hub. As you’re using AWS in this proof of concept, select the AWS Compliance visualization and then click Download Now and extract the zip file.

The Grafana instance that launched as part of your Docker Compose setup should be available at localhost:3000. If you haven’t already, you might need to set a password for the admin account before proceeding. To import the prepared dashboard, open the hamburger menu at the top left of the window and click Dashboards. Then click the blue New button at the top right and select Import. In the extracted zip file, navigate to build>aws_compliance>grafana>postgres, where you’ll find a file called compliance.json. Drag this into the Upload dashboard JSON file region and click Import. This will load your dashboard and default to the Foundational Security Best Practices framework.
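
The dashboard needs a PostgreSQL data source that points at the db container. You can add one by hand in the Grafana UI, or you can provision it from a file. The sketch below assumes a hypothetical file called grafana-datasource.yml that you mount into the Grafana container under /etc/grafana/provisioning/datasources/. The data source name is also an assumption, and the exact field layout (for example, whether database sits under jsonData) can differ between Grafana versions, so check it against the documentation for the version you run.

```yaml
# grafana-datasource.yml (hypothetical file name)
# Sketch of a provisioned PostgreSQL data source for the local Compose setup.
apiVersion: 1
datasources:
  - name: CSPM Postgres # assumed name; pick whatever you like
    type: postgres
    access: proxy
    url: db:5432 # the Compose service name resolves inside the Compose network
    user: postgres
    jsonData:
      database: cspm
      sslmode: disable # matches sslmode=disable in the CloudQuery connection string
    secureJsonData:
      password: postgres
```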

Building custom dashboards #

A prepared dashboard is a good starting point, but one of the key benefits of using Grafana is the flexibility you have in how you display your data. Perhaps you want to see an overview of all the Policy Pass/Fail Distributions at the same time, or a dashboard that shows only the failing policy results so you know what to prioritize during the next standup. With a little extra work in dbt, you could even build a time series chart to show off your team's improvements. After all, raising the visibility of security issues without sounding like a squeaky wheel is how you get the time and resource allocations you need to fix them.

Conclusion #

By following this tutorial, you have successfully built an extensible Cloud Security Posture Management (CSPM) solution powered by CloudQuery, PostgreSQL, dbt, and Grafana. You’ve learned how to:

Sync AWS infrastructure data using CloudQuery.
Store and manage data in PostgreSQL.
Transform data with dbt.
Visualize data through customizable dashboards in Grafana.

This setup not only enhances your cloud security but also provides valuable insights and flexibility for monitoring compliance.

Ready to get started with CloudQuery? You can try out CloudQuery locally with our quick start guide or explore the CloudQuery Platform (currently in beta) for a more scalable solution.

Want help getting started? Join the CloudQuery community to connect with other users and experts, or message our team directly here if you have any questions.

FAQs #

Q: What is CSPM? A: Cloud Security Posture Management (CSPM) is a solution that helps manage the security posture of cloud infrastructure by continuously monitoring, identifying risks, and ensuring compliance with industry standards.

Q: Why use Docker Compose for building the CSPM solution? A: Docker Compose allows for easy setup and management of the different components in isolated containers, simplifying the local development environment.

Q: What role does CloudQuery play in this setup? A: CloudQuery extracts data from AWS infrastructure and loads it into PostgreSQL, making it available for transformation and visualization.

Q: How does dbt fit into the CSPM architecture? A: dbt transforms the raw data stored in PostgreSQL into structured formats that can be easily analyzed and visualized in Grafana.

Q: What kind of visualizations can be created in Grafana with this setup? A: Grafana can create dashboards that show policy compliance status, security posture overviews, and time series charts of infrastructure improvements.

Q: What is the purpose of the CloudQuery configuration file? A: The CloudQuery configuration file specifies which data tables to sync from AWS and the connection details for the PostgreSQL database.

Q: Is there a scalable option for using CloudQuery? A: Yes, you can explore the CloudQuery Platform, currently in beta, for a more scalable solution.

Code Samples #

docker_compose.yml #

services:
  # One-off job: syncs AWS data into Postgres, then exits
  cloudquery:
    image: ghcr.io/cloudquery/cloudquery:latest
    environment:
      CLOUDQUERY_API_KEY: YOUR_API_KEY_GOES_HERE
      AWS_ACCESS_KEY_ID: YOUR_ACCESS_KEY_ID
      AWS_SECRET_ACCESS_KEY: YOUR_SECRET_ACCESS_KEY
      AWS_SESSION_TOKEN: YOUR_SESSION_TOKEN
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_HOST: db
      POSTGRES_DB: cspm
    command:
      - 'sync'
      - '/cloudquery_config.yml'
    configs:
      - cloudquery_config.yml
  db:
    image: postgres
    restart: unless-stopped
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: cspm
    ports:
      - '5432:5432'
    volumes:
      - db:/var/lib/postgresql/data
  # One-off job: runs the AWS Compliance dbt project against Postgres
  dbt:
    image: ghcr.io/dbt-labs/dbt-postgres:1.7.2
    depends_on:
      - db
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_HOST: db
      POSTGRES_DB: cspm
    command:
      - 'run'
    volumes:
      - type: bind
        source: ./cloudquery_transformation_aws-compliance-free_vX.X.X
        target: /usr/app
    configs:
      - source: dbt-profiles.yml
        target: /root/.dbt/profiles.yml
  grafana:
    image: grafana/grafana
    restart: unless-stopped
    depends_on:
      - db
    ports:
      - 3000:3000
    volumes:
      - grafana:/var/lib/grafana
configs:
  cloudquery_config.yml:
    file: ./config.yml
  dbt-profiles.yml:
    file: ./dbt-profiles.yml
volumes:
  db:
    driver: local
  grafana:
    driver: local

config.yml #

CloudQuery configuration file.

kind: source
spec:
  # Source spec section
  name: aws
  path: cloudquery/aws
  registry: cloudquery
  version: "v26.4.0"
  tables:
    - "aws_apigateway_rest_api_stages"
    - "aws_apigatewayv2_api_stages"
    - "aws_apigatewayv2_api_routes"
    - "aws_autoscaling_groups"
    - "aws_codebuild_projects"
    - "aws_config_configuration_recorders"
    - "aws_cloudwatch_alarms"
    - "aws_cloudtrail_trail_event_selectors"
    - "aws_cloudwatchlogs_metric_filters"
    - "aws_cloudfront_distributions"
    - "aws_iam_accounts"
    - "aws_iam_credential_reports"
    - "aws_iam_password_policies"
    - "aws_iam_users"
    - "aws_ec2_network_acls"
    - "aws_ec2_security_groups"
    - "aws_efs_access_points"
    - "aws_elasticbeanstalk_environments"
    - "aws_elbv1_load_balancers"
    - "aws_elbv2_load_balancers"
    - "aws_rds_clusters"
    - "aws_sns_subscriptions"
    - "aws_s3_accounts"
  destinations: ["postgresql"]
---
kind: destination
spec:
  name: "postgresql"
  path: "cloudquery/postgresql"
  registry: "cloudquery"
  version: "v8.0.8"

  spec:
    connection_string: "postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@${POSTGRES_HOST}:5432/${POSTGRES_DB}?sslmode=disable"

dbt-profiles.yml #

config:
  send_anonymous_usage_stats: False
  use_colors: True

aws_compliance:
  target: postgres
  outputs:
    postgres:
      type: postgres
      host: "{{ env_var('POSTGRES_HOST') }}"
      user: "{{ env_var('POSTGRES_USER') }}"
      pass: "{{ env_var('POSTGRES_PASSWORD') }}"
      port: 5432
      dbname: "{{ env_var('POSTGRES_DB') }}"
      schema: public
      threads: 4
