How to Build an Open Source CSPM with CloudQuery, PostgreSQL, and Grafana
General Architecture of the CSPM #
- ETL (Extract-Transform-Load) Ingestion Layer: CloudQuery collects configuration data from your cloud providers, normalizes it, and loads it into the datastore.
- Datastore: PostgreSQL stores the synced cloud data in relational tables that can be queried with SQL.
- Transformations (Policies): dbt transformations run on top of the stored data and apply rules and policies that turn the raw tables into analysis-ready results.
- Data Visualization and Exploration Platform: Grafana visualizes the data in dashboards and reports so users can explore and understand their cloud environment.
What you will get from this setup #
- Raw SQL access to your entire cloud asset inventory.
- Multi-Cloud Asset Inventory: Ingest configuration from all your clouds to a single Datastore with a unified structure.
- Avoid 'yet another dashboard' fatigue: Reuse your existing BI/Visualization stack (Grafana in this example) to build an open-source CSPM.
Building the CSPM step-by-step #
Step 1: Install or deploy CloudQuery #
Step 2: Install Grafana #
- Self-hosted (local, docker, k8s): Official guide.
- SaaS/managed: Grafana.com
- AWS Managed Grafana: https://aws.amazon.com/grafana/
Step 3: Install dbt #
dbt looks for the profiles.yml file in the local directory and falls back to ~/.dbt/. Read more about profiles in the dbt documentation.
The profiles.yml file should look like this:
config:
  send_anonymous_usage_stats: False
  use_colors: True
aws_compliance: # this should match the profile name in your dbt_project.yml, see step 5.
  target: postgres
  outputs:
    postgres:
      type: postgres
      host: "your postgres host"
      user: "postgres user name"
      pass: "postgres password"
      port: 5432
      dbname: "database name"
      schema: public
      threads: 4
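The profile name has to agree between profiles.yml and dbt_project.yml, and a mismatch is easy to miss. The following is a minimal sketch of a pre-flight check, not part of dbt or CloudQuery: the helper names (project_profile, profile_names, profile_matches) and the sample file contents are hypothetical, and the parsing is a simple regular-expression scan of top-level keys rather than a full YAML parser.

```python
import re
from typing import Optional


def project_profile(dbt_project_text: str) -> Optional[str]:
    """Return the value of the top-level `profile:` key in dbt_project.yml text."""
    match = re.search(
        r"^profile:[ \t]*['\"]?([\w-]+)['\"]?[ \t]*(?:#.*)?$",
        dbt_project_text,
        re.MULTILINE,
    )
    return match.group(1) if match else None


def profile_names(profiles_text: str) -> set:
    """Return the unindented keys of profiles.yml text, skipping the `config` block."""
    keys = re.findall(r"^([\w-]+):", profiles_text, re.MULTILINE)
    return {key for key in keys if key != "config"}


def profile_matches(dbt_project_text: str, profiles_text: str) -> bool:
    """True when the project's profile name is defined in profiles.yml."""
    name = project_profile(dbt_project_text)
    return name is not None and name in profile_names(profiles_text)


# Hypothetical file contents, shortened for illustration.
SAMPLE_PROJECT = "name: aws_compliance\nprofile: aws_compliance\n"
SAMPLE_PROFILES = (
    "config:\n"
    "  use_colors: True\n"
    "aws_compliance: # profile name\n"
    "  target: postgres\n"
)

if __name__ == "__main__":
    print(profile_matches(SAMPLE_PROJECT, SAMPLE_PROFILES))
```

In practice the two strings would be read from the real files; the check only compares names and does not validate the connection settings.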
Step 4: Configure and run CloudQuery sync #
- aws_cloudwatch_alarms
- aws_cloudwatchlogs_metric_filters
- aws_ec2_network_acls
- aws_ec2_security_groups
- aws_sns_subscriptions
- aws_iam_credential_reports
- aws_iam_password_policies
- aws_iam_user_access_keys
- aws_iam_users
- aws_autoscaling_groups
- aws_cloudtrail_trail_event_selectors
- aws_cloudtrail_trails
- aws_codebuild_projects
- aws_config_configuration_recorders
- aws_apigateway_rest_api_stages
- aws_apigateway_rest_apis
- aws_apigatewayv2_api_routes
- aws_apigatewayv2_api_stages
- aws_apigatewayv2_apis
- aws_cloudfront_distributions
- aws_efs_access_points
- aws_elasticbeanstalk_environments
- aws_elbv1_load_balancers
- aws_elbv2_load_balancer_attributes
- aws_elbv2_load_balancers
- aws_iam_accounts
- aws_rds_clusters
- aws_s3_accounts
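Assuming the tables listed above are the ones the sync is expected to populate, it can be useful to confirm they exist before running the policies. The sketch below is a hypothetical helper (missing_tables is not a CloudQuery or dbt function) that compares a list of required table names against the names actually present in the database. The required list is abbreviated here and would be extended with the full list above.

```python
from typing import Iterable, List

# Abbreviated subset of the tables listed above; extend as needed.
REQUIRED_TABLES = [
    "aws_cloudtrail_trails",
    "aws_ec2_security_groups",
    "aws_iam_users",
    "aws_s3_accounts",
]


def missing_tables(required: Iterable[str], present: Iterable[str]) -> List[str]:
    """Return the required table names that are absent from `present`.

    `present` is assumed to come from a query such as:
        SELECT table_name FROM information_schema.tables
        WHERE table_schema = 'public';
    Comparison is case-insensitive and the order of `required` is preserved.
    """
    present_set = {name.lower() for name in present}
    return [name for name in required if name.lower() not in present_set]


if __name__ == "__main__":
    # Hypothetical result of the information_schema query after a partial sync.
    synced = ["aws_iam_users", "aws_cloudtrail_trails"]
    for table in missing_tables(REQUIRED_TABLES, synced):
        print(f"missing: {table}")
```

An empty result means every required table was found; any name that is returned points to a table that should be added to the sync configuration.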
Step 5: Run policies (CSPM - Cloud Security Posture Management) #
Go to the directory where the dbt project file (dbt_project.yml) resides. Check that the profile: aws_compliance entry matches the profile name in the profiles.yml file that you configured in step 3. Then run:
dbt compile
dbt run
You can then query the results with psql, and of course visualize them in your favorite BI tool.
Step 6: Visualize in Grafana #
Locate the dashboard.json file in the extracted directory (in aws_compliance/grafana/postgres) and import it into your Grafana instance.
Summary #
Written by Yevgeny Pats
Yevgeny Pats is the Co-Founder & CEO at CloudQuery. Prior to establishing CloudQuery, he successfully founded and exited other startups. He has a background in software engineering and cybersecurity.