CloudQuery News
AWS Cost Transformation is Now Available!
In need of a one-stop solution for tracking your AWS costs and usage? #
Finding valuable, actionable data among the many available AWS sources can be a daunting task. There are many disparate sources within AWS that are valuable on their own but, taken together, make it difficult to find and house your usage, cost, and utilization data.
That is why we have developed a new AWS Cost Policy. This policy contains transformations that leverage both CloudQuery-provided tables and your readily available Cost and Usage Report from AWS. When generating this report, please ensure that include resource IDs is selected for compatibility. We have simplified this process using CloudQuery's AWS source, S3 source, and PostgreSQL destination plugins.
With this new set of transformations, you will be able to ensure that your usage and any potential costs incurred are optimized to your needs.
What are some use cases for this policy? #
- Monitoring costs - quickly get a clear snapshot of any cost or expenditure incurred on different AWS resources, so you can decide how to optimize and configure those resources exactly to your needs
- Monitoring usage - ensure that the services you have activated in your AWS accounts are being utilized properly
- Compute optimization - gain actionable recommendations at a glance from the AWS cost optimizer data made available in your CloudQuery tables
- Usage optimization - gain actionable insights at a glance from the tables created by the policy
Let's walk through a few use case examples together #
Set up #
In order for this policy to run successfully, you must sync your AWS metadata with the CloudQuery AWS source plugin and your Cost and Usage data with the S3 source plugin, both into a PostgreSQL destination, as mentioned previously.
Download the CloudQuery CLI and use the configs below to sync the data required for this policy. Don't forget to set the specific bucket, region, and path to the files for S3, as well as a connection string for your PostgreSQL destination database.
kind: source
spec:
  name: s3
  version: v1.3.3
  destinations: ['postgresql']
  path: cloudquery/s3
  registry: cloudquery
  tables: ['*']
  spec:
    bucket: '<BUCKET_NAME>'
    region: '<REGION>'
    path_prefix: '<PATH_TO_YOUR_FILES>'
---
kind: source
spec:
  name: aws
  version: v24.3.3
  destinations: ['postgresql']
  path: cloudquery/aws
  tables: ['*']
---
kind: destination
spec:
  name: postgresql
  path: cloudquery/postgresql
  version: 'v7.3.6'
  spec:
    connection_string: postgresql://postgres:pass@localhost:5432/postgres
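Tip: you don't have to hard-code credentials in the file. CloudQuery configs support environment-variable expansion, so the connection string can be supplied at sync time. Here is a minimal sketch of the destination spec, assuming a PG_CONNECTION_STRING variable that you export yourself before running the sync:
kind: destination
spec:
  name: postgresql
  path: cloudquery/postgresql
  version: 'v7.3.6'
  spec:
    # PG_CONNECTION_STRING is a variable name of our choosing, exported in the
    # shell (e.g. export PG_CONNECTION_STRING=postgresql://...) before syncing
    connection_string: ${PG_CONNECTION_STRING}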
Run the sync with the command:
cloudquery sync config.yaml
Now that we have everything configured and synced, let's run the transformation from our cost policy directory using the command:
dbt run --vars '{"cost_usage_table": "<your table name here>"}'
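The cost_usage_table variable should point at the table the S3 sync created for your Cost and Usage Report. If you are unsure of its name, a quick look at the destination database can help; this is a minimal sketch, assuming the default public schema in the PostgreSQL destination configured above:
-- List the tables the syncs created, to spot your Cost and Usage table
select table_name
from information_schema.tables
where table_schema = 'public'
order by table_name;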
Once the dbt command has run successfully, you should see views starting with aws_cost__ or aws_usage in your database. For the list of all views and their documentation, see the AWS Cost transformation documentation.
Example queries #
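Before diving in, you can confirm that the views exist with a quick catalog query (a sketch against PostgreSQL's information_schema):
-- Confirm the policy's views were created
-- (note: _ is a LIKE wildcard, so this match is slightly loose)
select table_name
from information_schema.views
where table_name like 'aws_cost__%'
   or table_name like 'aws_usage%';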
Top 5 highest-spending accounts
Let's say you have a multi-tenant setup and you want to see which accounts are incurring the most cost. The view aws_cost__by_account houses this information. Here is how you can get the top 5 accounts that have incurred the most cost:
select * from aws_cost__by_account
order by cost desc
limit 5;
This query will show you the identifier for each of the top-spending accounts alongside how much those accounts have spent. Check the image below to view a sample result from this query.
Finding under-utilized resources
Another common stumbling block to gaining clear insight into your AWS costs is finding under-utilized resources in your AWS accounts that may be incurring costs.
EC2 is one of the most heavily used services in AWS, but it can be difficult to keep track of all your EC2 instances and ensure that they are optimized and performing as expected. With this new policy, you will have access to utilization data so you can be confident that your instances are utilized properly.
Here is a ready-made query to unearth potentially under-utilized instances:
SELECT *
FROM aws_cost__by_under_utilized_resources
WHERE service = 'EC2'
ORDER BY cost DESC
LIMIT 10;
This query is slightly more granular than our previous example. You will receive some specific information about your EC2 instances:
- service name - defaulted to 'EC2' for this example
- resource id - the identifier for each specific EC2 instance you have
- metric - the actual statistic being tracked; for EC2, that statistic is CPU utilization
- mean usage - the average CPU utilization for each EC2 instance
- max usage - the highest CPU utilization for each EC2 instance
- cost - the total cost incurred by the EC2 instance
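These fields also make it easy to sharpen the query. As a sketch (column names such as resource_id and mean_usage are assumed from the field descriptions above; check the view's schema for the exact names), you could surface instances whose average CPU utilization stays low but which still incur meaningful cost:
-- Hypothetical refinement: low average utilization, non-trivial cost.
-- Column names are assumed from the field descriptions above.
select resource_id, mean_usage, max_usage, cost
from aws_cost__by_under_utilized_resources
where service = 'EC2'
  and mean_usage < 10   -- under 10% average CPU utilization
order by cost desc;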
Visualizing your data
Along with the transformations, we have included a suite of visualizations using Grafana that leverage your newly transformed AWS costs. You can find documentation on how to set up your own visualization here. For now, let's check out some examples below:
General cost dashboard
Cost per service
Cost trends
Ready to dive deeper? Contact CloudQuery here or join the CloudQuery Community to connect with other users and experts. You can also try out CloudQuery locally with our quick start guide or explore the CloudQuery Platform (currently in beta) for a more scalable solution.
Written by Kevin Rheinheimer
Kevin is a senior data engineer at CloudQuery, specializing in cloud data infrastructure and application development.