aws
solutions
transformations

How to Optimize Resource Management with AWS ElastiCache Cluster Monitoring

Ron Shemesh

Ron Shemesh

Introduction #

Managing your cloud infrastructure efficiently is crucial for ensuring optimal performance and cost-effectiveness. AWS ElastiCache clusters, which provide in-memory data caching services, are a key component of many cloud architectures. Tracking changes in your ElastiCache node count and types within these clusters is essential for several reasons:
  1. Resource Optimization: Monitoring node counts and types helps you identify underutilized or overprovisioned clusters, allowing you to adjust resources and optimize costs.
  2. Performance Management: Changes in node count and types can impact the performance of your applications. Tracking these changes helps ensure your clusters are properly scaled to meet demand.
  3. Historical Analysis: By tracking node count and type changes over time, you can analyze trends and predict future resource requirements.
With CloudQuery, you can easily export your AWS data to a PostgreSQL database (or any other data destination), enabling comprehensive analysis and visualization. Ensure that the aws_elasticache_clusters table is included in your AWS source configuration. Set the write_mode to append in your PostgreSQL destination configuration to track historical states and changes in the number of nodes and types over time.

How to Track ElasticCache Node Count Changes #

The SQL query below tracks changes in ElasticCachenode count over time for clusters that are part of a replication group, excluding clusters with constant node counts. By focusing on clusters with varying node counts, it helps identify those that require attention due to resource adjustments. The query uses Common Table Expressions (CTEs) to organize the data and identify clusters with significant changes.
-- CTE: timeseriesdata
with timeseriesdata as (
    select
        _cq_sync_time as time,
        replication_group_id,
        region,
        account_id,
        count(*) as node_count
    from
        aws_elasticache_clusters
    where
        replication_group_id != '' or replication_group_id is not null
    group by
        time,
        account_id,
        region,
        replication_group_id
),
-- CTE: sum_changes
sum_changes as (
    select
        sum(timeseriesdata.node_count) as count,
        account_id,
        region,
        replication_group_id,
        time
    from
        timeseriesdata
    group by
        time,
        account_id,
        region,
        replication_group_id
),
-- CTE: identify_changes
identify_changes as (
    select
        replication_group_id
    from
        sum_changes
    group by account_id, region, replication_group_id
    having
        count(distinct count) > 1
)
-- Final Selection
select
    timeseriesdata.*
from
    timeseriesdata
where
    timeseriesdata.replication_group_id in (select distinct replication_group_id from identify_changes)
order by
    time;
This query identifies AWS ElastiCache clusters with node count changes over time, excluding those with constant counts. By tracking these changes, users can optimize resources, manage costs, and ensure their clusters are properly scaled for performance.
Steps:
  1. CTE: timeseriesdata
  • This CTE aggregates the number of nodes in each cluster over time.
  • It counts the number of nodes (node_count) in each cluster and groups the data by time, account_id, region, and replication_group_id.
  • The where clause ensures that only clusters with a valid replication_group_id are considered, ensuring that we focus on clusters that are part of a replication group.
  1. CTE: sum_changes
  • This CTE calculates the total node count for each cluster at each point in time.
  • It sums the node_count from the timeseriesdata CTE and groups the results by time, account_id, region, and replication_group_id.
  1. CTE: identify_changes
  • This CTE identifies clusters that have experienced changes in node count.
  • It groups the data by account_id, region, and replication_group_id.
  • The having clause ensures that only clusters with more than one distinct node count over time are selected, indicating a change in node count.
  1. Final Selection
  • The final select statement retrieves all columns from the timeseriesdata CTE.
  • It filters the results to include only the clusters identified in the identify_changes CTE.
  • The order by clause sorts the results by time to provide a chronological view of the node count changes.

Example Result #

timereplication_group_idregionaccount_idnode_count
2024-07-01 00:00:00rep_group_3us-east-1account_12
2024-07-01 00:00:00rep_group_2us-east-1account_11
2024-07-01 00:00:00rep_group_1us-east-1account_12
2024-07-01 01:00:00rep_group_2us-east-1account_12
2024-07-01 01:00:00rep_group_1us-east-1account_12
2024-07-01 02:00:00rep_group_1us-east-1account_11
2024-07-02 00:00:00rep_group_3us-east-1account_11
2024-07-02 00:00:00rep_group_1us-east-1account_11
2024-07-02 00:00:00rep_group_2us-east-1account_11
2024-07-02 01:00:00rep_group_1us-east-1account_11
2024-07-02 01:00:00rep_group_3us-east-1account_11
2024-07-02 01:00:00rep_group_2us-east-1account_11
2024-07-02 02:00:00rep_group_2us-east-1account_11
2024-07-02 02:00:00rep_group_1us-east-1account_11
The table above shows the historical data for clusters with changes in node count. Each row represents a record of the node count for a specific replication_group_id at a given time. This allows you to see how the node count has varied over time for each replication group, helping you identify trends and make informed decisions about resource allocation.

How to Track New ElasticCache Node Types Each Week #

Another useful query tracks changes in the types of nodes each week, showing only the new types of nodes introduced each week compared to the previous week. Again, use write_mode: append to save the historical states.
-- CTE: cluster_weeks
with cluster_weeks as (
    select
        date_trunc('week', _cq_sync_time) as week,
        cache_node_type,
        region,
        account_id
    from
        aws_elasticache_clusters
    where
        cache_node_type != ''
    group by
        week,
        cache_node_type,
        region,
        account_id
),
-- CTE: new_types
new_types as (
    select
        current_week.week as current_week,
        current_week.cache_node_type,
        current_week.region,
        current_week.account_id
    from
        cluster_weeks current_week
    left join
        cluster_weeks previous_week
    on
        previous_week.week = current_week.week - interval '1 week'
        and previous_week.cache_node_type = current_week.cache_node_type
        and previous_week.region = current_week.region
        and previous_week.account_id = current_week.account_id
    where
        previous_week.cache_node_type is null
)
-- Final Selection
select
    current_week,
    cache_node_type,
    region,
    account_id
from
    new_types
order by
    current_week, cache_node_type, region, account_id;
This query identifies new node types introduced each week for AWS ElastiCache clusters. By tracking these changes, users can identify newly introduced node types and manage resources more effectively.
  1. CTE: cluster_weeks
  • This CTE aggregates the node types for each cluster by week.
  • It selects the synchronization time (_cq_sync_time) truncated to the start of - the week (week), along with cache_node_type, region, and account_id.
  • It groups the data by week, cache_node_type, region, and account_id.
  1. CTE: new_types
  • This CTE identifies the new node types introduced each week by comparing the current week to the previous week.
  • It selects data from cluster_weeks for the current week (current_week) and performs a left join with cluster_weeks for the previous week (previous_week) based on week, cache_node_type, region, and account_id.
  • The where clause ensures that only node types not present in the previous week are included.
  1. Final Selection
  • The final select statement retrieves the current_week, cache_node_type, region, and account_id from the new_types CTE.
  • The order by clause sorts the results by current_week, cache_node_type, region, and account_id.

Example Result #

current_weekcache_node_typeregionaccount_id
2024-01-01 00:00:00cache.m5.largeus-east-1account2
2024-01-01 00:00:00cache.t2.microus-west-1account1
2024-01-08 00:00:00cache.r5.largeus-west-2account3
2024-01-15 00:00:00cache.t3.mediumus-west-1account1
The table above displays the new types of cache nodes introduced each week. Each row represents a new node type detected for a specific week, region, and account. This helps you identify changes in the types of resources being used, allowing for better tracking and management of resource updates and optimizations.

How to Track New ElastiCache Node Types by Day #

If you want to check the introduction of new node types on a daily basis rather than weekly, you can use the following query. This provides a more granular view of changes in your ElastiCache clusters, helping you monitor and react to changes more promptly.
WITH cluster_days AS (
    SELECT
        date_trunc('day', _cq_sync_time) AS day,
        cache_node_type,
        region,
        account_id
    FROM
        aws_elasticache_clusters
    WHERE
        cache_node_type != ''  -- Filter out empty cluster types
    GROUP BY
        day,
        cache_node_type,
        region,
        account_id
),
new_types AS (
    SELECT
        current_day.day AS current_day,
        current_day.cache_node_type,
        current_day.region,
        current_day.account_id
    FROM
        cluster_days current_day
    LEFT JOIN
        cluster_days previous_day
    ON
        previous_day.day = current_day.day - interval '1 day'
        AND previous_day.cache_node_type = current_day.cache_node_type
        AND previous_day.region = current_day.region
        AND previous_day.account_id = current_day.account_id
    WHERE
        previous_day.cache_node_type IS NULL  -- Identify new types
)
SELECT
    current_day,
    cache_node_type,
    region,
    account_id
FROM
    new_types
ORDER BY
    current_day, cache_node_type, region, account_id;

Summary #

Tracking ElastiCache node count and type changes in AWS ElastiCache clusters is crucial for optimizing resource management, improving performance, and conducting historical analysis. By using CloudQuery to export your AWS data to a PostgreSQL database, you can efficiently analyze and visualize these changes. If you want to gain immediate insights into your AWS environment, try CloudQuery for free) or connect with our engineering team for further assistance).
Subscribe to product updates

Be the first to know about new features.