Introducing Transformer Plugins

Mariano Gappa

At CloudQuery, we've become pretty good at providing a simple interface for ELT, with a wealth of sources and destinations and comprehensive compliance transformations. However, there is one feature that the CloudQuery Developer Community has repeatedly asked for: the ability to transform your data as it's being loaded.
The new Transformer Plugins address these common requests from our developer community:
  • Removing unneeded fields that waste space in your data destinations.
  • Obfuscating fields that contain sensitive personally identifiable information (PII) about the user's clients.
  • Prefixing all tables created by the sync.
There are also more advanced requests involving customized transformations, sometimes using proprietary transformation stacks.
Today, we're unveiling a new type of plugin alongside Source Plugins and Destination Plugins: the Transformer Plugin! 🤖

Transformer Plugins

Transformer plugins sit in the middle of the pipeline between sources and destinations, allowing both content and schema transformations on the data as it passes through.
They are supported in CloudQuery CLI v6.3.0 and later.

Supporting multiple destinations

Transformers are configured per-destination, so you can apply different transformers to different destinations.

Configuring transformations

Today, we're releasing the first transformer plugin, enabling users to perform the most requested transformations with a simple YAML-based configuration interface.
Given an AWS source:
kind: source
spec:
  name: "aws"
  path: cloudquery/aws
  registry: cloudquery
  version: "v27.8.0"
  destinations: ["postgresql"]
  tables: ["*"]
  spec:
And a Postgres destination:
kind: destination
spec:
  name: "postgresql"
  path: "cloudquery/postgresql"
  registry: "cloudquery"
  version: "v8.0.7"
  write_mode: "overwrite-delete-stale"
  transformers:
    - basic # we add the basic transformer here
  spec:
    connection_string: "..."
With the following transformer, you can obfuscate, remove, and add columns, and change table names in the destination database:
kind: transformer
spec:
  name: "basic"
  registry: cloudquery
  path: "cloudquery/basic"
  version: "v1.0.0"
  spec:
    transformations:
      - kind: obfuscate_columns
        tables: ["aws_secretsmanager_secrets"]
        columns: ["kms_key_id"]

      - kind: remove_columns
        tables: ["aws_secretsmanager_secrets"]
        columns: ["rotation_rules", "policy"]

      - kind: add_column
        tables: ["*"]
        name: "source"
        value: "cq_sync"

      - kind: change_table_names
        tables: ["*"]
        new_table_name_template: "cq_{{.OldName}}"
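
Because transformers are attached to destinations rather than sources, one sync can deliver differently shaped data to each destination. The following is a minimal sketch, not a tested configuration. It assumes the basic plugin can be declared more than once under different names, and the names `pii_scrub`, `prefix_tables`, `postgresql_analytics`, and `postgresql_raw` are hypothetical. Versions and transformation kinds are reused from the example above.

```yaml
# Hypothetical transformer that only obfuscates a sensitive column.
kind: transformer
spec:
  name: "pii_scrub"
  registry: cloudquery
  path: "cloudquery/basic"
  version: "v1.0.0"
  spec:
    transformations:
      - kind: obfuscate_columns
        tables: ["aws_secretsmanager_secrets"]
        columns: ["kms_key_id"]
---
# Hypothetical transformer that only prefixes table names.
kind: transformer
spec:
  name: "prefix_tables"
  registry: cloudquery
  path: "cloudquery/basic"
  version: "v1.0.0"
  spec:
    transformations:
      - kind: change_table_names
        tables: ["*"]
        new_table_name_template: "cq_{{.OldName}}"
---
# Hypothetical destination name; receives obfuscated data.
kind: destination
spec:
  name: "postgresql_analytics"
  path: "cloudquery/postgresql"
  registry: "cloudquery"
  version: "v8.0.7"
  write_mode: "overwrite-delete-stale"
  transformers:
    - pii_scrub
  spec:
    connection_string: "..."
---
# Hypothetical destination name; receives prefixed tables.
kind: destination
spec:
  name: "postgresql_raw"
  path: "cloudquery/postgresql"
  registry: "cloudquery"
  version: "v8.0.7"
  write_mode: "overwrite-delete-stale"
  transformers:
    - prefix_tables
  spec:
    connection_string: "..."
```

For both destinations to receive data, the source's `destinations` list would need to name each of them.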

Advanced transformation use cases

It's early days, and we're still working on more advanced transformers. As is the case for source and destination plugins, we’re also enabling the CloudQuery community to develop their own transformer plugins through a straightforward interface. Stay tuned for the upcoming guide on developing custom transformer plugins. In the meantime, feel free to reach out to us on our CloudQuery Community Discord with questions and feature requests for our new transformations framework.

Written by Mariano Gappa

Mariano is a software engineer working at CloudQuery with 15 years of experience in the industry. He specializes in performance, and his work has reduced sync times and significantly improved CloudQuery's performance.
