CloudQuery Integration Configuration
A CloudQuery sync fetches data from cloud accounts (sources) and writes it to one or more destinations. A sync requires at least one source configuration and one destination configuration. Configuration files are written in YAML and can either be split across multiple files or combined into one.
Learn more about how syncs work and integration architecture.
Example using multiple files
One option is to maintain configuration for your source and destination integrations in separate files.
Here is an example with only one source and one destination integration:

    # aws.yml
    kind: source
    spec:
      name: aws
      path: cloudquery/aws
      registry: cloudquery
      version: "v33.20.0"
      tables: ["aws_s3_buckets"]
      destinations: ["postgresql"]

    # postgresql.yml
    kind: destination
    spec:
      name: postgresql
      path: cloudquery/postgresql
      registry: cloudquery
      version: "v8.14.7"
      spec:
        connection_string: ${PG_CONNECTION_STRING}

With these two files, we can run a sync using:

    cloudquery sync aws.yml postgresql.yml

Adding another source
To add a gcp source as well, create its configuration in a new file:

    # gcp.yml
    kind: source
    spec:
      name: gcp
      path: cloudquery/gcp
      registry: cloudquery
      version: "v22.0.0"
      tables: ["gcp_storage_buckets"]
      destinations: ["postgresql"]

And now sync both aws and gcp to postgresql in a single command:

    cloudquery sync aws.yml gcp.yml postgresql.yml

Example using one file
You can also combine sources and destinations into a single file by separating the sections with ---:

    # config.yml
    kind: source
    spec:
      name: aws
      path: cloudquery/aws
      registry: cloudquery
      version: "v33.20.0"
      tables: ["aws_s3_buckets"]
      destinations: ["postgresql"]
    ---
    kind: destination
    spec:
      name: postgresql
      path: cloudquery/postgresql
      registry: cloudquery
      version: "v8.14.7"
      spec:
        connection_string: ${PG_CONNECTION_STRING}

Now we can run a sync using:

    cloudquery sync config.yml

This example shows only two integration sections, but a configuration file is allowed to contain any number of integration sections.
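
As a sketch of a file with more than two sections, the following combines the aws and gcp sources from the multi-file example with the postgresql destination. The versions and table names are simply the ones already used on this page; treat them as placeholders for the versions you actually run.

```yaml
# config.yml: two sources and one destination in a single file
kind: source
spec:
  name: aws
  path: cloudquery/aws
  registry: cloudquery
  version: "v33.20.0"
  tables: ["aws_s3_buckets"]
  destinations: ["postgresql"]
---
kind: source
spec:
  name: gcp
  path: cloudquery/gcp
  registry: cloudquery
  version: "v22.0.0"
  tables: ["gcp_storage_buckets"]
  destinations: ["postgresql"]
---
kind: destination
spec:
  name: postgresql
  path: cloudquery/postgresql
  registry: cloudquery
  version: "v8.14.7"
  spec:
    connection_string: ${PG_CONNECTION_STRING}
```

Both sources name the same destination, so a single cloudquery sync config.yml run would write the AWS and GCP tables to the same PostgreSQL database.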
Source spec fields
These are the top-level fields available under spec for a kind: source block. For full documentation including examples, see Source Integration Reference.
| Field | Type | Default | Description |
|---|---|---|---|
| name | string | — | Required. Unique name for this source. |
| path | string | — | Required. Plugin path, e.g. cloudquery/aws. |
| registry | string | cloudquery | Plugin registry: cloudquery, github, local, grpc, docker. |
| version | string | — | Plugin version (required for cloudquery/github registries). |
| tables | list | — | Required. Tables to sync. Accepts wildcards, e.g. ["aws_ec2_*"]. |
| skip_tables | list | [] | Tables to skip. Useful with wildcards. Skipping a parent also skips its children. |
| skip_dependent_tables | bool | true | When true, dependent tables are only synced if explicitly listed in tables. Set to false to restore pre-v6.0.0 behavior where matched parents pulled in all descendants. |
| destinations | list | — | Required. Names of destination plugins to write to. |
| backend_options | object | — | Enables incremental (stateful) syncs. See Managing Incremental Tables. |
| deterministic_cq_id | bool | false | When true, generates _cq_id as a hash of primary keys instead of a random UUID. Useful for deduplication across syncs. |
| otel_endpoint | string | — | (Preview) OpenTelemetry OTLP/HTTP endpoint for sync traces. See Monitoring. |
| otel_endpoint_insecure | bool | false | (Preview) Skip TLS verification for the OTel endpoint. |
| spec | object | — | Plugin-specific configuration. See each plugin’s documentation. |
| docker_registry_auth_token | string | — | Auth token for private Docker registries. See Source Integration Reference. |
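
To illustrate how several of these fields combine, here is a minimal sketch of a source that selects tables with a wildcard and then narrows the selection. The skipped table name is an illustrative assumption, not something taken from this page; check the plugin's table list before relying on it.

```yaml
kind: source
spec:
  name: aws
  path: cloudquery/aws
  registry: cloudquery
  version: "v33.20.0"
  # Wildcard from the table above: match every table starting with aws_ec2_.
  tables: ["aws_ec2_*"]
  # Exclude one of the matched tables (name is an assumption for illustration).
  skip_tables: ["aws_ec2_images"]
  # Default value, shown explicitly: dependent tables are synced only if listed in tables.
  skip_dependent_tables: true
  # Derive _cq_id from primary keys instead of a random UUID.
  deterministic_cq_id: true
  destinations: ["postgresql"]
```

Leaving skip_dependent_tables at true keeps the synced set limited to what the wildcard matches; setting it to false would pull in descendants of matched parents, as described in the table.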
backend_options fields
The backend_options object enables state tracking for incremental syncs:

    backend_options:
      table_name: cq_state_aws # table to store incremental state
      connection: "@@plugins.postgresql.connection" # destination to use for state

| Field | Type | Description |
|---|---|---|
| table_name | string | Required. Table name for storing incremental state. |
| connection | string | Required. @@plugins.<name>.connection or a gRPC address. |
See Managing Incremental Tables for full examples.
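
For context, the fragment above sits under spec in a kind: source block. The sketch below shows one plausible placement; it assumes that the name in @@plugins.postgresql.connection refers to the destination named postgresql, and that aws_cloudtrail_events is an incremental table (an assumption; confirm in the plugin's table documentation).

```yaml
kind: source
spec:
  name: aws
  path: cloudquery/aws
  registry: cloudquery
  version: "v33.20.0"
  # Assumed to be an incremental table; state is only useful for such tables.
  tables: ["aws_cloudtrail_events"]
  destinations: ["postgresql"]
  backend_options:
    table_name: cq_state_aws                        # table to store incremental state
    connection: "@@plugins.postgresql.connection"   # reuse the postgresql destination for state
---
kind: destination
spec:
  name: postgresql   # must match the <name> used in @@plugins.<name>.connection
  path: cloudquery/postgresql
  registry: cloudquery
  version: "v8.14.7"
  spec:
    connection_string: ${PG_CONNECTION_STRING}
```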
Destination spec fields
These are the top-level fields available under spec for a kind: destination block. For full documentation, see Destination Integration Reference.
| Field | Type | Default | Description |
|---|---|---|---|
| name | string | — | Required. Unique name for this destination. |
| path | string | — | Required. Plugin path, e.g. cloudquery/postgresql. |
| registry | string | cloudquery | Plugin registry: cloudquery, github, local, grpc, docker. |
| version | string | — | Plugin version (required for cloudquery/github registries). |
| write_mode | string | overwrite-delete-stale | How to handle existing rows: overwrite-delete-stale, overwrite, append. |
| migrate_mode | string | safe | Schema migration behavior: safe (non-destructive) or forced (allows column drops). |
| pk_mode | string | default | Primary key behavior: default (use plugin PKs) or cq-id-only (use _cq_id only). |
| sync_group_id | string | — | Groups multiple syncs together. Supports placeholders: {{SYNC_ID}}, {{YEAR}}, {{MONTH}}, {{DAY}}, {{HOUR}}, {{MINUTE}}. |
| send_sync_summary | bool | false | When true, writes a sync summary row to the cloudquery_sync_summaries table after each sync. |
| transformers | list | [] | Names of transformer plugins to run on data before writing. See Transformer Integrations. |
| spec | object | — | Plugin-specific configuration. See each plugin’s documentation. |
| docker_registry_auth_token | string | — | Auth token for private Docker registries. See Destination Integration Reference. |
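
As a rough sketch of how the write and migration settings fit together, the destination below overrides the default write mode and turns on sync summaries. The combination is only an example of valid values from the table, not a recommended setup.

```yaml
kind: destination
spec:
  name: postgresql
  path: cloudquery/postgresql
  registry: cloudquery
  version: "v8.14.7"
  # Override the default (overwrite-delete-stale) and append rows on every sync.
  write_mode: append
  # Default value, shown explicitly: non-destructive schema migrations only.
  migrate_mode: safe
  # Write a summary row to cloudquery_sync_summaries after each sync.
  send_sync_summary: true
  spec:
    connection_string: ${PG_CONNECTION_STRING}
```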
sync_group_id example
sync_group_id is useful for identifying which sync run produced a set of rows, or for partitioning data by time:

    kind: destination
    spec:
      name: postgresql
      # ...
      sync_group_id: "aws-sync-{{YEAR}}-{{MONTH}}-{{DAY}}"

Removed source fields
The following source fields were removed in CLI v3.6.0. They are no longer supported and cause a validation error if present. Remove them from any older configuration files.
| Removed field | Replacement |
|---|---|
| concurrency | Configure concurrency at the plugin level. See the plugin’s own documentation. |
| table_concurrency | Configure concurrency at the plugin level. |
| resource_concurrency | Configure concurrency at the plugin level. |
| scheduler | Configure the scheduler at the plugin level. See the plugin’s own documentation. |
| backend | Use backend_options instead. |
| backend_spec | Use backend_options instead. |
Additional configuration features
- Transformer integrations: In addition to kind: source and kind: destination, you can also configure kind: transformer integrations that process data between sources and destinations. See Transformer Integrations for details.
- Registry options: The registry field supports multiple values beyond cloudquery, including local, grpc, and docker. See Source Integrations for all available options.
- Variable substitution: Configuration files support environment variables (${ENV_VAR}), file contents (${file:./path}), and time-based values (${time:5 days ago}). See Environment Variables for details.
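
As an illustration of variable substitution, the sketch below shows two alternative ways to supply the PostgreSQL connection string. The environment variable is the one used earlier on this page; the file path is hypothetical and only demonstrates the ${file:...} syntax.

```yaml
kind: destination
spec:
  name: postgresql
  path: cloudquery/postgresql
  registry: cloudquery
  version: "v8.14.7"
  spec:
    # Option 1: substitute the value of an environment variable.
    connection_string: ${PG_CONNECTION_STRING}
    # Option 2 (alternative): substitute the contents of a file.
    # The path below is a hypothetical example.
    # connection_string: ${file:./secrets/pg_connection_string}
```

Either form keeps the secret out of the configuration file itself, which makes the file safer to commit to version control.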
Next Steps
- AWS to PostgreSQL guide - end-to-end walkthrough using the AWS and PostgreSQL configuration shown above
- Syncs - understand how syncs work, including write modes and incremental tables
- Source integration reference - full configuration options for source specs
- Destination integration reference - full configuration options for destination specs
- Transformer integration reference - configure pre-load transformations
- Managing Incremental Tables - stateful syncs with backend_options
- Environment Variables - variable substitution, file references, and time-based values
- cloudquery sync CLI reference - command-line options and flags
- cloudquery validate-config CLI reference - validate configuration files before running a sync
- Deployment guides - run CloudQuery in Docker, Kubernetes, ECS, GitHub Actions, and more