CloudQuery News

Announcing the CloudQuery BigQuery Source Integration

Michal Brutvan

Michal Brutvan

Introducing the new CloudQuery BigQuery source plugin!
BigQuery is a popular, fully managed, serverless data warehouse designed for large datasets. We have now added a BigQuery Source Integration that enables syncing tables and views from a dataset to any supported CloudQuery destination.
Data teams often use BigQuery for its scalability, ease of use, and integration with other Google Cloud services. It is commonly used alongside our GCP Source Integration. These teams utilize BigQuery for various data analytics tasks such as ad-hoc queries, data exploration, and large-scale data processing. However, sometimes you need to do an advanced analysis, run transformations, or consolidate the data in a different database. The new source plugin enables just that.
On top of syncing the individual tables and views from the source dataset, the new plugin supports queries at the source, thus enabling on-the-fly transformation of the data.

Example Configuration #

Set up CloudQuery locally, or use the CloudQuery Platform.
To sync a whole dataset to a PostgreSQL database, use the default configuration. Specify your Project ID, Dataset ID (name), and the connection string to the database. For other options and details on how to authenticate with BigQuery, see BigQuery Source Integration Documentation.
kind: source
spec:
  name: bigquery
  path: cloudquery/bigquery
  registry: cloudquery
  version: "v1.0.0"
  tables: ["*"]
  destinations: ["postgresql"]
  spec:
    project_id: ${PROJECT_ID}
    dataset_id: ${DATASET_ID}
---
kind: destination
spec:
  name: "postgresql"
  path: "cloudquery/postgresql"
  registry: "cloudquery"
  version: "v7.6.0"
  spec:
    connection_string: ${POSTGRESQL_CONNECTION_STRING}

Specifying a query #

In this example, assume a dataset usage with table usage_events having columns customer_id, timestamp, and miles. The individual entries represent a distance driven by a customer during a single driving session started at the timestamp.
We would like to get only the summary of the distance driven last month grouped by the customer and sync it to a PostgreSQL database. Here's what the configuration looks like with the SQL query included:
kind: source
spec:
  name: bigquery
  path: cloudquery/bigquery
  registry: cloudquery
  version: "v1.0.0"
  tables: ["*"]
  destinations: ["postgresql"]
  spec:
    project_id: ${PROJECT_ID}
    dataset_id: ${DATASET_ID}
    queries:
      - name: monthly_usage
      - query: >
          select customer_id, sum(miles) as usage from billing.usage_events
          where timestamp > current_timestamp - interval 30 day
          group by customer_id
---
kind: destination
spec:
  name: "postgresql"
  path: "cloudquery/postgresql"
  registry: "cloudquery"
  version: "v7.6.0"
  spec:
    connection_string: ${POSTGRESQL_CONNECTION_STRING}
This configuration will not sync the entire dataset. Instead, it will create only the monthly_usage table and populate it with the results of the query.
You can include as many queries as you want in one sync configuration.

Find more resources in Hub #

With this plugin, you can sync data from BigQuery to a destination of your choice. However, you can get data from more sources than just BigQuery. Check out our other database source integrations on the CloudQuery Hub.
If you'd like to start using the BigQuery source integration, download CloudQuery today. You can also discuss use cases and get feedback from other CloudQuery users in the CloudQuery Community.

Ready to dive deeper? Contact CloudQuery here or join the CloudQuery Community to connect with other users and experts.
You can also try out CloudQuery locally with our quick start guide or explore the CloudQuery Platform (currently in beta) for a more scalable solution.
Michal Brutvan

Written by Michal Brutvan

Michal is CloudQuery's senior product manager and has responsibility for new features and CloudQuery's product roadmap. He has had a wealth of product ownership roles and prior to that, worked as a software engineer.

Turn cloud chaos into clarity

Find out how CloudQuery can help you get clarity from a chaotic cloud environment with a personalized conversation and demo.

Join our mailing list

Subscribe to our newsletter to make sure you don't miss any updates.

Legal

© 2024 CloudQuery, Inc. All rights reserved.

We use tracking cookies to understand how you use the product and help us improve it. Please accept cookies to help us improve. You can always opt out later via the link in the footer.