How to run CloudQuery syncs within Argo Workflows
Setting up your Argo Workflow #
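If you don't already have Argo Workflows installed, a minimal local setup might look like the sketch below. It assumes the `argo` namespace (which the rest of this guide uses) and that you substitute `<version>` with a current Argo Workflows release:

```bash
# Create the namespace that the workflows will run in
kubectl create namespace argo

# Install the Argo Workflows controller and server
# (substitute <version> with a release tag from the Argo Workflows releases page)
kubectl apply -n argo -f https://github.com/argoproj/argo-workflows/releases/download/<version>/install.yaml
```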
Setting up MongoDB #

For this tutorial, you'll need a MongoDB instance running on your host machine, reachable from the workflow's containers via the host's special hostname `host.docker.internal`.
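If you don't have one handy, a quick option is the official Docker image. This is a minimal sketch; the container name and image tag are arbitrary, and it assumes a Docker Desktop-style setup where `host.docker.internal` resolves from inside the cluster:

```bash
# Run MongoDB locally, exposing the default port 27017 to the host
docker run -d --name mongodb -p 27017:27017 mongo:7
```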
How to configure CloudQuery #
- Add a file called `cloudquery-source.yml` with these contents. Information on how to configure each CloudQuery plugin can be found on the plugin page:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cloudquery-source
data:
  source.yaml: |
    kind: source
    spec:
      name: "xkcd"
      path: "cloudquery/xkcd"
      version: "v1.0.6"
      tables: ['*']
      destinations:
        - mongodb
```
This configuration tells CloudQuery:

- To use the XKCD source plugin with version v1.0.6 from the CloudQuery Hub as the source.
- To sync all available source tables.
- To sync them to a destination named "mongodb", which will be specified in a separate YAML file.
- Apply the ConfigMap with `kubectl apply -n argo -f cloudquery-source.yml`
- Add a file called `cloudquery-destination.yml` with these contents:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cloudquery-destination
data:
  destination.yaml: |
    kind: destination
    spec:
      name: "mongodb"
      path: "cloudquery/mongodb"
      registry: "cloudquery"
      version: "v2.5.6"
      spec:
        connection_string: "mongodb://host.docker.internal:27017"
        database: "xkcd"
```
This configuration tells CloudQuery:

- To use the MongoDB destination plugin with version v2.5.6 from the CloudQuery Hub as the destination.
- To connect to MongoDB via the host's special hostname, and store the data in a database called `xkcd`.
- Apply the ConfigMap with `kubectl apply -n argo -f cloudquery-destination.yml`
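As an optional sanity check (not part of the original steps), you can confirm that both ConfigMaps exist before wiring them into a Workflow:

```bash
# Both cloudquery-source and cloudquery-destination should be listed
kubectl get configmaps -n argo
```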
How to authenticate to CloudQuery with an API key #
- Follow these steps to generate a CloudQuery API key: https://docs.cloudquery.io/docs/deployment/generate-api-key
- Create a file called `cloudquery-apikey.yml` with a Secret containing the API key:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: cloudquery-apikey
type: Opaque
data:
  CLOUDQUERY_API_KEY: ***REDACTED***
```
- Apply the Secret with `kubectl apply -n argo -f cloudquery-apikey.yml` (don't forget to base64-encode the API key first).
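Kubernetes Secret `data` values must be base64-encoded. A quick way to produce the encoded value, with `<your-api-key>` standing in for the real key:

```bash
# -n prevents a trailing newline from being encoded into the value
echo -n '<your-api-key>' | base64
```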
How to run CloudQuery within an Argo Workflow #

Add a file called `workflow.yml` with these contents:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: cloudquery-sync-
spec:
  entrypoint: cloudquery-sync
  volumes:
    - name: config
      projected:
        sources:
          - configMap:
              name: cloudquery-source
          - configMap:
              name: cloudquery-destination
  templates:
    - name: cloudquery-sync
      inputs:
        parameters:
          - name: version
            value: "v6.5.0"
      container:
        # Pin the CLI version via the template parameter for reproducible runs
        image: "ghcr.io/cloudquery/cloudquery:{{inputs.parameters.version}}"
        args: ["sync", "/mnt/config"]
        env:
          - name: CLOUDQUERY_API_KEY
            valueFrom:
              secretKeyRef:
                name: cloudquery-apikey
                key: CLOUDQUERY_API_KEY
        volumeMounts:
          - name: config
            mountPath: /mnt/config
```
This Workflow will:

- Hook up the source & destination ConfigMaps and the Secret we previously defined.
- Start up a CloudQuery Docker image.
- Run the CloudQuery sync, which will send the XKCD comics to MongoDB.
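Since the image tag is driven by the `version` input parameter, you can pin a different CloudQuery CLI version at submission time with the `argo` CLI's `-p` flag, for example:

```bash
# Override the CloudQuery CLI version for this run
argo submit -n argo workflow.yml -p version=v6.5.0
```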
To open the Workflows UI, port-forward the Argo Server with `kubectl -n argo port-forward service/argo-server 2746:2746`. If you then open `localhost:2746` in the browser, you can see the Workflows UI with no workflows created yet. Workflows will be created in the `argo` namespace, so visit `/workflows/argo` to see them.
Running CloudQuery within the Argo Workflow #

Submit the Workflow using the `argo` CLI: `argo submit -n argo workflow.yml`
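If you prefer the terminal over the UI, the `argo` CLI can also watch the run and tail its logs; `@latest` refers to the most recently submitted Workflow:

```bash
# Watch the Workflow's progress until it completes
argo watch -n argo @latest

# Tail the logs from the CloudQuery container
argo logs -n argo @latest --follow
```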
The final result #
You can now use `mongosh` to inspect the newly created `xkcd` database on your host computer:
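For example, this one-liner lists the collections CloudQuery created (a sketch; the collection names depend on the tables the source plugin syncs):

```bash
# Connect to the xkcd database and list its collections
mongosh mongodb://localhost:27017/xkcd --eval 'db.getCollectionNames()'
```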
From here, you can:

- Explore our extensive collection of plugins (https://hub.cloudquery.io/plugins/source).
- Find the right plugin for the technology stack your team works with.
- Simplify your ELT implementations by integrating CloudQuery syncs into your existing Argo Workflows.
Written by Mariano Gappa
Mariano is a software engineer working at CloudQuery, with 15 years of experience in the industry. He specializes in performance work, and his optimizations have significantly reduced CloudQuery's sync times.