Azure Blob Storage Plugin documentation

Official

Azure Blob Storage

This destination plugin lets you sync data from a CloudQuery source to remote Azure Blob Storage in various formats such as CSV, JSON, and Parquet.

Make sure to sign up and run cloudquery login before running your first sync.

Publisher: cloudquery

Repository: github.com

Latest version: v4.0.1

Type: Destination

Date Published: May 14, 2024

Price: Free


Overview

Azure Blob Storage Destination Plugin

This destination plugin lets you sync data from a CloudQuery source to remote Azure Blob Storage in various formats such as CSV, JSON, and Parquet.

Authentication

The plugin needs to be authenticated with your Azure account in order to write files to your storage account.

You can either authenticate with az login (when running locally), or by using a "service principal" and exporting environment variables (appropriate for automated deployments).

You can find out more about authentication with Azure at Azure's documentation for the Go SDK.

Example

This example configures an Azure Blob Storage destination that creates CSV files in https://cqdestinationazblob.blob.core.windows.net/test/path/to/files.

The (top level) spec section is described in the Destination Spec Reference.

kind: destination
spec:
  name: "azblob"
  path: "cloudquery/azblob"
  registry: "cloudquery"
  version: "v4.0.1"
  spec:
    storage_account: "cqdestinationazblob"
    container: "test"
    path: "path/to/files"

    format: "csv" # options: parquet, json, csv
    format_spec:
      # CSV-specific parameters:
      # delimiter: ","
      # skip_header: false

    # Optional parameters
    # compression: "" # options: gzip
    # no_rotate: false
    # batch_size: 10000
    # batch_size_bytes: 52428800 # 50 MiB
    # batch_timeout: 30s

The Azure Blob destination utilizes batching, and supports batch_size, batch_size_bytes and batch_timeout options (see below).

Azure Blob Spec

This is the (nested) spec used by the Azure Blob destination plugin.

storage_account (string) (required)
Storage account to which the files are synced.
container (string) (required)
Container inside the storage account to which the files are synced.
path (string) (required)
Path inside the above container where the files will be uploaded.
no_rotate (boolean) (optional) (default: false)
If set to true, the plugin will write to one file per table. Otherwise, for every batch a new file will be created with a different .<UUID> suffix.
format (string) (required)
Format of the output file. Supported values are csv, json and parquet.
format_spec (format_spec) (optional)
Optional parameters to change the format of the file.
compression (string) (optional) (default: empty)
Compression algorithm to use. Supported values are empty or gzip. Not supported for parquet format.
batch_size (integer) (optional) (default: 10000)
Number of records to write before starting a new object.
batch_size_bytes (integer) (optional) (default: 52428800 (50 MiB))
Number of bytes (as Arrow buffer size) to write before starting a new object.
batch_timeout (duration) (optional) (default: 30s (30 seconds))
Maximum interval between batch writes.
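
As an illustration of how these options combine, the following sketch writes gzip-compressed JSON and starts new objects more often than the defaults would. The storage account, container, and path values are hypothetical placeholders, and the batch values are arbitrary choices for illustration, not recommendations.

```yaml
kind: destination
spec:
  name: "azblob"
  path: "cloudquery/azblob"
  registry: "cloudquery"
  version: "v4.0.1"
  spec:
    # Hypothetical names: replace with your own storage account and container.
    storage_account: "examplestorageacct"
    container: "exports"
    path: "cloudquery/json"

    format: "json"
    compression: "gzip" # not supported when format is parquet

    no_rotate: false           # a new file per batch, each with a .<UUID> suffix
    batch_size: 5000           # records per object (default: 10000)
    batch_size_bytes: 10485760 # 10 MiB (default: 52428800, 50 MiB)
    batch_timeout: 60s         # maximum interval between batch writes (default: 30s)
```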

format_spec

delimiter (string) (optional) (default: ,)
Character that will be used as the delimiter if the format type is csv.
skip_header (boolean) (optional) (default: false)
If set to true, the header row is not written as the first line of the file (when format is csv).
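
A minimal sketch of a format_spec for CSV output, assuming semicolon-separated files are wanted. Only the nested spec is shown, and the storage account, container, and path values are hypothetical placeholders.

```yaml
spec:
  storage_account: "examplestorageacct" # hypothetical
  container: "exports"                  # hypothetical
  path: "cloudquery/csv"                # hypothetical

  format: "csv"
  format_spec:
    delimiter: ";"     # default is ","
    skip_header: false # default
```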