Announcing the Java SDK for CloudQuery Integration Development
We're excited to announce the first release of a Java SDK for CloudQuery integration development! This SDK provides a high-level toolkit for developing CloudQuery plugins in Java.
Background #
CloudQuery is designed with a plugin-based architecture and uses Apache Arrow over gRPC for communication between plugins. Source and destination integrations are independent of one another, and this architecture allows integrations to be written in different languages but still communicate with one another.
Originally, we provided an SDK for writing integrations only in Go, but that is changing. We recently released the CloudQuery SDK for Python and the CloudQuery SDK for JavaScript, and now we are excited to add the next language in line: Java!
Features #
Plugin Server #
The most basic functionality provided by the Java SDK is to start a gRPC plugin server that supports all the flags expected by the CloudQuery CLI. This allows you to write an integration in Java and run it using the same command line interface as any other integration.
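The following example shows how to create an integration server that runs an integration called MyPlugin:

    import io.cloudquery.server.PluginServe;

    public class MainClass {
      public static void main(String[] args) {
        // MyPlugin is your own integration class (see "Plugin Class" below).
        MyPlugin plugin = new MyPlugin();
        // Pass the command-line arguments through so the CloudQuery CLI flags are honored.
        PluginServe pluginServe = PluginServe.builder().args(args).plugin(plugin).build();
        // Serve() runs the gRPC plugin server and returns the process exit code.
        int exitCode = pluginServe.Serve();
        System.exit(exitCode);
      }
    }
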
Plugin Class #
A CloudQuery Java source plugin, such as MyPlugin above, should extend the io.cloudquery.plugin.Plugin class and needs to implement the following three methods: newClient, tables and sync.
The newClient method is called when the integration is started, and is where you can do any initialization work.
The tables method should return a list of the tables that the integration supports.
The sync method is called when tables need to be synced. This is where the SDK scheduler can be used to manage the syncing of all the supported tables.
Check out our Bitbucket plugin for an example implementation.
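To make the division of responsibilities concrete, here is a minimal, self-contained sketch of the three-method shape described above. It deliberately does not use the SDK: SketchPlugin, ExamplePlugin, the String-based spec and the Consumer sink are all hypothetical stand-ins, and the real io.cloudquery.plugin.Plugin signatures differ (the real sync signature appears in the scheduler section of this post).

```java
import java.util.List;
import java.util.function.Consumer;

class PluginShapeSketch {
    // Hypothetical stand-in for the SDK base class; NOT io.cloudquery.plugin.Plugin.
    abstract static class SketchPlugin {
        // Called once when the integration starts; do initialization work here.
        abstract void newClient(String spec);

        // Returns the names of all tables the integration supports.
        abstract List<String> tables();

        // Called when tables need to be synced; emits one message per table to the sink.
        abstract void sync(List<String> tablesToSync, Consumer<String> sink);
    }

    // Hypothetical integration with two made-up tables.
    static class ExamplePlugin extends SketchPlugin {
        private String workspace; // set by newClient, required by sync

        @Override
        void newClient(String spec) {
            if (spec == null || spec.isBlank()) {
                throw new IllegalArgumentException("spec must not be empty");
            }
            this.workspace = spec.trim();
        }

        @Override
        List<String> tables() {
            return List.of("repositories", "workspaces");
        }

        @Override
        void sync(List<String> tablesToSync, Consumer<String> sink) {
            // Mirrors the "client not initialized" guard used by real plugins.
            if (workspace == null) {
                throw new IllegalStateException("newClient must be called before sync");
            }
            for (String table : tablesToSync) {
                sink.accept(workspace + "/" + table);
            }
        }
    }

    public static void main(String[] args) {
        ExamplePlugin plugin = new ExamplePlugin();
        plugin.newClient("my-workspace");
        plugin.sync(plugin.tables(), System.out::println);
    }
}
```

In a real integration, the body of sync would hand the tables to the SDK scheduler rather than loop over them directly.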
Multi-threaded Scheduler #
The scheduler's main responsibilities are to manage concurrent execution of requests and the order in which tables are synced to avoid dependency issues. It also places limits on the number of concurrent requests and memory usage.
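As a rough illustration of two of these responsibilities, the sketch below caps concurrency with a fixed-size thread pool and only starts child tables after their parent has finished. It is not the SDK's scheduler: SchedulerSketch, TableNode and the table names are hypothetical, the 10 ms sleep stands in for an API request, and memory limits are not modeled at all.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class SchedulerSketch {
    // Hypothetical table: a name plus the child tables that depend on it.
    static final class TableNode {
        final String name;
        final List<TableNode> children;

        TableNode(String name, TableNode... children) {
            this.name = name;
            this.children = List.of(children);
        }
    }

    private final ExecutorService pool; // pool size is the concurrency limit
    private final List<String> completed = Collections.synchronizedList(new ArrayList<>());
    private final AtomicInteger inFlight = new AtomicInteger();
    private final AtomicInteger peakInFlight = new AtomicInteger();

    SchedulerSketch(int concurrency) {
        this.pool = Executors.newFixedThreadPool(concurrency);
    }

    // Syncs all root tables and their descendants; returns names in completion order.
    List<String> sync(List<TableNode> roots) {
        try {
            CompletableFuture.allOf(
                roots.stream().map(this::syncTable).toArray(CompletableFuture[]::new)).join();
        } finally {
            pool.shutdown();
        }
        return new ArrayList<>(completed);
    }

    int peakInFlight() {
        return peakInFlight.get();
    }

    // Resolves the table first, then schedules its children, so parents always finish first.
    private CompletableFuture<Void> syncTable(TableNode table) {
        return CompletableFuture.runAsync(() -> resolve(table), pool)
            .thenCompose(done -> CompletableFuture.allOf(
                table.children.stream().map(this::syncTable).toArray(CompletableFuture[]::new)));
    }

    private void resolve(TableNode table) {
        int now = inFlight.incrementAndGet();
        peakInFlight.accumulateAndGet(now, Math::max);
        try {
            Thread.sleep(10); // stands in for an API request
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            completed.add(table.name);
            inFlight.decrementAndGet();
        }
    }

    public static void main(String[] args) {
        TableNode repositories = new TableNode("repositories", new TableNode("commits"));
        TableNode workspaces = new TableNode("workspaces", repositories);
        SchedulerSketch scheduler = new SchedulerSketch(2);
        System.out.println(scheduler.sync(List.of(workspaces, new TableNode("users"))));
        System.out.println("peak concurrent requests: " + scheduler.peakInFlight());
    }
}
```

Independent tables (here, workspaces and users) may complete in either order, but a child table such as commits never completes before its parent.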
To invoke the scheduler, the sync method of an integration should pass a list of its tables and options to the scheduler. The scheduler will take care of the rest. Here is an example from the CloudQuery Bitbucket integration:

    @Override
    public void sync(
        List<String> includeList,
        List<String> skipList,
        boolean skipDependentTables,
        boolean deterministicCqId,
        BackendOptions backendOptions,
        StreamObserver<Sync.Response> syncStream)
        throws SchemaException, ClientNotInitializedException {
      // The client is created in newClient; refuse to sync without it.
      if (this.client == null) {
        throw new ClientNotInitializedException();
      }

      // Narrow the supported tables down to the ones requested for this sync.
      List<Table> filtered = Table.filterDFS(allTables, includeList, skipList, skipDependentTables);

      // Hand the filtered tables and options to the scheduler, which runs the sync.
      Scheduler.builder()
          .client(client)
          .tables(filtered)
          .syncStream(syncStream)
          .deterministicCqId(deterministicCqId)
          .logger(getLogger())
          .concurrency(spec.getConcurrency())
          .build()
          .sync();
    }

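The example above narrows the table list with an include list and a skip list before scheduling. The sketch below shows one simplified way such filtering could work on flat table names with "*" wildcards. It is an assumption-laden illustration, not the SDK's Table.filterDFS: TableFilterSketch and the table names are hypothetical, and parent/child handling (including skipDependentTables) is left out entirely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class TableFilterSketch {
    // Converts a glob such as "bitbucket_*" into a regex; only '*' is treated as special.
    static Pattern globToPattern(String glob) {
        String[] parts = glob.split("\\*", -1); // -1 keeps trailing empty parts, e.g. for "a*"
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(parts[i]));
        }
        return Pattern.compile(regex.toString());
    }

    static boolean matchesAny(String name, List<String> globs) {
        for (String glob : globs) {
            if (globToPattern(glob).matcher(name).matches()) {
                return true;
            }
        }
        return false;
    }

    // Keeps a table if it matches at least one include glob and no skip glob.
    static List<String> filter(List<String> all, List<String> include, List<String> skip) {
        List<String> kept = new ArrayList<>();
        for (String name : all) {
            if (matchesAny(name, include) && !matchesAny(name, skip)) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> all = List.of("bitbucket_repositories", "bitbucket_workspaces", "bitbucket_users");
        System.out.println(filter(all, List.of("bitbucket_*"), List.of("*_users")));
    }
}
```

In this simplified version the skip list always wins over the include list, and the original table order is preserved.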
Docker for Cross-Platform Distribution #
To support cross-platform packaging of Java integrations, we introduced a new docker registry type to the CloudQuery CLI in v3.12.0. Whereas Go-based integrations are downloaded as binaries from GitHub releases, Java integrations are downloaded as Docker images from the specified Docker registry. This allows CloudQuery to support multiple platforms, and also makes it easier to distribute integrations that have dependencies on external libraries.
Start Creating Your Own Plugin #
Want to start writing your own integration? Here is our guide to get you started: https://www.cloudquery.io/docs/developers/creating-new-plugin/java-source.
Feedback #
We'd love to hear your feedback on the Java SDK. If you have any questions, comments, or suggestions, please feel free to reach out to us on the CloudQuery Community or GitHub.
Written by Michal Brutvan
Michal is CloudQuery's senior product manager and has responsibility for new features and CloudQuery's product roadmap. He has had a wealth of product ownership roles and prior to that, worked as a software engineer.