Running Distributed Queries
Explore the tools that Hazelcast offers for running distributed queries.
What is a Distributed Query?
In a distributed system such as Hazelcast, data is partitioned across different members in the cluster. As a result, to query that data you have two options:
-
Request all data from members and iterate over it locally until you find what you’re looking for.
-
Build a distributed query that each member can run so that you receive only the data you want.
Distributed queries allow you to request filtered data from members or external data sources without having to receive it all and iterate over it locally.
Available Tools
Hazelcast offers the following tools for running distributed queries, depending on your use case:
-
SQL: Use SQL syntax to query your cluster for data in map entries or query external systems such as Apache Kafka.
-
Predicates API: Use a client API to query your cluster for data in map entries.
In addition to the Java-based Predicates API, Hazelcast also supports SQL statements and functions, and streaming SQL queries.
SQL | Predicates API | |
---|---|---|
Can query data in external sources |
Yes |
No |
Can query nested object fields |
No |
Yes |
Can query JSON data in map entries |
Yes |
Yes |
Supported Hazelcast clients |
Java Node.js Python |
All clients |
Supported SQL Queries
Hazelcast supports the following queries with SQL.
Query | Description |
---|---|
Ad-hoc queries |
Query large datasets either in one or multiple systems and/or run aggregations on them to get deeper insights. |
Streaming queries, also known as continuous queries. |
Keep an open connection to a streaming data source and run a continuous query to get near real-time updates. |
Federated queries |
Query different datasets such as Kafka topics and Hazelcast maps, using a single query. Normally, querying in SQL is database or dataset-specific. However, with mappings, you can pull information from different sources to present a more complete picture. |
Supported Predicates Queries
With the Predicates API, you can use the following queries:
-
Ad-hoc queries (also known as point queries or OLTP queries)
-
Batch querying (also known as OLAP queries)
If you need to do streaming or federated queries, use SQL. |