Running Distributed Queries

Explore the tools that Hazelcast offers for running distributed queries.

What is a Distributed Query?

In a distributed system such as Hazelcast, data is partitioned across different members in the cluster. As a result, to query that data you have two options:

  1. Request all data from members and iterate over it locally until you find what you’re looking for.

  2. Build a distributed query that each member can run so that you receive only the data you want.

Distributed queries allow you to request filtered data from members or external data sources without having to receive it all and iterate over it locally.
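
To make the difference between the two options concrete, the following is a minimal, self-contained Java sketch that simulates both. It does not use Hazelcast. The Member class, the sample data, and all method names are hypothetical and exist only for illustration. It counts how many entries leave the members in each case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class DistributedQuerySketch {

    /** Hypothetical stand-in for a cluster member that owns one slice of the data. */
    public static final class Member {
        private final List<Integer> localEntries;
        private int entriesSent; // how many entries have left this member

        public Member(List<Integer> localEntries) {
            this.localEntries = localEntries;
        }

        /** Option 1: send every local entry to the caller. */
        public List<Integer> fetchAll() {
            entriesSent += localEntries.size();
            return new ArrayList<>(localEntries);
        }

        /** Option 2: apply the filter locally and send only the matches. */
        public List<Integer> query(Predicate<Integer> filter) {
            List<Integer> matches = new ArrayList<>();
            for (Integer entry : localEntries) {
                if (filter.test(entry)) {
                    matches.add(entry);
                }
            }
            entriesSent += matches.size();
            return matches;
        }
    }

    /** Option 1 from the caller's side: pull everything, then filter locally. */
    public static List<Integer> fetchThenFilter(List<Member> cluster, Predicate<Integer> filter) {
        List<Integer> result = new ArrayList<>();
        for (Member member : cluster) {
            for (Integer entry : member.fetchAll()) {
                if (filter.test(entry)) {
                    result.add(entry);
                }
            }
        }
        return result;
    }

    /** Option 2 from the caller's side: each member filters its own data. */
    public static List<Integer> distributedQuery(List<Member> cluster, Predicate<Integer> filter) {
        List<Integer> result = new ArrayList<>();
        for (Member member : cluster) {
            result.addAll(member.query(filter));
        }
        return result;
    }

    /** Total number of entries that left the members of the simulated cluster. */
    public static int totalEntriesSent(List<Member> cluster) {
        int total = 0;
        for (Member member : cluster) {
            total += member.entriesSent;
        }
        return total;
    }

    /** Three simulated members holding three made-up entries each. */
    public static List<Member> sampleCluster() {
        List<Member> cluster = new ArrayList<>();
        cluster.add(new Member(Arrays.asList(10, 95, 30)));
        cluster.add(new Member(Arrays.asList(40, 50, 99)));
        cluster.add(new Member(Arrays.asList(70, 80, 91)));
        return cluster;
    }

    public static void main(String[] args) {
        Predicate<Integer> over90 = value -> value > 90;

        List<Member> first = sampleCluster();
        System.out.println(fetchThenFilter(first, over90) + " entries sent: " + totalEntriesSent(first));
        // prints "[95, 99, 91] entries sent: 9"

        List<Member> second = sampleCluster();
        System.out.println(distributedQuery(second, over90) + " entries sent: " + totalEntriesSent(second));
        // prints "[95, 99, 91] entries sent: 3"
    }
}
```

Both approaches return the same result, but in this simulation the distributed query moves three entries instead of nine. A real cluster also involves serialization and network cost, which this sketch leaves out.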

Available Tools

Hazelcast offers the following tools for running distributed queries, depending on your use case:

  • SQL: Use SQL syntax to query your cluster for data in map entries or query external systems such as Apache Kafka.

  • Predicates API: Use a client API to query your cluster for data in map entries.

    In addition to the Java-based Predicates API, Hazelcast also supports SQL statements and functions, and streaming SQL queries.

Table 1. Comparison of SQL and the Predicates API

|                                                         | SQL                   | Predicates API |
|---------------------------------------------------------|-----------------------|----------------|
| Can query data in external sources                      | Yes                   | No             |
| Can query nested object fields                          | Yes                   | Yes            |
| Can query nested object fields with cyclic dependencies | No                    | Yes            |
| Can query JSON data in map entries                      | Yes                   | Yes            |
| Supported Hazelcast clients                             | Java, Node.js, Python | All clients    |

Supported SQL Queries

Hazelcast supports the following queries with SQL.

| Query | Description |
|-------|-------------|
| Ad-hoc queries | Query large datasets in one or more systems, run aggregations on them, or both, to gain deeper insights. |
| Streaming queries (also known as continuous queries) | Keep an open connection to a streaming data source and run a continuous query to get near real-time updates. |
| Federated queries | Query different datasets, such as Kafka topics and Hazelcast maps, using a single query. Normally, querying in SQL is database- or dataset-specific. However, with mappings, you can pull information from different sources to present a more complete picture. |

Supported Predicates Queries

With the Predicates API, you can use the following queries:

  • Ad-hoc queries (also known as point queries or OLTP queries)

  • Batch queries (also known as OLAP queries)
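
As an illustration of an ad-hoc predicate query, the following minimal sketch starts an embedded member, fills a map, and asks the cluster for matching values only. It assumes Hazelcast 4.x or 5.x on the classpath (where IMap lives in the com.hazelcast.map package). The Employee class, the map name, the findActiveOlderThan helper, and the sample entries are hypothetical.

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.map.IMap;
import com.hazelcast.query.Predicate;
import com.hazelcast.query.Predicates;

import java.io.Serializable;
import java.util.Collection;

public class PredicateQuerySketch {

    /** Hypothetical value type; attributes are read by name through the getters. */
    public static class Employee implements Serializable {
        private final String name;
        private final int age;
        private final boolean active;

        public Employee(String name, int age, boolean active) {
            this.name = name;
            this.age = age;
            this.active = active;
        }

        public String getName() { return name; }
        public int getAge() { return age; }
        public boolean isActive() { return active; }
    }

    /** Builds the predicate and lets the members evaluate it against their own entries. */
    public static Collection<Employee> findActiveOlderThan(IMap<Integer, Employee> employees, int age) {
        Predicate<Integer, Employee> filter = Predicates.and(
                Predicates.equal("active", true),
                Predicates.greaterThan("age", age));
        return employees.values(filter);
    }

    public static void main(String[] args) {
        // Assumption: an embedded single-member cluster with default configuration.
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        try {
            IMap<Integer, Employee> employees = hz.getMap("employees");
            employees.put(1, new Employee("Ada", 36, true));
            employees.put(2, new Employee("Linus", 28, true));
            employees.put(3, new Employee("Grace", 45, false));

            for (Employee match : findActiveOlderThan(employees, 30)) {
                System.out.println(match.getName());
            }
        } finally {
            hz.shutdown();
        }
    }
}
```

With the sample entries above, only the entry for "Ada" is both active and older than 30, so it is the only value returned to the caller.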

If you need to do streaming or federated queries, use SQL.

To learn more about SQL in Hazelcast, see the following resources:

To learn more about the Predicates API, see the following resources: