This is a prerelease version.

Distributed Data Structures

Hazelcast offers distributed implementations of many common data structures. For each of the client languages, Hazelcast mimics as closely as possible the natural interface of the structure. So, for example in Java, the map follows java.util.Map semantics. In the descriptions below, we mention each structure’s Java equivalent interface. All of these structures are usable from Java, .NET, C++, Node.js, Python, and Go.

  • Standard utility collections

    • Map is the distributed implementation of java.util.Map. It lets you read from and write to a Hazelcast map with methods such as get and put.

    • Queue is the distributed implementation of java.util.concurrent.BlockingQueue. You can add an item in one member and remove it from another one.

    • Ringbuffer is implemented for reliable eventing system.

    • Set is the distributed and concurrent implementation of java.util.Set. It does not allow duplicate elements and does not preserve their order.

    • List is similar to Hazelcast Set. The only difference is that it allows duplicate elements and preserves their order.

    • Multimap is a specialized Hazelcast map. It is a distributed data structure where you can store multiple values for a single key.

    • Replicated Map does not partition data. It does not spread data to different cluster members. Instead, it replicates the data to all members.

    • Cardinality Estimator is a data structure which implements Flajolet’s HyperLogLog algorithm.

  • Topic

    • Topic is the distributed mechanism for publishing messages that are delivered to multiple subscribers. It is also known as the publish/subscribe (pub/sub) messaging model.

    • Reliable Topic uses the same interface as Hazelcast Topic, except it is backed up by the Ringbuffer data structure.

  • Concurrency utilities

    • FencedLock is the distributed implementation of java.util.concurrent.locks.Lock. When you use lock, the critical section that Hazelcast Lock guards is guaranteed to be executed by only one thread in the entire cluster.

    • ISemaphore is the distributed implementation of java.util.concurrent.Semaphore. When performing concurrent activities, semaphores offer permits to control the thread counts.

    • IAtomicLong is the distributed implementation of java.util.concurrent.atomic.AtomicLong. Most of AtomicLong’s operations are available. However, these operations involve remote calls and hence their performances differ from AtomicLong, due to being distributed.

    • IAtomicReference is the distributed implementation of java.util.concurrent.atomic.AtomicReference. When you need to deal with a reference in a distributed environment, you can use Hazelcast IAtomicReference.

    • FlakeIdGenerator is used to generate cluster-wide unique identifiers.

    • ICountdownLatch is the distributed implementation of java.util.concurrent.CountDownLatch. Hazelcast CountDownLatch is a gate keeper for concurrent activities. It enables the threads to wait for other threads to complete their operations.

    • PN counter is a distributed data structure where each Hazelcast instance can increment and decrement the counter value and these updates are propagated to all replicas.

  • Event Journal is a distributed data structure that stores the history of mutation actions on map or cache.

How Data is Partitioned

Hazelcast has two types of distributed objects in terms of their partitioning strategies:

  1. Data structures where each partition stores a part of the instance, namely partitioned data structures.

  2. Data structures where a single partition stores the whole instance, namely non-partitioned data structures.

The following are the partitioned Hazelcast data structures:

  • Map

  • MultiMap

  • Cache (Hazelcast JCache implementation)

  • Event Journal

The following are the non-partitioned Hazelcast data structures:

  • Queue

  • Set

  • List

  • Ringbuffer

  • FencedLock

  • ISemaphore

  • IAtomicLong

  • IAtomicReference

  • FlakeIdGenerator

  • ICountdownLatch

  • Cardinality Estimator

  • PN Counter

Loading and Destroying a Distributed Object

Hazelcast offers a get method for most of its distributed objects. To load an object, first create a Hazelcast instance and then use the related get method on this instance. Following example code snippet creates an Hazelcast instance and a map on this instance.

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
Map<Integer, String> customers = hazelcastInstance.getMap( "customers" );

As to the configuration of distributed object, Hazelcast uses the default settings from the file hazelcast.xml that comes with your Hazelcast download. Of course, you can provide an explicit configuration in this XML or programmatically according to your needs. See the Understanding Configuration section.

Note that, most of Hazelcast’s distributed objects are created lazily, i.e., a distributed object is created once the first operation accesses it.

If you want to use an object you loaded in other places, you can safely reload it using its reference without creating a new Hazelcast instance (customers in the above example).

To destroy a Hazelcast distributed object, you can use the method destroy. This method clears and releases all resources of the object. Therefore, you must use it with care since a reload with the same object reference after the object is destroyed creates a new data structure without an error. See the following example code where one of the queues are destroyed and the other one is accessed.

        HazelcastInstance hz1 = Hazelcast.newHazelcastInstance();
        HazelcastInstance hz2 = Hazelcast.newHazelcastInstance();
        IQueue<String> q1 = hz1.getQueue("q");
        IQueue<String> q2 = hz2.getQueue("q");
        q1.add("foo");
        System.out.println("q1.size: "+q1.size()+ " q2.size:"+q2.size());
        q1.destroy();
        System.out.println("q1.size: " + q1.size() + " q2.size:" + q2.size());

If you start the Member above, the output is as shown below:

q1.size: 1 q2.size:1
q1.size: 0 q2.size:0

As you see, no error is generated and a new queue resource is created.

Hazelcast is designed to create any distributed data structure whenever it is accessed, i.e., whenever a call is made to the data structure. Therefore, keep in mind that a data structure is recreated when you perform an operation on it even after you have destroyed it.

Controlling Partitions

Hazelcast uses the name of a distributed object to determine which partition it will be put. Let’s load two queues as shown below:

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
IQueue q1 = hazelcastInstance.getQueue("q1");
IQueue q2 = hazelcastInstance.getQueue("q2");

Since these queues have different names, they will be placed into different partitions. If you want to put these two into the same partition, you use the @ symbol as shown below:

HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
IQueue q1 = hazelcastInstance.getQueue("q1@foo");
IQueue q2 = hazelcastInstance.getQueue("q2@foo");

Now, these two queues will be put into the same partition whose partition key is foo. Note that you can use the method getPartitionKey to learn the partition key of a distributed object. It may be useful when you want to create an object in the same partition of an existing object. See its usage as shown below:

String partitionKey = q1.getPartitionKey();
IQueue q3 = hazelcastInstance.getQueue("q3@"+partitionKey);

Common Features of all Hazelcast Data Structures

  • If a member goes down, its backup replica (which holds the same data) dynamically redistributes the data, including the ownership and locks on them, to the remaining live members. As a result, there will not be any data loss.

  • There is no single cluster master that can be a single point of failure. Every member in the cluster has equal rights and responsibilities. No single member is superior. There is no dependency on an external 'server' or 'master'.

Example Distributed Object Code

Here is an example of how you can retrieve existing data structure instances (map, queue, set, topic, etc.) and how you can listen for instance events, such as an instance being created or destroyed.

        ExampleDOL example = new ExampleDOL();
        Config config = new Config();

        HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance(config);
        hazelcastInstance.addDistributedObjectListener(example);

        Collection<DistributedObject> distributedObjects = hazelcastInstance.getDistributedObjects();
        for (DistributedObject distributedObject : distributedObjects) {
            System.out.println(distributedObject.getName());
        }
    }

    @Override
    public void distributedObjectCreated(DistributedObjectEvent event) {
        DistributedObject instance = event.getDistributedObject();
        System.out.println("Created " + instance.getName());
    }

    @Override
    public void distributedObjectDestroyed(DistributedObjectEvent event) {
        DistributedObject instance = event.getDistributedObject();
        System.out.println("Destroyed " + instance.getName());
    }