Explore the tools that Hazelcast offers for distributed computing on cluster members.
Distributed computing is the process of running computational tasks on different cluster members. With distributed computing, computations are faster thanks to the following advantages:
Leveraging the combined processing power of a cluster.
Reducing network hops by running computations on the cluster that owns the data
Hazelcast offers the following tools for distributed computing, depending on your use case:
|All these tools require you to submit Java code to cluster members.|
An entry processor is a good option if you perform bulk processing on a map. Usually you perform a loop of keys, executing
map.get(key), mutating the value and finally putting the entry back in the map using
map.put(key,value). If you perform this process from a client or from a member where the keys do not exist, you effectively perform two network hops for each update: The first to retrieve the data and the second to update the mutated value.
If you are doing the process described above, you should consider using entry processors. An entry processor executes read and update operations on the member where the data resides. This eliminates the costly network hops.
|Entry processors are meant to process a single entry per call. Processing multiple entries and data structures in an entry processor is not supported as it may result in deadlocks. To process multiple entries at once, use a pipeline.|
The executor service is ideal for running arbitrary Java code on your cluster members.
Pipelines are a good fit when you want to perform processing that involves multiple entries (aggregations, joins, etc.), or involves multiple computing steps to be made parallel. Pipelines allow you to update maps as a result of your computation, using an entry processor sink.