Performance Tips
To achieve good performance in your Hazelcast deployment, it is crucial to tune your production environment. This section provides guidelines for tuning the performance though we also recommend to run performance and stress tests to evaluate the application performance.
Basic Recommendations
-
Eight cores per Hazelcast server instance
-
Minimum of 8 GB RAM per Hazelcast member (if not using the High-Density Memory Store)
-
Dedicated NIC for each Hazelcast member
-
Linux—any distribution
-
Run all members within the same subnet
-
Attach all members to the same network switch
Using Operation Threads Efficiently
By default, Hazelcast uses the machine’s core count to determine the number of operation threads. Creating more operation threads than this core count is highly unlikely to lead to an improved performance since there will be more context switching, more thread notification, and so on.
Especially if you have a system that does simple operations like put and get, it is better to use a lower thread count than the number of cores. The reason behind the increased performance by reducing the core count is that the operations executed on the operation threads normally execute very fast and there can be a very significant amount of overhead caused by thread parking and unparking. If there are fewer threads, a thread needs to do more work, will block less and therefore needs to be notified less.
Avoiding Random Changes
Tweaking can be very rewarding because significant performance improvements are possible. By default, Hazelcast tries to behave at its best for all situations, but this doesn’t always lead to the best performance. So if you know what you are doing and what to look for, it can be very rewarding to tweak. However, it is also important that tweaking should be done with proper testing to see if there is actually an improvement. Tweaking without proper benchmarking is likely going to lead to confusion and could cause all kinds of problems. In case of doubt, we recommend not to tweak.
Creating the Right Benchmark Environment
When benchmarking, it is important that the benchmark reflects your production environment. Sometimes with a calculated guess, a representative smaller environment can be set up; but if you want to use the benchmark statistics to infer how your production system is going to behave, you need to make sure that you get as close as your production setup as possible. Otherwise, you are at risk of spotting the issues too late or focusing on the things which are not relevant.
Hardware
Uniform Hardware:
To maximize the efficiency and performance of Hazelcast, it’s crucial to ensure that all cluster members are equipped with equal CPU, memory, and network resources. This uniformity prevents any single slow member from impeding the overall cluster performance. One effective strategy in achieving this is to allocate dedicated machine resources exclusively for Hazelcast services.
By providing properly sized hardware or virtual hardware to each member, Hazelcast ensures that all members have ample resources without competing with other processes or services. This approach allows Hazelcast to distribute load evenly across all members and maintain predictable performance. In heterogeneous clusters where some machines are more powerful than others, weaker members can create bottlenecks, leading to underutilization of stronger members. Therefore, for optimal performance, it’s advisable to use equivalent hardware for all Hazelcast members.
Minimal Recommendation:
Hazelcast is a lightweight framework and is reported to run well on devices such as the Raspberry Pi Zero (1GHz single-core CPU, 512MB RAM).
Recommended Configuration:
We suggest at least 8 CPU cores or equivalent per member, as well as running a single Hazelcast member for each host.
For environments with either fewer or more cores than 8 CPU, we recommend enabling Thread-Per-Core (TPC). For more info, see Thread-Per-Core (TPC). |
As a starting point for data-intensive operations, consider machines such as AWS c5.2xlarge with:
-
8 CPU cores
-
16 GB RAM
-
10 Gbps network
Single Member per Machine:
A Hazelcast member assumes it is alone on a machine, so we recommend not running multiple Hazelcast members on a machine. Having multiple members on a single machine is likely to result in worse performance than running a single member, since there will be more context switching, less batching, and so on. So unless it is proven that running multiple members on each machine does give a better performance/behavior in your particular setup, it is best to run a single member per machine.
CPU:
Hazelcast can use hundreds of CPU cores efficiently by exploiting data and task parallelism. Adding more CPU can therefore help with scaling the CPU-bound computations. If you’re using jobs and pipelines, read about the Execution model to understand how Hazelcast makes the computation parallel and design your pipelines according to it.
By default, Hazelcast uses all available CPU. Starting two Hazelcast instances on one machine therefore doesn’t bring any performance benefit as the instances would compete for the same CPU resources.
Don’t rely just on CPU usage when benchmarking your cluster. Simulate production workload and measure the throughput and latency instead. The task manager of Hazelcast can be configured to use the CPU aggressively. As an example, see this benchmark: the CPU usage was close to 20% with just 1000 events/s. At 1m items/s the CPU usage was 100% even though Jet still could push around 5 million items/s on that machine.
Disk:
Hazelcast is an in-memory framework. Cluster disks aren’t involved in regular operations except for logging and thus are not critical for the cluster performance. There are optional features of Hazelcast (such as Persistence and CP Persistence) which can use disk space, but even when they are in use a Hazelcast system is primarily in-memory.
Consider using more performant disks if you use the following Hazelcast features:
-
The file connector for reading or writing to files on the cluster’s file system.
-
Persistence for saving map data to disk.
-
CP Persistence for strong resiliency guarantees when using the CP Subsystem.
Operating System
Hazelcast works in many operating environments and some environments have unique considerations. These are highlighted below.
As a general suggestion, we recommend turning off the swapping at operating system level; see Disable Swap Usage.
Hazelcast is certified for Solaris SPARC.
However, the following modules are not supported for the Solaris operating system:
-
hazelcast-jet-grpc
-
hazelcast-jet-protobuf
-
hazelcast-jet-python
Disable Transparent Huge Pages (THP):
Transparent Huge Pages (THP) is the Linux Memory Management feature which aims to improve the application performance by using the larger memory pages. In most of the cases it works fine but for databases and in-memory data grids it usually causes a significant performance drop. Since it’s enabled on most of the Linux distributions, we do recommend disabling it when you run Hazelcast.
Use the following command to check if it’s enabled:
cat /sys/kernel/mm/transparent_hugepage/enabled
cat /sys/kernel/mm/transparent_hugepage/defrag
Or an alternative command if you run RHEL:
cat /sys/kernel/mm/redhat_transparent_hugepage/enabled
cat /sys/kernel/mm/redhat_transparent_hugepage/defrag
To disable it permanently, please see the corresponding documentation for the Linux distribution that you use. Here is an example of the instructions for RHEL: https://access.redhat.com/solutions/46111.
Disable Swap Usage:
Swapping behavior can be configured by setting the kernel parameter
(/proc/sys/vm/swappiness
) and can be turned off completely by executing
swapoff -a
as the root user in Linux systems. We highly recommend turning
off the swapping on the machines that run Hazelcast. When your operating systems
start swapping, garbage collection activities take much longer due to the low speed of disc access.
The Linux kernel parameter, vm.swappiness
, is a value from 0-100 that controls
the swapping of application data from physical memory to virtual memory on disk.
To prevent Linux kernel to start swapping memory to disk way too early,
we need to set the default of 60 to value between 0 and 10.
The higher the parameter value, the more aggressively inactive processes are
swapped out from physical memory. The lower the value, the less they are swapped,
forcing filesystem buffers to be emptied. In case swapping needs to be kept enabled,
we recommend setting the value between 0 and 10 to prevent the Linux kernel
to start swapping memory to disk way too early.
sudo sysctl vm.swappiness=10
VMWare ESX:
Hazelcast is certified on VMWare VSphere 5.5/ESXi 6.0. Generally speaking, Hazelcast can use all the resources on a full machine. Splitting a single physical machine into multiple virtual machines and thereby dividing resources is not required.
Consider the following for VMWare ESX:
-
Avoid sharing one Network Interface Card (NIC) between multiple virtual machine environments. A Hazelcast cluster is a distributed system and can be very network-intensive. Trying to share one physical NIC between multiple VMs may cause network-related performance problems.
-
Avoid over-committing memory. Always use dedicated physical memory for guests running Hazelcast.
-
Do not use memory ballooning.
-
Be careful overcommitting CPU cores. Monitor CPU steal time metrics.
-
Do not move guests while Hazelcast is running - for ESX this means disabling vMotion. If you want to use vMotion (live migration), first stop the Hazelcast cluster then restart it after the migration completes.
-
Always enable verbose garbage collection (GC) logs in the Java Virtual Machine. When "Real" time is higher than "User" time, this may indicate virtualization issues. The JVM is not using the CPU to execute application code during garbage collection, and is probably waiting on input/output (I/O) operations.
-
Note VMWare guests network types.
-
Use pass-through hard disks/partitions; do not use image files.
-
Configure partition groups to use a separate underlying physical machine for partition backups.
-
If you want to use automatic snapshots, first stop the Hazelcast cluster then restart it after the snapshot.
-
Network performance issues, including timeouts, might occur with LRO (Large Receive Offload) enabled on Linux virtual machines and ESXi/ESX hosts. We have specifically had this reported in VMware environments, but it could potentially impact other environments as well. We strongly recommend disabling LRO when running in virtualized environments, see https://kb.vmware.com/s/article/1027511.
Windows:
According to a reported rare case, I/O threads can consume a lot of CPU cycles
unexpectedly, even in an idle state. This can lead to CPU usage going up to 100%.
This is reported not only for Hazelcast but for other GitHub projects as well.
The workaround for such cases is to supply the system property -Dhazelcast.io.selectorMode=selectwithfix
on JVM startup.
See the related GitHub issue for more details.
Network
Hazelcast uses the network internally to shuffle data and to replicate the backups. The network is also used to read input data from and to write results to remote systems or to do RPC calls when enriching. In fact a lot of Hazelcast jobs are network-bound. A 1 Gbit network connection is a recommended minimum, but using a 10 Gbit or faster network can improve application performance. Also consider scaling the cluster out (adding more members to the cluster) to distribute the load.
Consider collocating a Hazelcast cluster with the data source and sink to avoid moving data back and forth over the wire. If you must choose between colocating Hazelcast with the source or sink, choose the source. Processed results are often aggregated, so the size is reduced.
A Hazelcast cluster is designed to run in a single LAN and can encounter unexpected performance problems if a single cluster is split across multiple different networks. Latency is the strongest constraint in most network scenarios, so deploying Hazelcast clusters to a network with high or varying latencies (even on the same LAN) can lead to unpredictable performance results.
Dedicated Network Interface Controller for Hazelcast Members
Provisioning a dedicated physical network interface controller (NIC) for Hazelcast members ensures smooth flow of data, including business data and cluster health checks, across servers. Sharing network interfaces between a Hazelcast member and another application could result in choking the port, thus causing unpredictable cluster behavior.
TCP Buffer Size
TCP uses a congestion window to determine how many packets it can send at one time; the larger the congestion window, the higher the throughput. The maximum congestion window is related to the amount of buffer space that the kernel allocates for each socket. For each socket, there is a default value for the buffer size, which you can change by using a system library call just before opening the socket. You can adjust the buffer sizes for both the receiving and sending sides of a socket.
To achieve maximum throughput, it is critical to use the optimal TCP socket buffer sizes for the links you are using to transmit data. If the buffers are too small, the TCP congestion window will never open up fully, therefore throttling the sender. If the buffers are too large, the sender can overrun the receiver such that the sending host is faster than the receiving host, which causes the receiver to drop packets and the TCP congestion window to shut down.
Typically, you can determine the throughput by the following formulae:
-
Transaction per second = buffer size / latency
-
Buffer size = Round trip time * network bandwidth
Hazelcast, by default, configures I/O buffers to 128KB; you can change these using the following Hazelcast properties:
-
hazelcast.socket.receive.buffer.size
-
hazelcast.socket.send.buffer.size
The operating system has separate configuration for minimum, default and maximum socket buffer sizes, so it is not guaranteed that the socket buffers allocated to Hazelcast sockets will match the requested buffer size.
On Linux, the following kernel parameters can be used to configure socket buffer sizes:
-
net.core.rmem_max
: maximum socket receive buffer size in bytes -
net.core.wmem_max
: maximum socket send buffer size in bytes -
net.ipv4.tcp_rmem
: minimum, default and maximum receive buffer size per TCP socket -
net.ipv4.tcp_wmem
: minimum, default and maximum send buffer size per TCP socket
To make a temporary change to one of these values, use sysctl
:
$ sysctl net.core.rmem_max=2097152
$ sysctl net.ipv4.tcp_rmem="8192 131072 6291456"
To apply changes permanently, edit file /etc/sysctl.conf
e.g.:
$ vi /etc/sysctl.conf
net.core.rmem_max = 2097152
net.ipv4.tcp_rmem = 8192 131072 6291456
Check your Linux distribution’s documentation for more information about configuring kernel parameters.
JVM
Here are the essential tips:
-
Enable garbage collection (GC) logs; since Java is getting better and better at GC, use the latest LTS version. G1GC is the default recommended GC policy.
-
Use High-Density Memory Store and a small heap; minimum and maximum heap size should be equal.
-
Applications that do a lot of querying or data updates need more headroom.
-
A basic tuning brings a huge benefit whereas more tuning may bring almost nothing else except complexity; no tuning is recommended unless needed.
-
Tuning, if done, should be reviewed periodically.
Garbage Collection
Keeping track of GC statistics is vital to optimum performance, especially if you run the JVM with large heap sizes. Tuning the garbage collector for your use case is often a critical performance practice prior to deployment. Likewise, knowing what baseline GC behavior looks like and monitoring for behavior outside normal tolerances will keep you aware of potential memory leaks and other pathological memory usage.
To avoid long GC pauses and latencies from the Java Virtual Machine (JVM), we recommend 16 GB or less of maximum JVM heap. If High-Density Memory is enabled, no more than 8 GB of maximum JVM heap is recommended. Horizontal scaling of JVM memory is recommended over vertical scaling if you want to exceed these numbers.
Enabling GC logs allows troubleshooting if performance problems occur. To enable GC logging, use the following JVM arguments:
-Xlog:gc=debug:file=/tmp/gc.log:time,uptime,level,tags:filesize=100m,filecount=10
Minimize Heap Usage
The best way to minimize the performance impact of GC is to keep heap usage small. Maintaining a small heap saves countless hours of GC tuning and provides improved stability and predictability across your entire application. Even if your application uses very large amounts of data, you can still keep your heap small by using Hazelcast’s High-Density Memory Store.
Azul Zing® and Zulu® Support
Azul Systems, the industry’s only company exclusively focused on Java and the Java Virtual Machine (JVM), builds fully supported, certified standards-compliant Java runtime solutions that help enabling real-time business. Zing is a JVM designed for enterprise Java applications and workloads that require any combination of low latency, high transaction rates, large working memory, and/or consistent response times. Zulu and Zulu Enterprise are Azul’s certified, freely available open source builds of OpenJDK with a variety of flexible support options, available in configurations for the enterprise as well as custom and embedded systems. Azul Zing is certified and supported in Hazelcast Enterprise. When deployed with Zing, Hazelcast gains performance, capacity, and operational efficiency within the same infrastructure. Additionally, you can directly use Hazelcast with Zulu without making any changes to your code.
Query Tuning
Indexes for Queried Fields
For queries on fields with ranges, you can use an ordered index. Hazelcast, by default, caches the deserialized form of the object under query in the memory when inserted into an index. This removes the overhead of object deserialization per query, at the cost of increased heap usage. See the Indexing Ranged Queries section.
Composite Indexes
Composite indexes are built on top of multiple map entry attributes; thus, increase the performance of complex queries significantly when used correctly. See the Composite Indexes section
Parallel Query Evaluation & Query Thread Pool
Setting the hazelcast.query.predicate.parallel.evaluation
property
to true
can speed up queries when using slow predicates or when there are huge
amount of entries per member.
If you’re using queries heavily, you can benefit from increasing query thread pools. See the Configuring the Query Thread Pool section.
In-Memory Format for Queries
Setting the queried entries' in-memory format to OBJECT
forces the objects
to be always kept in object format, resulting in faster access for queries, but also in
higher heap usage. It will also incur an object serialization step on every remote get operation. See the Setting In-Memory Format section.
Serialization Tuning
Hazelcast supports a range of object serialization mechanisms, each with their own costs and benefits. Choosing the best serialization scheme for your data and access patterns can greatly increase the performance of your cluster.
For an overview of serialization options with comparative advantages and disadvantages, see Serializing Objects and Classes.
Serialization Optimization Recommendations
-
Use
IMap.set()
on maps instead ofIMap.put()
if you don’t need the old value. This eliminates unnecessary deserialization of the old value. -
Set
use-native-byte-order
andallow-unsafe
totrue
in Hazelcast’s serialization configuration. Setting these properties totrue
enables fast copy of primitive arrays likebyte[]
,long[]
, etc., in your object. -
Compression is supported only by
Serializable
andExternalizable
. It has not been applied to other serializable methods because it is much slower (around three orders of magnitude slower than not using compression) and consumes a lot of CPU. However, it can reduce binary object size by an order of magnitude. -
When
enable-shared-object
is set totrue
, the Java serializer will back-reference an object pointing to a previously serialized instance. If set tofalse
, every instance is considered unique and copied separately even if they point to the same instance. The default configuration is false.
See also the Serialization Configuration Wrap-Up section for details.
Executor Service
Hazelcast executor service is an extension of Java’s built-in executor service that allows distributed execution and control of tasks. There are a number of options for Hazelcast executor service that have an impact on performance as summarized below.
Number of Threads
An executor queue may be configured to have a specific number of
threads dedicated to executing enqueued tasks. Set the number of
threads (pool-size
property in the executor service configuration)
appropriate to the number of cores available for execution.
Too few threads will reduce parallelism, leaving cores idle, while too
many threads will cause context switching overhead.
See the Configuring Executor Service section.
Bounded Execution Queue
An executor queue may be configured to have a maximum number
of tasks (queue-capacity
property in the executor service configuration).
Setting a bound on the number of enqueued tasks
will put explicit back pressure on enqueuing clients by throwing
an exception when the queue is full. This will avoid the overhead
of enqueuing a task only for it to be canceled because its execution
takes too long. It will also allow enqueuing clients to take corrective
action rather than blindly filling up work queues with tasks faster than they can be executed.
See the Configuring Executor Service section.
Avoid Blocking Operations in Tasks
Any time spent blocking or waiting in a running task is thread
execution time wasted while other tasks wait in the queue.
Tasks should be written such that they perform no potentially
blocking operations (e.g., network or disk I/O) in their run()
or call()
methods.
Locality of Reference
By default, tasks may be executed on any member. Ideally, however, tasks should be executed on the same machine that contains the data the task requires to avoid the overhead of moving remote data to the local execution context. Hazelcast executor service provides a number of mechanisms for optimizing locality of reference.
-
Send tasks to a specific member: using
ExecutorService.executeOnMember()
, you may direct execution of a task to a particular member -
Send tasks to a key owner: if you know a task needs to operate on a particular map key, you may direct execution of that task to the member that owns that key
-
Send tasks to all or a subset of members: if, for example, you need to operate on all the keys in a map, you may send tasks to all members such that each task operates on the local subset of keys, then return the local result for further processing
Scaling Executor Services
If you find that your work queues consistently reach their maximum and you have already optimized the number of threads and locality of reference, and removed any unnecessary blocking operations in your tasks, you may first try to scale up the hardware of the overburdened members by adding cores and, if necessary, more memory.
When you have reached diminishing returns on scaling up (such that the cost of upgrading a machine outweighs the benefits of the upgrade), you can scale out by adding more members to your cluster. The distributed nature of Hazelcast is perfectly suited to scaling out, and you may find in many cases that it is as easy as just configuring and deploying additional virtual or physical hardware.
Executor Services Guarantees
In addition to the regular distributed executor service, Hazelcast also offers durable and scheduled executor services. Note that when a member failure occurs, durable and scheduled executor services come with "at least once execution of a task" guarantee, while the regular distributed executor service has none. See the Durable and Scheduled executor services.
Work Queue Is Not Partitioned
Each member-specific executor will have its own private work-queue. Once a job is placed on that queue, it will not be taken by another member. This may lead to a condition where one member has a lot of unprocessed work while another is idle. This could be the result of an application call such as the following:
for(;;){
iexecutorservice.submitToMember(mytask, member)
}
This could also be the result of an imbalance caused by the application, such as in the following scenario: all products by a particular manufacturer are kept in one partition. When a new, very popular product gets released by that manufacturer, the resulting load puts a huge pressure on that single partition while others remain idle.
Work Queue Has Unbounded Capacity by Default
This can lead to OutOfMemoryError
because the number of queued tasks
can grow without bounds. This can be solved by setting the queue-capacity
property
in the executor service configuration. If a new task is submitted while the queue
is full, the call will not block, but will immediately throw a
RejectedExecutionException
that the application must handle.
No Load Balancing
There is currently no load balancing available for tasks that can run
on any member. If load balancing is needed, it may be done by creating an
executor service proxy that wraps the one returned by Hazelcast.
Using the members from the ClusterService
or member information from
SPI:MembershipAwareService
, it could route "free" tasks to a specific member based on load.
Destroying Executors
An executor service must be shut down with care because it will
shut down all corresponding executors in every member and subsequent
calls to proxy will result in a RejectedExecutionException
.
When the executor is destroyed and later a HazelcastInstance.getExecutorService
is done with the ID of the destroyed executor, a new executor will be created
as if the old one never existed.
Exceptions in Executors
When a task fails with an exception (or an error), this exception will not be logged by Hazelcast by default. This comports with the behavior of Java’s thread pool executor service, but it can make debugging difficult. There are, however, some easy remedies: either add a try/catch in your runnable and log the exception, or wrap the runnable/callable in a proxy that does the logging; the last option keeps your code a bit cleaner.
Client Executor Pool Size
Hazelcast clients use an internal executor service (different from the distributed executor service) to perform some of its internal operations. By default, the thread pool for that executor service is configured to be the number of cores on the client machine times five; e.g., on a 4-core client machine, the internal executor service will have 20 threads. In some cases, increasing that thread pool size may increase performance.
Entry Processors
Hazelcast allows you to update the whole or a part of map or cache entries in an efficient and a lock-free way using entry processors.
By default the entry processor executes on a partition thread. A partition thread is responsible for handling
one or more partitions. The design of entry processor assumes users have fast user code execution of the process()
method.
In the pathological case where the code is very heavy and executes in multi-milliseconds, this may create a bottleneck.
We have a slow user code detector which can be used to log a warning controlled by the following system properties:
-
hazelcast.slow.operation.detector.enabled
(default: true) -
hazelcast.slow.operation.detector.threshold.millis
(default: 10000)
User Code Deployment has been deprecated and will be removed in the next major version. To continue deploying your user code after this time, Community Edition users can either upgrade to Enterprise Edition, or add their resources to the Hazelcast member class paths. Hazelcast recommends that Enterprise Edition users migrate their user code to use User Code Namespaces. For further information on migrating from User Code Deployment to User Code Namespaces, see Migrate from User Code Deployment. |
The defaults catch extremely slow operations but you should set this much lower, say to 1ms, at development time to catch entry processors that could be problematic in production. These are good candidates for our optimizations.
We have two optimizations:
-
Offloadable
which moves execution off the partition thread to an executor thread -
ReadOnly
which means we can avoid taking a lock on the key
These are enabled very simply by implementing these interfaces in your entry processor. These optimizations apply to the following map methods only:
-
executeOnKey(Object, EntryProcessor)
-
submitToKey(Object, EntryProcessor)
-
submitToKey(Object, EntryProcessor, ExecutionCallback)
See the Entry Processors section.
Security
Here are the essential tips:
-
Security probably won’t be the first thing built
-
But it needs to be considered from the outset, as it affects architecture, performance and coding
-
Security can then be added before go-live without rework
TLS Tuning
You can improve TLS performance in a number of ways.
Check if securerandom.source
is configured to /dev/urandom
in your
<JAVA_HOME>/conf/security/java.security
file.
If there is /dev/random
instead, it might block on operations which
require random data generation.
The /dev/random
relies on entropy to be able to
generate random numbers. However, if this entropy is
insufficient to keep up with the rate requiring random numbers, it can slow down
the encryption/decryption since /dev/random
will
block. This can be fixed
by setting the -Djava.security.egd=file:/dev/urandom
system property.
For a more permanent solution, modify the
<JAVA_HOME>/conf/security/java.security
file and change directly the securerandom.source
property value.
Clients using Hazelcast’s Java ALL_MEMBERS
and MULTI_MEMBER
cluster routing modes automatically make use of extra I/O threads
for encryption/decryption and this has a significant impact on the performance.
The number of threads used can be changed using the hazelcast.client.io.input.thread.count
and
hazelcast.client.io.output.thread.count
client system properties.
By default it is 1 input thread and 1 output thread. If TLS/SSL is enabled,
it defaults to 3 input threads and 3 output threads.
Having more client I/O threads than members in the cluster does not lead to
an increased performance. So with a 2-member cluster,
2 in and 2 out threads give the best performance.
High-Density Memory Store
Hazelcast’s High-Density Memory Store (HDMS) is an in-memory storage option that uses native, off-heap memory to store object data instead of the JVM heap. This allows you to keep data in the memory without incurring the overhead of garbage collection (GC). HDMS capabilities are supported by the map structure, JCache implementation, Near Cache, Hibernate caching, and Web Session replications.
Available to Hazelcast Enterprise customers, HDMS is an ideal solution for those who want the performance of in-memory data, need the predictability of well-behaved Java memory management, and don’t want to spend time and effort on meticulous and fragile GC tuning.
If you use HDMS with large data sizes, we recommend a large increase in partition count, starting with 5009 or higher. See the Partition Count section above for more information. Also, if you intend to preload very large amounts of data into memory (tens, hundreds, or thousands of gigabytes), be sure to profile the data load time and to take that startup time into account prior to deployment.
See the HDMS section to learn more.
Cluster Size
Here are the essential tips:
-
Split-brain is a network break, it affects hosts
-
You can’t stop a network from a physical or logical break
-
If you have an even number of hosts, you make the problem worse
-
If you have an odd number of hosts, you make the solution simpler
-
Use an odd number of CP groups
Clusters with Huge Amount of Members/Clients
Very large clusters of hundreds of members are possible with Hazelcast, but stability depends heavily on your network infrastructure and ability to monitor and manage those many members. Distributed executions in such an environment will be more sensitive to your application’s handling of execution errors, timeouts, and the optimization of task code.
In general, you get better results with smaller clusters of Hazelcast members running on more powerful hardware and a higher number of Hazelcast clients. When running large numbers of clients, network stability is still a significant factor in overall stability. If you are running in Amazon EC2, hosting clients and members in the same zone is beneficial. Using Near Cache on read-mostly data sets reduces server load and network overhead. You may also try increasing the number of threads in the client executor pool.
Data Amount
Total data size should be calculated based on the combination of primary data and backup data. For example, if you have configured your cluster with a backup count of 2, then total memory consumed is actually 3x larger than the primary data size (primary + backup + backup). Partition sizes of 50MB or less are recommended.
Map Entries
Since entries with large size can bloat the network when deserialized, we recommend keeping each map entry’s size below 1 MB, and keeping the sizes of map entries relatively equal to each other.
Hazelcast Platform can store terabytes of data, but having a single entry with a large size may cause stability issues. If you have such entries, you should redesign your domain objects and break them into smaller ones.
Partitions
The number of internal partitions a Hazelcast member uses can be configured, but must be uniform across all members in the cluster. An optimal partition count and size establish a balance between the number of partitions on each member and the data amount on each partition. You can consider the following when deciding on a partition count.
-
The partition count should be a prime number. This helps to minimize the collision of keys across partitions, ensuring more consistent lookup times.
-
A partition count which is too low constrains the cluster. The count should be large enough for a balanced data or task distribution so that each member does not manage too few partitions.
-
A partition size of 50MB or less typically ensures good performance. Larger clusters may be able to use up to 100MB partition sizes, but will likely also require larger JVM heap sizes to accomodate the increase in data flow.
If you are a Hazelcast Enterprise customer using the High-Density Data Store with large data sizes, we recommend a large increase in partition count, starting with 5009 or higher.
The partition count cannot be easily changed after a cluster is created, so if you have a large cluster be sure to test and set an optimum partition count prior to deployment. If you need to change the partition count after a cluster is already running, you will need to schedule a maintenance window to entirely bring the cluster down. If your cluster uses the Persistence or CP Persistence features, those persistent files will need to be removed after the cluster is shut down, as they contain references to the previous partition count. Once all member configurations are updated, and any persistent data structure files are removed, the cluster can be safely restarted.