Running Hazelcast Enterprise Edition with Persistence under Kubernetes
Hazelcast Enterprise Edition members configured with persistence enabled can monitor the Kubernetes (K8s) context and automate Hazelcast cluster state management to ensure optimal cluster behavior during shutdown and restart.
We recommend you use Hazelcast Platform Operator - see the Hazelcast Platform Operator Documentation for more information and find details about deploying a Hazelcast cluster in Kubernetes and also connecting clients outside Kubernetes. |
The benefits of enabling persistence include:
-
During a cluster-wide shutdown, the Hazelcast cluster automatically switches the cluster to a
PASSIVE
cluster state. This offers the following advantages over the behavior in previous Hazelcast Platform releases (where Hazelcast always ran clusters in anACTIVE
state):-
No data migrations are performed, speeding up the cluster shutdown.
-
No risk of out-of-memory exception due to migrations. When a cluster is in an
ACTIVE
state and Kubernetes applies the ordered shutdown of members, all data is eventually migrated to a single member (the last one in the shutdown sequence). This relieves you of the need to plan capacity for a single member to hold all the cluster data, or risk an out-of-memory exception, which was necessary with previous Hazelcast Platform releases. -
Persisted cluster metadata remain consistent across all members during shutdown. This consistency allows recovery from disk to proceed without unexpected metadata validation errors. These errors can result in the need for a force-start, wiping out persistent data from one or more members.
-
-
During temporary loss of members (for example, with a rolling restart of the cluster or a pod being rescheduled by Kubernetes), Hazelcast cluster switches to a configurable cluster state (
FROZEN
orNO_MIGRATION
) to ensure speedy recovery when the member rejoins the cluster. -
When scaling up or down, Hazelcast automatically switches the cluster to an
ACTIVE
state, so partitions are rebalanced and data is spread across all members.
For a step-by-step tutorial that details how to deploy Hazelcast with persistence enabled, see Restore a Cluster from Cloud Storage with Hazelcast Platform Operator |
Requirements
Automatic cluster state management requires:
-
Hazelcast configured with persistence enabled
-
Kubernetes discovery is in use, either explicitly configured or as auto-detected join configuration
-
Hazelcast is deployed in a
StatefulSet
-
Hazelcast is executed with a cluster role that is allowed access to
apps
Kubernetes API group andstatefulsets
resources with thewatch
verb. See the proposedClusterRole
configuration
Configuration
Automatic cluster state management is enabled by default when Hazelcast is configured with persistence enabled and Kubernetes discovery. It can be explicitly disabled by setting the hazelcast.persistence.auto.cluster.state
Hazelcast property to false
.
Depending on the use case, the hazelcast.persistence.auto.cluster.state.strategy
Hazelcast property configures which cluster state will be used when members are temporarily missing from the cluster, e.g., pod is rescheduled or rolling restart is in progress. This property has the following values:
-
NO_MIGRATION
: Use this as the cluster state if the cluster hosts a mix of persistent and in-memory data structures or if the cluster hosts only persistent data, but you favor availability over speed of recovery. While a member is missing from the cluster, cluster state switches toNO_MIGRATION
: this way, the first replica of partitions owned by the missing member are promoted and the data is available. When the member rejoins the cluster, persistent data can be recovered from disk and a differential sync (assuming Merkle trees are enabled, which is the case by default for persistentIMap
andICache
) brings them up to speed. For in-memory data structures, a full partition sync is required. This is the default value. -
FROZEN
: Use this cluster state if your cluster hosts only persistent data structures and you do not mind temporarily losing availability of partitions owned by a missing member in exchange for a speedy recovery from disk. Once the member rejoins, no synchronization over the network is required.
Best practices
When running Hazelcast with persistence in Kubernetes, the following configuration is recommended:
-
Use
/hazelcast/health/node-state
as liveness probe and/hazelcast/health/ready
as readiness probe. -
In your persistence configuration:
-
Never set a non-zero value for the
rebalance-delay-seconds
property. Automatic cluster state management switches to the cluster to the appropriate state, which may or may not allow partition rebalancing to occur depending on the detected status of the cluster. -
Use
PARTIAL_RECOVERY_MOST_COMPLETE
ascluster-data-recovery-policy
and setauto-remove-stale-data
totrue
, to minimize the need for manual interventions. Even in edge cases in which a member performs force-start (that is, cleans its persistent data directory and rejoins the cluster with a new identity), data can usually be recovered from backup partitions (assumingIMap
andICache
are configured with one or more backups).
-
-
Start Hazelcast with the
-Dhazelcast.stale.join.prevention.duration.seconds=5
Java option. Since Kubernetes can quickly reschedule pods, the default value of30
seconds is too high and can cause unnecessary delays in members rejoining the cluster.
In versions earlier than 5.2.0, persistence on Kubernetes may not work as expected. You can scale up or down, but cluster-wide shutdown can result in an Out of Memory Error (OOME) and you may have to restart the cluster manually. |