When a write is requested with the methods, such as
queue.offer(), a write operation is submitted to
the Hazelcast member that owns the primary replica of the specific partition.
Partition of an operation is determined based on a parameter (key of an entry or
name of the data structure, etc.) related to that operation depending on
the data structure. Target Hazelcast member is figured out by looking up
a local partition assignment/ownership table, which is updated on
each partition migration and broadcasted to all cluster eventually.
When a Hazelcast member receives a partition specific operation, it executes the operation and propagates it to backup replica(s) with a logical timestamp. Number of backups for each operation depends on the data structure and its configuration. See Threading Model - Operation Threading for threading details.
Two types of backup replication are available: sync and async. Despite what their names imply, both types are still implementations of the lazy (async) replication model. The only difference between sync and async is that, the former makes the caller block until backup updates are applied by backup replicas and acknowledgments are sent back to the caller, but the latter is just fire & forget. Number of sync and async backups are defined in the data structure configurations, and you can use a combination of sync and async backups.
When backup updates are propagated, response of the execution including
number of sync backup updates is sent to the caller and after receiving
the response, caller waits to receive the specified number of
sync backup acknowledgements for a predefined timeout.
This timeout is 5 seconds by default and defined by
the system property
(see System Properties appendix).
A backup update can be missed because of a few reasons, such as a stale partition table information on a backup replica member, network interruption, or a member crash. That’s why sync backup acks require a timeout to give up. Regardless of being a sync or async backup, if a backup update is missed, the periodically running anti-entropy mechanism detects the inconsistency and synchronizes backup replicas with the primary. Also the graceful shutdown procedure ensures that all backup replicas for partitions whose primary replicas are assigned to the shutting down member will be consistent.
In some cases, although the target member of an invocation is assumed to be
alive by the failure detector, the target may not execute the operation or
send the response back in time. Network splits, long pauses caused by
high load, GC or I/O (disk, network) can be listed as a few possible reasons.
When an invocation doesn’t receive any response from the member that owns
primary replica, then invocation fails with an
This timeout is 2 minutes by default and defined by
the system property
(System Properties appendix).
When timeout is passed, result of the invocation will be indeterminate.