As described in Invocation Lifecycle section,
for partition-based mutating invocations, such as
a caller waits with a timeout for the operation that is executed on
corresponding partition’s primary replica and backup replicas, based on
the sync backup configuration of the distributed data structure.
Hazelcast 3.9 introduces a new mechanism to detect indeterminate situations while
making such invocations. If
hazelcast.operation.fail.on.indeterminate.state system property is
enabled, a mutating invocation throws
it encounters the following cases:
The operation fails on partition primary replica member with
MemberLeftException. In this case, the caller may not determine the status of the operation. It could happen that the primary replica executes the operation, but fails before replicating it to all the required backup replicas. Even if the caller receives backup acks from some backup replicas, it cannot decide if it has received all required ack responses, since it does not know how many acks it should wait for.
There is at least one missing ack from the backup replicas for the given timeout duration. In this case, the caller knows that the operation is executed on the primary replica, but some backup may have missed it. It could be also a false-positive, if the backup timeout duration is configured with a very small value. However, Hazelcast’s active anti-entropy mechanism eventually kicks in and resolves durability of the write on all available backup replicas as long as the primary replica member is alive.
When an invocation fails with
the system does not try to rollback the changes which are executed on healthy replicas.
Effect of a failed invocation may be even observed by another caller,
if the invocation has succeeded on the primary replica.
Hence, this new behavior does not guarantee linearizability.
However, if an invocation completes without
the configuration is enabled, it is guaranteed that the operation has been
executed exactly-once on the primary replica and specified number of backup replicas of the partition.
Please note that
IndeterminateOperationStateException does not apply to
read-only operations, such as
map.get(). If a partition primary replica member crashes before
replying to a read-only operation, the operation is retried on the new owner of the primary replica.