A newer version of IMDG is available.

View latest

Want to try Hazelcast 5.0 beta?

We’ve combined the in-memory storage of IMDG with the stream processing power of Jet to bring you an all new Hazelcast platform.

CP Subsystem

The CP subsystem is a component of a Hazelcast cluster that builds an in-memory strongly consistent layer. It is accessed via HazelcastInstance.getCPSubsystem(). Its data structures are CP with respect to the CAP principle, i.e., they always maintain linearizability and prefer consistency over availability during network partitions.

Currently, the CP subsystem contains only the implementations of Hazelcast’s concurrency APIs. These APIs do not maintain large states. For this reason, all members of a Hazelcast cluster do not take part in the CP subsystem. The number of members that take part in the CP subsystem is specified with CPSubsystemConfig.setCPMemberCount(int). Let’s suppose the number of CP members is configured as C. Then, when Hazelcast cluster starts, the first C members form the CP subsystem. These members are called the CP members and they can also contain data for the other regular Hazelcast data structures, such as IMap, ISet.

Data structures in the CP subsystem run in CPGroups. A CP group consists of an odd number of CPMembers between 3 and 7. Each CP group independently runs the Raft consensus algorithm. Operations are committed and executed only after they are successfully replicated to the majority of the CP members in a CP group. For instance, in a CP group of 5 CP members, operations are committed when they are replicated to at least 3 CP members. The size of CP groups is specified via CPSubsystemConfig.setGroupSize(int) and each CP group contains the same number of CP members. See the CP Subsystem Configuration section for configuration details.

Please note that the size of CP groups does not have to be same with the CP member count. Namely, the number of CP members in the CP subsystem can be larger than the configured CP group size. In this case, CP groups are formed by selecting the CP members randomly. Also note that the current CP subsystem implementation works only in memory, without persisting any state to disk. It means that a crashed CP member is not able to recover by reloading its previous state. Therefore, crashed CP members create a danger for gradually losing the majority of CP groups and eventually cause the total loss of availability of the CP subsystem. To prevent such situations, failed CP members can be removed from the CP subsystem and replaced in CP groups with other available CP members. This flexibility provides a good degree of fault tolerance at run-time. See the CP Subsystem Management section for more details.

The CP subsystem runs 2 CP groups by default. The first one is the Metadata group. It is an internal CP group which is responsible for managing the CP members and CP groups. It is initialized during the cluster startup process if the CP subsystem is enabled via CPSubsystemConfig.setCPMemberCount(int) configuration. The second group is the DEFAULT CP group, whose name is given in CPGroup.DEFAULT_GROUP_NAME. If a group name is not specified while creating a proxy for a CP data structure, that data structure is mapped to the DEFAULT CP group. For instance, when a CP IAtomicLong instance is created by calling CPSubsystem.getAtomicLong("myAtomicLong"), it will be initialized on the DEFAULT CP group. Besides these 2 predefined CP groups, custom CP groups can be created at run-time. If a CP IAtomicLong is created by calling CPSubsystem.getAtomicLong("myAtomicLong@myGroup"), first a new CP group is created with the name myGroup and then myAtomicLong is initialized on this custom CP group.

The current set of CP data structures have quite low memory overheads. Moreover, related to the Raft consensus algorithm, each CP group makes use of internal heartbeat RPCs to maintain the authority of the leader member and help lagging CP members to make progress. Last but not least, the new CP Lock and Semaphore implementations rely on a brand new session mechanism. In a nutshell, a Hazelcast member or client starts a new session on the corresponding CP group when it makes its very first Lock or Semaphore acquire request, and then periodically commits session heartbeats to this CP group to indicate its liveliness. It means that if CP Locks and Semaphores are distributed into multiple CP groups, there will be a session management overhead. See the CP Sessions section for more details. For the aforementioned reasons, we recommend you to use a minimal number of CP groups. For most use cases, the DEFAULT CP group should be sufficient to maintain all CP data structure instances. Custom CP groups could be created when the throughput of CP subsystem is needed to be improved.

API Code Sample:

        CPSubsystem cpSubsystem = hazelcastInstance.getCPSubsystem();

        IAtomicLong atomicLong = cpSubsystem.getAtomicLong(name);

        IAtomicReference atomicRef = cpSubsystem.getAtomicReference(name);

        FencedLock lock = cpSubsystem.getLock(name);

        ISemaphore semaphore = cpSubsystem.getSemaphore(name);

        ICountDownLatch latch = cpSubsystem.getCountDownLatch(name);

The CP data structure proxies differ from the other data Hazelcast structure proxies in two aspects:

  • An internal commit is performed on the METADATA CP group every time you fetch a proxy from this interface. Hence, the callers should cache the returned proxy objects.

  • If you call the DistributedObject.destroy() method on a CP data structure proxy, that data structure is terminated on the underlying CP group and cannot be reinitialized until the CP group is force-destroyed via CPSubsystemManagementService.forceDestroyCPGroup(String). For this reason, please make sure that you are completely done with a CP data structure before destroying its proxy.