Managing and Monitoring the CP Subsystem

The CP Subsystem requires manual intervention when you expand or shrink it, or when a CP member crashes or becomes unreachable. An unreachable CP member is not automatically removed from the CP Subsystem because it could still be active and partitioned away.

Tools

You can manage and monitor the CP Subsystem using the following:

  • Java member API

  • REST interface

    • REST endpoint URL

    • hz-cluster-cp-admin shell script, which comes with the Hazelcast package.

      The hz-cluster-cp-admin script uses the curl command, so curl must be installed to be able to use the script.

Shutting Down CP Members

How you shut down CP members depends on whether CP Subsystem Persistence is enabled:

  • Persistence disabled: You can shut down only N-2 CP members at the same time. The remaining CP members must be shut down one at a time.

  • Persistence enabled: You can shut down your CP members at the same time.

To shut down members, see Shutting Down Members and Clusters.
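The two rules above can be turned into a small planning helper. The following is only a sketch: it assumes that N is the total number of CP members and that there are at least three of them, and the class and method names are hypothetical, not part of the Hazelcast API.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: derives shutdown batch sizes from the two rules above. */
final class CpShutdownPlan {

    /**
     * Returns the size of each shutdown batch, in order.
     * Assumption: n is the total number of CP members and is at least 3.
     */
    static List<Integer> batchSizes(int n, boolean persistenceEnabled) {
        if (n < 3) {
            throw new IllegalArgumentException("expected at least 3 CP members, got " + n);
        }
        List<Integer> batches = new ArrayList<>();
        if (persistenceEnabled) {
            // Persistence enabled: all CP members can go down together.
            batches.add(n);
            return batches;
        }
        // Persistence disabled: N-2 members at once, then the last two one at a time.
        batches.add(n - 2);
        batches.add(1);
        batches.add(1);
        return batches;
    }

    public static void main(String[] args) {
        System.out.println("Persistence disabled, 5 members: " + batchSizes(5, false));
        System.out.println("Persistence enabled, 5 members: " + batchSizes(5, true));
    }
}
```

The helper only computes batch sizes; the shutdown itself is still performed as described in Shutting Down Members and Clusters.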

Handling a Lost Majority

When the majority of a CP group is lost, that CP group cannot make progress anymore. New CP members cannot even join a CP group that has lost majority because membership changes must also go through the Raft consensus algorithm.

  • If the group is not the METADATA CP group, it must be force-destroyed immediately, because it can block the METADATA CP group from performing membership changes on the CP Subsystem.

  • If the majority of the METADATA CP group permanently crashes, it is equivalent to the permanent crash of the whole CP Subsystem, even though other CP groups are running fine. In this case, you must reset the CP Subsystem.
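To make "majority" concrete, the sketch below uses the usual Raft definition: a group of n members needs more than half of them, that is n / 2 + 1 with integer division, to make progress. The class and method names are hypothetical and only illustrate the arithmetic.

```java
/** Sketch only: majority arithmetic for a Raft-based CP group. */
final class CpGroupQuorum {

    /** Smallest number of members that forms a majority of the group. */
    static int majority(int groupSize) {
        if (groupSize <= 0) {
            throw new IllegalArgumentException("group size must be positive");
        }
        return groupSize / 2 + 1;
    }

    /** True if the reachable members still form a majority of the group. */
    static boolean canMakeProgress(int groupSize, int reachableMembers) {
        return reachableMembers >= majority(groupSize);
    }

    public static void main(String[] args) {
        for (int size : new int[] {3, 5, 7}) {
            System.out.println("group size " + size + " needs " + majority(size) + " reachable members");
        }
    }
}
```

For example, a 5-member group keeps working with 3 reachable members but is stuck with 2, which is the situation in which force-destroying the group or resetting the CP Subsystem becomes necessary.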

Listening for Events

The CP Subsystem provides the following listeners:

  • CP membership listeners

  • CP group availability listeners

CP Membership Listeners

CPMembershipListener is notified when a CP member is added to or removed from the CP Subsystem.

The listener interface has methods that are invoked for the following events:

  • memberAdded: A new CP member is added to the CP Subsystem.

  • memberRemoved: An existing CP member is removed from the CP Subsystem.

To get notified for CP membership events, implement the CPMembershipListener interface.

The following is an example CPMembershipListener class:

import com.hazelcast.cp.event.CPMembershipEvent;
import com.hazelcast.cp.event.CPMembershipListener;

public class CPMembershipListenerImpl implements CPMembershipListener {

    /**
     * Called when a new CP member is added to the CP Subsystem.
     */
    @Override
    public void memberAdded(CPMembershipEvent event) {
        System.out.println("Added: " + event);
    }

    /**
     * Called when a CP member is removed from the CP Subsystem.
     */
    @Override
    public void memberRemoved(CPMembershipEvent event) {
        System.out.println("Removed: " + event);
    }
}

CPMembershipListener can be defined in the configuration or can be registered at runtime with the CPSubsystem API.

Below is an example registering the listener at runtime, using the CPSubsystem.addMembershipListener() method:

// Either server or client
HazelcastInstance hazelcastInstance = ...;
hazelcastInstance.getCPSubsystem().addMembershipListener(new CPMembershipListenerImpl());

To define the listener in the configuration instead, use one of the following:

  • Java member API

Config config = new Config();
config.addListenerConfig(new ListenerConfig("com.yourpackage.CPMembershipListenerImpl"));

  • Java client API

ClientConfig config = new ClientConfig();
config.addListenerConfig(new ListenerConfig("com.yourpackage.CPMembershipListenerImpl"));

  • Member XML

<hazelcast>
    ...
    <listeners>
        <listener>
            com.yourpackage.CPMembershipListenerImpl
        </listener>
    </listeners>
    ...
</hazelcast>

  • Member YAML

hazelcast:
  ...
  listeners:
    - com.yourpackage.CPMembershipListenerImpl

  • Client XML

<hazelcast-client>
    ...
    <listeners>
        <listener>
            com.yourpackage.CPMembershipListenerImpl
        </listener>
    </listeners>
    ...
</hazelcast-client>

  • Client YAML

hazelcast-client:
  ...
  listeners:
    - com.yourpackage.CPMembershipListenerImpl

CP Group Availability Listeners

CPGroupAvailabilityListener is notified when the availability of a CP group decreases or it loses the majority completely.

In general, the availability decreases when a CP member becomes unreachable because of a process crash, network partition, or out-of-memory error. Once a member is declared unavailable by the failure detector, that member is removed from the cluster. If it is also a CP member, a CPGroupAvailabilityEvent is fired for each CP group that member belongs to.

CPGroupAvailabilityListener has a separate method to report the loss of majority.

The listener interface has methods that are invoked for the following events:

  • availabilityDecreased: A CP group’s availability decreases, but the group still has the majority of its members available.

  • majorityLost: A CP group has lost its majority.

The following is an example CPGroupAvailabilityListener class:

import com.hazelcast.cp.event.CPGroupAvailabilityEvent;
import com.hazelcast.cp.event.CPGroupAvailabilityListener;

public class CPGroupAvailabilityListenerImpl implements CPGroupAvailabilityListener {

    /**
     * Called when a CP group's availability decreases,
     * but the group still has the majority of its members available.
     */
    @Override
    public void availabilityDecreased(CPGroupAvailabilityEvent event) {
        System.out.println("Availability decreased: " + event);
    }

    /**
     * Called when a CP group has lost its majority.
     */
    @Override
    public void majorityLost(CPGroupAvailabilityEvent event) {
        System.out.println("Majority Lost: " + event);
    }
}

A CPGroupAvailabilityListener can be defined in the configuration or can be registered at runtime with the CPSubsystem API.

  • Java member API

Config config = new Config();
config.addListenerConfig(new ListenerConfig("com.yourpackage.CPGroupAvailabilityListenerImpl"));

  • Java client API

ClientConfig config = new ClientConfig();
config.addListenerConfig(new ListenerConfig("com.yourpackage.CPGroupAvailabilityListenerImpl"));

  • Member XML

<hazelcast>
    ...
    <listeners>
        <listener>
            com.yourpackage.CPGroupAvailabilityListenerImpl
        </listener>
    </listeners>
    ...
</hazelcast>

  • Member YAML

hazelcast:
  ...
  listeners:
    - com.yourpackage.CPGroupAvailabilityListenerImpl

  • Client XML

<hazelcast-client>
    ...
    <listeners>
        <listener>
            com.yourpackage.CPGroupAvailabilityListenerImpl
        </listener>
    </listeners>
    ...
</hazelcast-client>

  • Client YAML

hazelcast-client:
  ...
  listeners:
    - com.yourpackage.CPGroupAvailabilityListenerImpl

Getting a Local CP Member

Get the local CP member if this Hazelcast member is part of the CP Subsystem.

  • Java member API

CPMember localMember = cpSubsystem.getLocalCPMember();

  • REST API

curl http://127.0.0.1:5701/hazelcast/rest/cp-subsystem/members/local

# OR

hz-cluster-cp-admin -o get-local-member --address 127.0.0.1 --port 5701

Sample response:
{
    "uuid": "6428d7fd-6079-48b2-902c-bdf6a376051e",
    "address": "[127.0.0.1]:5701"
}

Getting CP Groups

To see the list of active CP groups:

  • Java member API

CPSubsystemManagementService managementService = cpSubsystem.getCPSubsystemManagementService();
CompletionStage<Collection<CPGroupId>> future = managementService.getCPGroupIds();
Collection<CPGroupId> groups = future.toCompletableFuture().get();

  • REST API

curl http://127.0.0.1:5701/hazelcast/rest/cp-subsystem/groups

# OR

hz-cluster-cp-admin -o get-groups --address 127.0.0.1 --port 5701

Sample response:
[{
    "name": "METADATA",
    "id": 0
}, {
    "name": "atomics",
    "id": 8
}, {
    "name": "locks",
    "id": 14
}]
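The curl call for this endpoint can also be issued from plain Java. The sketch below builds the equivalent GET request with the JDK's java.net.http API (JDK 11 or later) but does not send it. The class and method names are hypothetical, and the host and port are the sample values used on this page.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Sketch only: builds the GET request that lists the active CP groups. */
final class CpRestRequests {

    /** Builds (but does not send) a GET request for the CP groups endpoint. */
    static HttpRequest getGroups(String host, int port) {
        URI uri = URI.create("http://" + host + ":" + port + "/hazelcast/rest/cp-subsystem/groups");
        return HttpRequest.newBuilder(uri).GET().build();
    }

    public static void main(String[] args) {
        HttpRequest request = getGroups("127.0.0.1", 5701);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To execute the request, pass it to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) while a member with the REST interface enabled is listening on that address.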

Getting a Single CP Group

Find information about an active CP group with a given name.

  • Java member API

CPSubsystemManagementService managementService = cpSubsystem.getCPSubsystemManagementService();
CompletionStage<CPGroup> future = managementService.getCPGroup(groupName);
CPGroup group = future.toCompletableFuture().get();

  • REST API

curl http://127.0.0.1:5701/hazelcast/rest/cp-subsystem/groups/${CPGROUP_NAME}

# OR

hz-cluster-cp-admin -o get-group --group ${CPGROUP_NAME} --address 127.0.0.1 --port 5701

Sample response:
{
    "id": {
        "name": "locks",
        "id": 14
    },
    "status": "ACTIVE",
    "members": [{
        "uuid": "33f84b0f-46ba-4a41-9e0a-29ee284c1c2a",
        "address": "[127.0.0.1]:5703"
    }, {
        "uuid": "59ca804c-312c-4cd6-95ff-906b2db13acb",
        "address": "[127.0.0.1]:5704"
    }, {
        "uuid": "777ff6ea-b8a3-478d-9642-47d1db019b37",
        "address": "[127.0.0.1]:5705"
    }, {
        "uuid": "c7856e0f-25d2-4717-9919-88fb3ecb3384",
        "address": "[127.0.0.1]:5702"
    }, {
        "uuid": "c6229b44-8976-4602-bb57-d13cf743ccef",
        "address": "[127.0.0.1]:5701"
    }]
}

Getting CP Members

Get a list of all active CP members in the cluster.

  • Java member API

CPSubsystemManagementService managementService = cpSubsystem.getCPSubsystemManagementService();
CompletionStage<Collection<CPMember>> future = managementService.getCPMembers();
Collection<CPMember> members = future.toCompletableFuture().get();

  • REST API

curl http://127.0.0.1:5701/hazelcast/rest/cp-subsystem/members

# OR

hz-cluster-cp-admin -o get-members --address 127.0.0.1 --port 5701

Sample response:
[{
    "uuid": "33f84b0f-46ba-4a41-9e0a-29ee284c1c2a",
    "address": "[127.0.0.1]:5703"
}, {
    "uuid": "59ca804c-312c-4cd6-95ff-906b2db13acb",
    "address": "[127.0.0.1]:5704"
}, {
    "uuid": "777ff6ea-b8a3-478d-9642-47d1db019b37",
    "address": "[127.0.0.1]:5705"
}, {
    "uuid": "c6229b44-8976-4602-bb57-d13cf743ccef",
    "address": "[127.0.0.1]:5701"
}, {
    "uuid": "c7856e0f-25d2-4717-9919-88fb3ecb3384",
    "address": "[127.0.0.1]:5702"
}]

Getting CP Group Sessions

Get all CP sessions that are currently active in a given CP group.

  • Java member API

CPSessionManagementService sessionManagementService = cpSubsystem.getCPSessionManagementService();
CompletionStage<Collection<CPSession>> future = sessionManagementService.getAllSessions(groupName);
Collection<CPSession> sessions = future.toCompletableFuture().get();

  • REST API

curl http://127.0.0.1:5701/hazelcast/rest/cp-subsystem/groups/${CPGROUP_NAME}/sessions

# OR

hz-cluster-cp-admin -o get-sessions --group ${CPGROUP_NAME} --address 127.0.0.1 --port 5701

Sample response:
[{
    "id": 1,
    "creationTime": 1549008095530,
    "expirationTime": 1549008766630,
    "version": 73,
    "endpoint": "[127.0.0.1]:5701",
    "endpointType": "SERVER",
    "endpointName": "hz-member-1"
}, {
    "id": 2,
    "creationTime": 1549008115419,
    "expirationTime": 1549008765425,
    "version": 71,
    "endpoint": "[127.0.0.1]:5702",
    "endpointType": "SERVER",
    "endpointName": "hz-member-2"
}]
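The creationTime and expirationTime values in the sample response look like Unix epoch timestamps in milliseconds. Under that assumption, which this page does not state explicitly, the following sketch shows how they could be interpreted. The class and method names are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;

/** Sketch only: interprets session timestamps, assuming epoch milliseconds. */
final class CpSessionTimes {

    /** Time between session creation and its current expiration deadline. */
    static Duration lifetime(long creationTimeMillis, long expirationTimeMillis) {
        return Duration.between(
                Instant.ofEpochMilli(creationTimeMillis),
                Instant.ofEpochMilli(expirationTimeMillis));
    }

    /** True if the expiration deadline is at or before the given instant. */
    static boolean isExpired(long expirationTimeMillis, Instant now) {
        return !now.isBefore(Instant.ofEpochMilli(expirationTimeMillis));
    }

    public static void main(String[] args) {
        // Values taken from the first session in the sample response.
        long creationTime = 1549008095530L;
        long expirationTime = 1549008766630L;
        System.out.println("created:  " + Instant.ofEpochMilli(creationTime));
        System.out.println("expires:  " + Instant.ofEpochMilli(expirationTime));
        System.out.println("lifetime: " + lifetime(creationTime, expirationTime));
    }
}
```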

Destroying a CP Group by Force

You can destroy a given active CP group without going through the Raft algorithm. Use this method only when a CP group has lost its majority and can no longer make progress. Normally, membership changes in CP groups, such as CP member promotion or removal, are done via the Raft consensus algorithm. However, when a CP group permanently loses its majority, it cannot commit any new operation. Therefore, this method ungracefully terminates the given CP group on its remaining members. It also performs a Raft commit to the METADATA CP group to update the status of the destroyed group. Once a CP group is destroyed, all CP data structure proxies created before the destroy fail with CPGroupDestroyedException. However, if a new proxy is created afterwards, the CP group is re-created from scratch with a new set of CP members.

This method is idempotent. It has no effect if the given CP group is already destroyed.

  • Management Center

You cannot yet destroy a CP group from Management Center.

  • Java member API

CPSubsystemManagementService managementService = cpSubsystem.getCPSubsystemManagementService();
CompletionStage<Void> future = managementService.forceDestroyCPGroup(groupName);
future.toCompletableFuture().get();

  • REST API

curl -X POST --data "${GROUPNAME}&${PASSWORD}" http://127.0.0.1:5701/hazelcast/rest/cp-subsystem/groups/${CPGROUP_NAME}/remove

# OR

hz-cluster-cp-admin -o force-destroy-group --group ${CPGROUP_NAME} --address 127.0.0.1 --port 5701 --groupname ${GROUPNAME} --password ${PASSWORD}

Removing a CP Member

You can remove a given CP member from the active CP members list and from all CP groups it belongs to. If any other active CP member is available, it replaces the removed CP member in its CP groups. Otherwise, the CP groups that the removed CP member belonged to shrink and their majority values are recalculated.

Before removing a CP member from the CP Subsystem, make sure that it is declared as unreachable by the failure detector and removed from the member list. The behavior is undefined when a running CP member is removed from the CP Subsystem.

  • Java member API

CPSubsystemManagementService managementService = cpSubsystem.getCPSubsystemManagementService();
CompletionStage<Void> future = managementService.removeCPMember(memberUUID);
future.toCompletableFuture().get();

  • REST API

curl -X POST --data "${GROUPNAME}&${PASSWORD}" http://127.0.0.1:5701/hazelcast/rest/cp-subsystem/members/${CPMEMBER_UUID}/remove

# OR

hz-cluster-cp-admin -o remove-member --member ${CPMEMBER_UUID} --address 127.0.0.1 --port 5701 --groupname ${GROUPNAME} --password ${PASSWORD}

Promoting a Local Member to a CP Member

A new CP member can be added to the CP Subsystem either to increase the number of available CP members for new CP groups or to fill the missing slots in existing CP groups. After the initial Hazelcast cluster startup is done, an existing Hazelcast member can be promoted to the CP member role. This new CP member automatically joins CP groups that have missing members, and the majority values of these CP groups are recalculated.

If the local member is already a CP member, this method has no effect.

The promoted CP member will be added to the CP groups that have missing members. A group that is missing members is one where the current size of the group is smaller than the configured group-size.
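The definition of a group with missing members can be written out as a small check. The sketch below is illustrative only: the class and method names are hypothetical, and the current group sizes are assumed to have been collected beforehand, for example from the member lists returned for each CP group.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch only: finds CP groups whose current size is below the configured group-size. */
final class CpGroupSlots {

    /**
     * Returns, for each group that is missing members, how many slots are open.
     * Groups that already have configuredGroupSize members are omitted.
     */
    static Map<String, Integer> missingSlots(Map<String, Integer> currentSizes, int configuredGroupSize) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : currentSizes.entrySet()) {
            int missing = configuredGroupSize - entry.getValue();
            if (missing > 0) {
                result.put(entry.getKey(), missing);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> sizes = new LinkedHashMap<>();
        sizes.put("METADATA", 3);
        sizes.put("locks", 2);
        // With a configured group-size of 3, only "locks" has an open slot.
        System.out.println(missingSlots(sizes, 3));
    }
}
```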

When a member becomes a CP member, it generates an additional UUID that other CP members can use to identify it. You will see this CP UUID in the following places:

  • Requests to REST endpoints in the CP group

  • Responses from REST endpoints in the CP group

  • Member logs

  • Management Center

  • Java member API

CPSubsystemManagementService managementService = cpSubsystem.getCPSubsystemManagementService();
CompletionStage<Void> future = managementService.promoteToCPMember();
future.toCompletableFuture().get();

  • REST API

curl -X POST --data "${GROUPNAME}&${PASSWORD}" http://127.0.0.1:5701/hazelcast/rest/cp-subsystem/members

# OR

hz-cluster-cp-admin -o promote-member --address 127.0.0.1 --port 5701 --groupname ${GROUPNAME} --password ${PASSWORD}

Wiping and Resetting the CP Subsystem

You must wipe and reset the whole CP Subsystem state only when the METADATA CP group loses its majority and cannot make progress anymore.

After this method is called, all CP state and data are wiped, including data written by CP Subsystem Persistence, and CP members start with empty state.

This method can be invoked only from the Hazelcast master member, which is the first member in the Hazelcast cluster member list. Moreover, the Hazelcast cluster must have at least as many members as configured in the cp-member-count option.

This method must not be called while there are membership changes in the Hazelcast cluster. Before calling this method, make sure that there is no new member joining and all existing Hazelcast members have seen the same member list.

This method triggers a new CP discovery process round. However, if the new CP discovery round fails for any reason, Hazelcast members are not terminated, because Hazelcast members are likely to contain data for AP data structures and their termination can cause data loss. Hence, you need to observe the cluster and check if the CP discovery process completes successfully.

This method is NOT idempotent, and multiple invocations can break the whole system. After calling this API, you must observe the system to see whether the reset process has completed successfully or failed before making another call.

  • Java member API

CPSubsystemManagementService managementService = cpSubsystem.getCPSubsystemManagementService();
CompletionStage<Void> future = managementService.reset();
future.toCompletableFuture().get();

  • REST API

curl -X POST --data "${GROUPNAME}&${PASSWORD}" http://127.0.0.1:5701/hazelcast/rest/cp-subsystem/reset

# OR

hz-cluster-cp-admin -o reset --address 127.0.0.1 --port 5701 --groupname ${GROUPNAME} --password ${PASSWORD}
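Because the reset is not idempotent, tooling that wraps it may want to refuse a second invocation until an operator has checked the outcome of the first. The following is one possible sketch of such a guard and is not something Hazelcast provides: ResetGuard, resetOnce, and acknowledgeOutcome are hypothetical names, and the supplier stands in for a call such as managementService.reset().

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

/** Sketch only: allows one reset invocation until its outcome is acknowledged. */
final class ResetGuard {

    private final AtomicBoolean invoked = new AtomicBoolean(false);
    private final Supplier<CompletionStage<Void>> resetCall;

    /** resetCall is a stand-in for the real reset invocation. */
    ResetGuard(Supplier<CompletionStage<Void>> resetCall) {
        this.resetCall = resetCall;
    }

    /** Invokes the reset, or throws if an earlier invocation is still unacknowledged. */
    CompletionStage<Void> resetOnce() {
        if (!invoked.compareAndSet(false, true)) {
            throw new IllegalStateException("reset already invoked; check its outcome before retrying");
        }
        return resetCall.get();
    }

    /** Call only after confirming that the previous reset completed or failed. */
    void acknowledgeOutcome() {
        invoked.set(false);
    }

    public static void main(String[] args) {
        // A completed future is used here in place of a real cluster call.
        ResetGuard guard = new ResetGuard(() -> CompletableFuture.<Void>completedFuture(null));
        guard.resetOnce();
        try {
            guard.resetOnce();
        } catch (IllegalStateException e) {
            System.out.println("Second call rejected: " + e.getMessage());
        }
    }
}
```

The guard only protects against repeated calls from the same process. It does not replace observing the cluster to confirm that the CP discovery process has completed.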

Forcing a Session to Close

If a Hazelcast instance that owns a CP session crashes, its CP session is not terminated immediately. Instead, the session is closed after the configured session-time-to-live-seconds passes. If you know for sure that the session owner is not partitioned away and has definitely crashed, you can use this method to close the session and release its resources immediately.

  • Management Center

You cannot yet force a session to close from Management Center.

  • Java member API

CPSessionManagementService sessionManagementService = cpSubsystem.getCPSessionManagementService();
CompletionStage<Boolean> future = sessionManagementService.forceCloseSession(groupName, sessionId);
future.toCompletableFuture().get();

  • REST API

curl -X POST --data "${GROUPNAME}&${PASSWORD}" http://127.0.0.1:5701/hazelcast/rest/cp-subsystem/groups/${CPGROUP_NAME}/sessions/${CP_SESSION_ID}/remove

# OR

hz-cluster-cp-admin -o force-close-session --group ${CPGROUP_NAME} --session-id ${CP_SESSION_ID} --address 127.0.0.1 --port 5701 --groupname ${GROUPNAME} --password ${PASSWORD}