Partition Group Configuration

Hazelcast distributes key objects into partitions using a consistent hashing algorithm. Multiple replicas are created for each partition, and those partition replicas are distributed among the Hazelcast members. An entry is stored in the members that own replicas of the partition to which the entry’s key is assigned. The total partition count is 271 by default; you can change it with the hazelcast.partition.count configuration property. See the System Properties appendix.
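
If you want to set the partition count programmatically rather than on the command line, a minimal sketch looks like the following (the count value is only an illustration):

Config config = new Config();
config.setProperty( "hazelcast.partition.count", "1013" );
HazelcastInstance hz = Hazelcast.newHazelcastInstance( config );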

The Hazelcast member that owns the primary replica of a partition is called the partition owner. Other replicas are called backups. Depending on the configuration, a key object can be kept in multiple replicas of a partition. A member can hold at most one replica of a given partition (either the primary or a backup).
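
How many replicas are kept for a particular data structure is controlled by its backup configuration. As a minimal sketch (the map name and counts are only illustrative), the following configures one synchronous backup and no asynchronous backups for a map:

Config config = new Config();
// Each partition holding entries of this map gets a primary replica plus one backup replica.
config.getMapConfig( "orders" )
      .setBackupCount( 1 )
      .setAsyncBackupCount( 0 );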

By default, Hazelcast distributes partition replicas randomly and equally among the cluster members, assuming all members in the cluster are identical. But what if some members share the same JVM or physical machine or chassis and you want backups of these members to be assigned to members in another machine or chassis? What if processing or memory capacities of some members are different and you do not want an equal number of partitions to be assigned to all members?

To handle such scenarios, you can group members that are in the same JVM (or physical machine) or located in the same chassis, or you can group members to create identical capacity. We call these groups partition groups. Partitions are assigned to those partition groups instead of to individual members, and backup replicas of a partition owned by one partition group are located in other partition groups.

Grouping Types

When you enable partition grouping, Hazelcast presents the following choices for you to configure partition groups.

HOST_AWARE

You can group members automatically using their IP addresses, so that members sharing the same network interface are grouped together. All members on the same host (IP address or domain name) form a single partition group. This helps avoid data loss when a physical server crashes, because multiple replicas of the same partition are never stored on the same host. Note that this grouping assumes one IP address or domain name per physical machine; if a machine has multiple network interfaces or domain names, that assumption no longer holds.

The following are declarative and programmatic configuration snippets that show how to enable HOST_AWARE grouping:

  • XML

  • YAML

  • Java

<hazelcast>
    <partition-group enabled="true" group-type="HOST_AWARE"/>
</hazelcast>
hazelcast:
  partition-group:
    enabled: true
    group-type: HOST_AWARE
Config config = ...;
PartitionGroupConfig partitionGroupConfig = config.getPartitionGroupConfig();
partitionGroupConfig.setEnabled( true )
    .setGroupType( MemberGroupType.HOST_AWARE );

ZONE_AWARE

You can use the ZONE_AWARE configuration when you want to back up data across different availability zones (AZs) in cloud environments whose discovery plugins provide zone information.

These discovery plugins write zone information to the Hazelcast member attributes map during the discovery process. When ZONE_AWARE is configured as the partition group type, Hazelcast creates the partition groups from the member attributes map entries that include zone information. That means backups are created in other zones, and each zone is treated as one partition group.
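
If you want to check what zone information the discovery plugin has published, you can inspect the member attributes at runtime. The following is only a sketch; the exact attribute keys depend on the discovery plugin in use, so it simply prints any attribute whose key mentions a zone:

HazelcastInstance hz = Hazelcast.newHazelcastInstance();
for ( Member member : hz.getCluster().getMembers() ) {
    member.getAttributes().forEach( ( key, value ) -> {
        if ( key.toLowerCase().contains( "zone" ) ) {
            System.out.println( member.getAddress() + " -> " + key + "=" + value );
        }
    } );
}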

When using the ZONE_AWARE partition grouping, a Hazelcast cluster spanning multiple AZs should have an equal number of members in each AZ. Otherwise, it results in uneven partition distribution among the members.

The following are declarative and programmatic configuration snippets that show how to enable ZONE_AWARE grouping:

  • XML

  • YAML

  • Java

<hazelcast>
    <partition-group enabled="true" group-type="ZONE_AWARE" />
</hazelcast>
hazelcast:
  partition-group:
    enabled: true
    group-type: ZONE_AWARE
Config config = ...;
PartitionGroupConfig partitionGroupConfig = config.getPartitionGroupConfig();
partitionGroupConfig.setEnabled( true )
    .setGroupType( MemberGroupType.ZONE_AWARE );

PLACEMENT_AWARE

You can group members according to the placement metadata provided by cloud providers. This metadata indicates placement information, such as the rack, fault domain, power source, network, and resources of a virtual machine within a zone.

This grouping offers finer granularity than ZONE_AWARE and is useful when you run members within a single availability zone: it keeps availability within that zone as high as possible by spreading the partitions and their replicas across different racks.

This grouping is currently supported by the Hazelcast AWS Discovery plugin. See also the AWS documentation on placement groups for more information.

The following are declarative and programmatic configuration snippets that show how to enable PLACEMENT_AWARE grouping:

  • XML

  • YAML

  • Java

<hazelcast>
    <partition-group enabled="true" group-type="PLACEMENT_AWARE" />
</hazelcast>
hazelcast:
  partition-group:
    enabled: true
    group-type: PLACEMENT_AWARE
Config config = ...;
PartitionGroupConfig partitionGroupConfig = config.getPartitionGroupConfig();
partitionGroupConfig.setEnabled( true )
    .setGroupType( MemberGroupType.PLACEMENT_AWARE );

NODE_AWARE

You can use the NODE_AWARE configuration when you want to back up data on different Kubernetes nodes.

Hazelcast writes node information to the Hazelcast member attributes map during the discovery process. When NODE_AWARE is configured as the partition group type, Hazelcast creates the partition groups from the member attributes map entries that include the node information. That means backups are created on other nodes, and each node is treated as one partition group.

When using the NODE_AWARE partition grouping, the orchestration tool must distribute Hazelcast containers/pods equally across the nodes; otherwise, partitions are distributed unevenly among the members.

The following are declarative and programmatic configuration snippets that show how to enable NODE_AWARE grouping:

  • XML

  • YAML

  • Java

<hazelcast>
    <partition-group enabled="true" group-type="NODE_AWARE" />
</hazelcast>
hazelcast:
  partition-group:
    enabled: true
    group-type: NODE_AWARE
Config config = ...;
PartitionGroupConfig partitionGroupConfig = config.getPartitionGroupConfig();
partitionGroupConfig.setEnabled( true )
    .setGroupType( MemberGroupType.NODE_AWARE );

PER_MEMBER

You can give every member its own group. Each member is a group of its own, and primary and backup partitions are distributed randomly (never on the same member). This gives the least amount of protection and is the default configuration for a Hazelcast cluster. It provides good redundancy when Hazelcast members are on separate hosts, but it is not a good option when multiple instances run on the same host.

The following are declarative and programmatic configuration snippets that show how to enable PER_MEMBER grouping:

  • XML

  • YAML

  • Java

<hazelcast>
    <partition-group enabled="true" group-type="PER_MEMBER" />
</hazelcast>
hazelcast:
  partition-group:
    enabled: true
    group-type: PER_MEMBER
Config config = ...;
PartitionGroupConfig partitionGroupConfig = config.getPartitionGroupConfig();
partitionGroupConfig.setEnabled( true )
    .setGroupType( MemberGroupType.PER_MEMBER );

CUSTOM

You can do custom grouping using Hazelcast’s interface matching configuration. This way, you can add multiple, different interfaces to a group, and you can use wildcards in the interface addresses. For example, you can create rack-aware or data-warehouse partition groups using custom partition grouping.

The following are declarative and programmatic configuration examples that show how to enable and use CUSTOM grouping:

  • XML

  • YAML

  • Java

<hazelcast>
    <partition-group enabled="true" group-type="CUSTOM">
        <member-group>
            <interface>10.10.0.*</interface>
            <interface>10.10.3.*</interface>
            <interface>10.10.5.*</interface>
        </member-group>
        <member-group>
            <interface>10.10.10.10-100</interface>
            <interface>10.10.1.*</interface>
            <interface>10.10.2.*</interface>
        </member-group>
    </partition-group>
</hazelcast>
hazelcast:
  partition-group:
    enabled: true
    group-type: CUSTOM
    member-group:
      - - 10.10.0.*
        - 10.10.3.*
        - 10.10.5.*
      - - 10.10.10.10-100
        - 10.10.1.*
        - 10.10.2.*
Config config = new Config();
PartitionGroupConfig partitionGroupConfig = config.getPartitionGroupConfig();
partitionGroupConfig.setEnabled( true )
        .setGroupType( PartitionGroupConfig.MemberGroupType.CUSTOM );

MemberGroupConfig memberGroupConfig = new MemberGroupConfig();
memberGroupConfig.addInterface( "10.10.0.*" )
        .addInterface( "10.10.3.*" ).addInterface( "10.10.5.*" );

MemberGroupConfig memberGroupConfig2 = new MemberGroupConfig();
memberGroupConfig2.addInterface( "10.10.10.10-100" )
        .addInterface( "10.10.1.*" ).addInterface( "10.10.2.*" );

partitionGroupConfig.addMemberGroupConfig( memberGroupConfig );
partitionGroupConfig.addMemberGroupConfig( memberGroupConfig2 );
If you configured your members to discover each other by their IP addresses while the cluster was forming, use IP addresses in the <interface> elements. If your members discovered each other by their host names, use host names instead.

SPI

You can provide your own partition group implementation using the SPI configuration. To create your partition group implementation, extend the AbstractDiscoveryStrategy class of the discovery service plugin, override the public PartitionGroupStrategy getPartitionGroupStrategy() method, and return your PartitionGroupStrategy implementation from that overridden method.

The following code shows these implementation steps:

public class CustomDiscovery extends AbstractDiscoveryStrategy {

    public CustomDiscovery(ILogger logger, Map<String, Comparable> properties) {
        super(logger, properties);
    }

    @Override
    public Iterable<DiscoveryNode> discoverNodes() {
        // Your implementation: return the discovered member candidates.
        Iterable<DiscoveryNode> iterable = Collections.emptyList(); // placeholder
        return iterable;
    }

    @Override
    public PartitionGroupStrategy getPartitionGroupStrategy() {
        return new CustomPartitionGroupStrategy();
    }

    private class CustomPartitionGroupStrategy implements PartitionGroupStrategy {
        @Override
        public Iterable<MemberGroup> getMemberGroups() {
            // Your implementation: return the member groups used for partition assignment.
            Iterable<MemberGroup> iterable = Collections.emptyList(); // placeholder
            return iterable;
        }
    }
}
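
To make Hazelcast use your strategy, it also needs to be registered through a DiscoveryStrategyFactory and added to the discovery configuration. The following is only a minimal sketch: the factory class name is illustrative, and it assumes that multicast is disabled and that the discovery SPI is enabled via the hazelcast.discovery.enabled property.

public class CustomDiscoveryFactory implements DiscoveryStrategyFactory {

    @Override
    public Class<? extends DiscoveryStrategy> getDiscoveryStrategyType() {
        return CustomDiscovery.class;
    }

    @Override
    public DiscoveryStrategy newDiscoveryStrategy( DiscoveryNode discoveryNode,
                                                   ILogger logger,
                                                   Map<String, Comparable> properties ) {
        return new CustomDiscovery( logger, properties );
    }

    @Override
    public Collection<PropertyDefinition> getConfigurationProperties() {
        // No custom configuration properties in this sketch.
        return Collections.emptyList();
    }
}

// Registering the factory when building the member configuration:
Config config = new Config();
config.setProperty( "hazelcast.discovery.enabled", "true" );
JoinConfig join = config.getNetworkConfig().getJoin();
join.getMulticastConfig().setEnabled( false );
join.getDiscoveryConfig().addDiscoveryStrategyConfig(
        new DiscoveryStrategyConfig( new CustomDiscoveryFactory() ) );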