This is a prerelease version.

View latest

Overview

Tiered Storage offers an extension to Hazelcast map, which allows you to store a larger data set than available memory.

Tiered Storage allows you to work with datasets that are larger than memory using the same Hazelcast map API. It allows you to optimize infrastructure costs and performance by combining a fast in-memory layer with a cost-effective, high-capacity disk-based storage layer.

Additionally, with the extended data capacity for each member, Tiered Storage can also reduce the number of deployed Hazelcast cluster members.

Consider using Tiered Storage if any of the following applies to your use-case:

  • The data size Hazelcast manages exceeds the capacity of the memory.

  • You want lower ownership costs, despite that having an impact on performance.

  • The hot set of entries fit in the memory.

  • You have significant temporal access patterns inside a very large dataset.

  • Your project uses MapStore and you intend to use local disk to improve the access latency of entries that do not fit in the memory of your machine.

If your Enterprise Edition license was generated before Hazelcast Platform version 5.2, you’ll need a new Enterprise Edition license that enables Tiered Storage. See Managing Enterprise Edition License Keys.

Design Overview

Tiered Storage for IMaps is backed by a special-purpose, addressable data structure called HybridLog. It is called HybridLog because it seamlessly combines in-place updates with a traditional append-only log. This data structure spans the memory and the configured disk to provide the aggregated capacity of the resources available for storing IMap’s entries. There is a HybridLog data structure for every partition of every IMap.

The IMap entries stored in these HybridLog instances are internally referenced by their logical addresses. The key to logical address mapping uses a hash index component. These hash indices are hashtables using separate chaining for conflict resolution. The table of these hash indices is stored in native memory; the chaining information for the buckets is stored on the disk.

Tiered Storage ensures that every accessed IMap entry is available in the memory to build a hot cache of entries. This means that, if an entry is on the disk, it is relocated to the memory inside the HybridLog hosting the entry, and its previous location on the disk becomes unreachable. As more and more entries are relocated to the in-memory region of the HybridLog, the memory space occupied by the unreachable entries on disk is reclaimed. This reclamation is done by a compactor running synchronously with IMap operations on the same partition.

After a cluster restart, Tiered Storage ensures seamless restoration of data within the configured maps from files stored on the designated device. This data restoration requires graceful shutdowns using the HazelcastInstance.shutdown() method, while the cluster is in the PASSIVE state; see Graceful Shutdown and Cluster States. Failure to do so may lead to restart failures. The maps backed by Tiered Storage maintain consistent state synchronization, seamlessly restoring data from files upon restart.

Supported Devices

Hazelcast supports only non-volatile memory express (NVMe) local SSD devices for Tiered Storage. While HDDs and some network storage options work with the same file abstraction as local SSDs, and technically such set-ups work, the access latency of these devices compared to that of modern SSDs is so high that using Tiered Storage with such devices are not supported.

Limitations

  • Tiered Storage currently only supports Hazelcast IMap data structure.

  • Time-to-live (TTL) expiration is not supported by maps backed by Tiered Storage. This means that methods, such as IMap.put, throw UnsupportedOperationException if a TTL value is provided. If the default TTL setting is present in the map configuration, InvalidConfigurationException is thrown during the member startup.

  • Max-Idle expiration is not supported by maps backed by Tiered Storage. This means that methods, such as IMap.put, throw UnsupportedOperationException if a Max-Idle value is provided. If the default Max-Idle setting is present in the map configuration, InvalidConfigurationException is thrown during the member startup.

  • Eviction is not supported. If an eviction policy is configured for a Tiered-Storage-backed map, InvalidConfigurationException is thrown during the member startup.

  • Data Persistence and Tiered Storage are mutually exclusive features. If both are enabled simultaneously, InvalidConfigurationException is thrown during the member startup.

  • SQL is not supported for the Tiered-Storage-backed maps. UnsupportedOperationException is thrown when a SQL query is executed. To avoid this, use Predicate API instead.

  • Dynamic addition of local devices is not supported.