This is a prerelease version.

View latest

Glossary

2-phase commit

An atomic commitment protocol for distributed systems. It consists of two phases: commit-request and commit. In commit-request phase, transaction manager coordinates all the transaction resources to commit or abort. In commit-phase, transaction manager decides to finalize operation by committing or aborting according to the votes of each transaction resource.

ACID

A set of properties (Atomicity, Consistency, Isolation, Durability) guaranteeing that transactions are processed reliably. Atomicity requires that each transaction be all or nothing, i.e., if one part of the transaction fails, the entire transaction fails. Consistency ensures that only valid data following all rules and constraints is written. Isolation ensures that transactions are securely and independently processed at the same time without interference (and without transaction ordering). Durability means that once a transaction has been committed, it will remain so, no matter if there is a power loss, crash, or error.

BFF

Backend-for-frontend. A web development architectural pattern that creates a dedicated backend for each frontend application.

Blue/Green Deployment

A client connection technique in Hazelcast Enterprise Edition that reduces system downtime by deploying two mirrored clusters: a blue (active) cluster and a green (standby) cluster. Clients initially connect to the blue cluster and can be automatically redirected to the green cluster in case of failure or for maintenance. This mechanism allows for seamless failover and phased cutover, ensuring high availability and minimal disruption during upgrades or disaster recovery.

Cache

A high-speed access area that can be either a reserved section of main memory or a storage device.

Cache access patterns

Strategies for optimizing memory access to improve performance and reduce latency. The most common access patterns are read-through, write-through, and write-behind.

Caching

Often referred to as memory caching, this is a technique that allows computer applications to temporarily store data in a computer’s memory for quick retrieval.

Cache miss

An event in which a system or application makes a request to retrieve data from a cache, but that specific data is not currently in cache memory.

Caller

A Hazelcast member or client that interacts with the CP Subsystem through APIs.

CAP theorem

States that in a distributed system, at best only two of the following three properties can be guaranteed: Consistency (all clients see the same, up-to-date data), Availability (every request receives a valid response), and Partition Tolerance (the system continues operating despite network partitions).

CDC

Change data capture is a data pipeline pattern for observing changes made to a database and extracting them in a form usable by other systems, for the purposes of replication, analysis and more.

CEP

Complex event processing is a set of techniques for capturing and analyzing streams of data as they arrive to identify opportunities or threats in real time.

CLC

Hazelcast Command Line Client, used to connect to and interact with clusters on Hazelcast Cloud and Hazelcast Platform direct from the command line or through scripts.

CLI

Command-line interface.

Client/server topology

Hazelcast topology where members run outside the user application and are connected to clients using client libraries. The client library is installed in the user application.

Cloud-native architecture

An architecture in which an application is designed specifically for cloud deployment (in contrast to “lift-and-shift”, where applications are moved to the cloud unchanged).

Compact Serialization

Hazelcast’s schema-based, language-independent serialization format that separates schema from data to reduce memory/bandwidth, supports zero configuration for Java/.NET classes, enables schema evolution, and allows partial deserialization for faster queries and indexing.

Connector

A Hazelcast integration component that bridges external systems and a Hazelcast cluster to move data in or out of pipelines. Connectors operate as sources (ingesting batch or streaming data into Hazelcast) or sinks (delivering pipeline results to external systems). Hazelcast provides many built-in connectors (for example, Kafka, files, JDBC, MongoDB, and Hazelcast data structures) and supports pre-built Kafka Connect Source connectors; you can also build custom connectors via the Jet API.

CP Subsystem

A Hazelcast cluster component that provides strongly consistent, linearizable distributed data structures (such as CPMap, FencedLock, AtomicLong, Semaphore, AtomicReference, and CountDownLatch) using the Raft consensus algorithm. It prioritizes consistency over availability during network partitions and is intended for coordination and synchronization use cases.

DaaS

Data as a Service is a data management strategy and/or deployment model that focuses on the cloud (public or private) to deliver a variety of data-related services such as storage, processing, and analytics.

Data-at-rest

Data that is stored in a digital form on a physical device, which is not being actively accessed or used.

Data grid

A set of computers that directly interact with each other to coordinate the processing of large jobs.

Data-in-motion

A term that refers to digital data sets that are currently being moved over a network from one location to another.

DAG

A directed acyclic graph is a conceptual representation of a series of activities depicted by a graph.

Data pipeline

A series of actions that ingest data from one or more sources and move it to a destination for storage and analysis.

Deserialization

The process of reconstructing a data structure or object from a series of bytes or a string in order to instantiate the object for consumption.

Diagnostic logging

A Hazelcast feature that captures detailed system, configuration, and performance information at regular intervals using diagnostics plugins. Logs are written to dedicated files and can be configured for rotation, retention, and output destination to aid troubleshooting and analysis.

Distributed cache

A system that pools together the random-access memory (RAM) of multiple networked computers into a single in-memory data store used as a data cache to provide fast access to data.

Distributed computing

Sometimes called distributed processing, this is the technique of linking together multiple computer servers over a network into a cluster, to share data and to coordinate processing power.

Distributed hash table

A decentralized data store that looks up data based on key-value pairs.

Distributed transaction

A set of operations on data that is performed across two or more data repositories (especially databases).

EDA

Event-driven architecture is a modern software design approach centered around data that describes events—selection of a button on a user interface (UI), the addition of an item to an online shopping cart, notification of payment on a point of sale (POS) system, etc.—in real-time and enables applications to act on them as they occur.

Edge computing

Sometimes called IoT edge processing, this refers to taking action on data as near to the source as possible rather than in a central, remote data center, to reduce latency and bandwidth use.

Embedded topology

Hazelcast topology where the members are in-process with the user application and act as both client and server.

ESP

Event stream processing is the practice of taking action on a series of data points that originate from a system that continuously creates data. The term “event” refers to each data point in the system, and “stream” refers to the ongoing delivery of those events.

ETL

Extract transform load is a data pipeline pattern for collecting data from various sources, transforming (changing) it to conform to some rules, and loading it into a sink.

Event-driven architecture (EDA)

A software design approach centered around data that describes events (such as selection of a button on a user interface (UI), the addition of an item to an online shopping cart, notification of payment on a point of sale (POS) system, etc.) in real-time and enables applications to act on them as they occur.

Event stream processing (ESP)

The practice of taking action on a series of data points that originate from a system that continuously creates data. The term “event” refers to each data point in the system, and “stream” refers to the ongoing delivery of those events. A series of events can also be referred to as “streaming data” or “data streams”.

Garbage collection

The recovery of storage that is being used by an application when that application no longer needs the storage. This frees the storage for use by other applications (or processes within an application). It also ensures that an application using increasing amounts of storage does not reach its quota. Programming languages that use garbage collection are often interpreted within virtual machines like the JVM. The environment that runs the code is also responsible for garbage collection.

Geo Replication

See WAN Replication.

Grid computing

The practice of leveraging multiple computers, often geographically distributed but connected by networks, to work together to accomplish joint tasks.

Hazelcast clients

Libraries that run outside the cluster and connect to Hazelcast members over the network to access and operate on distributed data structures and services. Hazelcast provides clients for Java, .NET, Python, C++, Go, Node.js.

Hazelcast cluster

A virtual environment formed by Hazelcast members communicating with each other in the cluster.

Hazelcast partition

Memory segments containing data. Hazelcast automatically distributes data into primary and backup partitions across the cluster. Each partition can have hundreds or thousands of data entries, depending on your memory capacity. You can think of a partition as a block of data. In general and optimally, a partition should have a maximum size of 50-100 megabytes.

Hazelcast Platform

Hazelcast unifies a fast in-memory data store with distributed compute and a low-latency stream processing engine to deliver real-time, scalable, and resilient applications across event streams and data sources. Hazelcast can be deployed in a range of environments from the network edge to public and private clouds.

HD Memory (High-Density Memory)

A Hazelcast Enterprise Edition feature that provides an off-heap, native memory store for data structures such as Map, JCache, and Near Cache. HD Memory allows applications to efficiently use large amounts of physical memory without being limited by Java garbage collection, enabling predictable scaling and performance by minimizing GC pauses and supporting much larger memory configurations per node.

Health check

A Management Center feature that analyzes member configuration and metrics to detect issues and provide recommendations. Results are categorized by status and can be filtered or downloaded for further analysis.

Hibernate second-level cache

One of the data caching components available in the Hibernate object-relational mapping (ORM) library. Hibernate is a popular ORM library for the Java language, and it lets you store your Java object data in a relational database management system (RDBMS).

IaaS

Infrastructure as a Service is a cloud-based service offering in which the vendor provides compute, network and storage resources and the customer provides the operating system and application software.

IaC

Infrastructure as Code. A method of managing and provisioning IT infrastructure using code, rather than manual processes.

IMDB

In-memory database. A computer system that stores and retrieves data records that reside in a computer’s main memory, e.g., random-access memory (RAM).

IMDG

An in-memory data grid (IMDG) is a data structure that resides entirely in memory and is distributed among many members in a single location or across multiple locations. IMDGs can support thousands of in-memory data updates per second and they can be clustered and scaled in ways that support large quantities of data.

Inference runner

A component in large-scale software systems that lets you plug in machine learning (ML) algorithms (or “models”) to deliver data into those algorithms and calculate outputs.

In-memory computation

Also called in-memory computing, this is the technique of running computer calculations entirely in computer memory (e.g., in RAM).

In-memory processing

The practice of taking action on data entirely in computer memory (e.g., in RAM).

Java heap

The space that Java can reserve and use in memory for dynamic memory allocation. All runtime objects created by a Java application are stored in heap. By default, the heap size is 128 MB, but this limit is reached easily for business applications. Once the heap is full, new objects cannot be created and the Java application shows errors.

Java microservices

A set of software applications written in the Java programming language (and which typically leverage the vast ecosystem of Java tools and frameworks), designed for a limited scope, that work with each other to form a bigger solution.

JCache/Java cache

A de facto standard Java cache API for caching data.

Jet engine

Hazelcast’s stream and batch processing engine that enables distributed, low-latency computation over large volumes of data. The Jet engine models data processing pipelines as directed acyclic graphs (DAGs) and distributes tasks across all available CPU cores in the cluster. It supports both stateless and stateful operations, event time-based windowing, and can process data from various sources to sinks using connectors. The Jet engine is used for building real-time and batch data pipelines, supporting use cases such as analytics, ETL, and event processing.

Job

A data pipeline that’s packaged and submitted to a Hazelcast member for execution. Hazelcast plans it as a directed acyclic graph (DAG) and distributes its tasks across cluster members to run independently of the submitting client. Jobs can be created via the Java API or SQL, and they continue running until canceled or the cluster shuts down. Each job has a unique ID and can be managed via APIs or CLI tools.

JWT

JSON Web Token, an open standard for transmitting information securely between parties as a JSON object.

K8s

Kubernetes. An open source system that manages and deploys containerized applications.

Kappa

The Kappa Architecture is a software architecture used for processing streaming data.

Key-value store

A type of data storage software program that stores data as a set of unique identifiers, each of which have an associated value.

Lambda architecture

A deployment model for data processing that organizations use to combine a traditional batch pipeline with a fast real-time stream pipeline for data access.

Least recently used (LRU)

A cache eviction algorithm where entries are eligible for eviction due to lack of interest by applications.

Least frequently used (LFU)

A cache eviction algorithm where entries are eligible for eviction due to having the lowest usage frequency.

Lite member

A member that does not store data and has no partitions. Lite members are typically used to execute compute-heavy tasks, run Jet jobs, and register listeners while accessing data hosted on data members. They can be promoted to data members (and demoted back) dynamically, which triggers partition rebalancing.

Machine learning (ML) inference

The process of running live data points into a machine learning algorithm (or “ML model”) to calculate an output such as a single numerical score.

Management Center

Hazelcast’s web-based tool for monitoring and managing clusters. It provides real-time dashboards and metrics for members, clients, and data structures; supports SQL browsing (including streaming results); and offers administrative features such as license management, client filtering, and diagnostics. Management Center can also expose clustered JMX and Prometheus metrics and integrates with enterprise authentication and RBAC (role-based access control) in Enterprise Edition.

Member

A Hazelcast instance. Depending on your Hazelcast topology, it can refer to a server or a Java virtual machine (JVM). Members belong to a Hazelcast cluster. Members may also be referred as member nodes, cluster members, Hazelcast members, or data members.

Micro-batch processing

The practice of collecting data in small groups (“batches”) for the purposes of taking action on (processing) that data.

Microservices

A set of software applications designed for a limited scope that work with each other to form a bigger solution.

Microservices architecture

A software architecture approach in which a set of software applications designed for a limited scope, known as microservices, work together to form a bigger solution.

mTLS

Mutual authentication. A method that ensures the authenticity of the parties at each end of a network connection.

Multicast

A type of communication in which data is sent to a defined group of destination members simultaneously (one to many). Distinct from unicast (one to one) and broadcast (one to all).

Near cache

A caching model where an object retrieved from a remote member is put into the local cache and the future requests made to this object will be handled by this local member.

NoSQL

"Not Only SQL". A database model that provides a mechanism for storage and retrieval of data that is tailored in means other than the tabular relations used in relational databases. It is a type of database which does not adhere to the traditional relational database management system (RDMS) structure. It is not built on tables and does not employ SQL to manipulate data. It also may not provide full ACID guarantees, but still has a distributed and fault-tolerant architecture.

OIDC

OpenID Connect provider.

OOME

Out of Memory Error.

Operator

Hazelcast Platform Operator simplifies working with Hazelcast clusters on Kubernetes and Red Hat OpenShift by eliminating the need for manual deployment and life-cycle management.

OSGI

Formerly known as the Open Services Gateway initiative, it describes a modular system and a service platform for the Java programming language that implements a complete and dynamic component model.

PaaS

Platform as a Service is a cloud-based service offering in which the vendor provides hardware resources (as in IaaS), operating systems and management tools, and the customer provides the application software.

Partitioning

A technique used to divide the key space into multiple partitions (shards), each storing a portion of the data, which are replicated across cluster members for scalability, availability, and fault tolerance. A cluster-wide partition table maps partition IDs to the members that own their primary and backup replicas; the oldest (master) member maintains and periodically publishes this table so every member, including lite members, knows where each partition resides, and if the master fails, the next oldest member takes over publishing.

Persistence

A Hazelcast feature that allows map entries, JCache data, streaming job snapshots, and SQL metadata to be stored on disk. Persistence enables individual members or entire clusters to recover data after planned or unplanned shutdowns by loading persisted data from disk, reducing downtime and data loss compared to in-memory storage alone.

PKCE

Proof Key for Code Exchange. An extension used in OAuth 2.0 to improve security for public clients.

Publish/subscribe

A software architecture model by which applications create and share data. Pub/sub is particularly popular in serverless and microservices architectures.

Race condition

This condition occurs when two or more threads can access shared data and they try to change it at the same time.

Raft consensus algorithm

An algorithm used by Hazelcast’s CP Subsystem to achieve strong consistency and linearizability for distributed data structures. Raft coordinates updates among CP members by electing a leader and replicating log entries, ensuring that all changes are committed only when a majority agrees. This approach allows Hazelcast to provide fault-tolerant, strongly consistent operations for coordination and synchronization use cases, even in the presence of failures or network partitions.

Real-time database

A data store designed to collect, process, and/or enrich an incoming series of data points (i.e., a data stream) in real time, typically immediately after the data is created.

Real-time machine learning

The process of training a machine learning model by running live data through it, to continuously improve the model.

Real-time stream processing

The process of taking action on data at the time the data is generated or published.

REST API

Hazelcast’s REST API allows you to access and manage data structures and cluster operations over HTTP/HTTPS protocols. It provides endpoints for data retrieval, cluster and member actions, CP subsystem operations, configuration updates, and more. You can interact with it using tools such as cURL, REST clients, or programming languages with HTTP support. The REST API is disabled by default and must be explicitly enabled in member configuration.

RSA

An algorithm to generate, encrypt and decrypt keys for secure data transmissions.

SAML

Security Assertion Markup Language identity provider (IdP) authenticates users and passes authentication data to a service provider.

Semantic data type

A method of encoding data that allows software to discover and map data based upon its meaning rather than its structure.

Serialization

A process of converting an object into a stream of bytes in order to store the object or transmit it to memory, a database, or a file. Its main purpose is to save the state of an object in order to be able to recreate it when needed.

Sharding

The practice of optimizing database management systems by separating the rows or columns of a larger database table into multiple smaller tables.

Sink

A destination in a data pipeline where processed data is sent for storage or further processing. Sinks act as connectors between Hazelcast and external systems or data stores, allowing results from pipelines to be written to various targets such as files, databases, messaging systems, or Hazelcast data structures. A pipeline must have at least one sink to be valid.

Snapshot

A distributed map that contains the saved state of a job’s computations.

Split-brain

A state in which a cluster of members gets divided (or partitioned) into smaller clusters of members, each of which believes it is the only active cluster.

SSE

Server-sent events.

Streaming data

Also known as real-time data, event data, stream data processing, or data-in-motion, this refers to a continuous flow of information generated by various sources, such as sensors, applications, social media, or other digital platforms.

Streaming database

A data store designed to collect, process, and/or enrich an incoming series of data points (i.e., a data stream) in real time, typically immediately after the data is created.

Streaming ETL (Extract, Transform, Load)

The processing and movement of real-time data from one place to another.

Taxonomy

The practice of classifying and categorizing data.

Time-series database (TSDB)

A computer system that is designed to store and retrieve data records that are part of a “time series,” which is a set of data points that are associated with timestamps. The timestamps provide a critical context for each of the data points in how they are related to others.

Time to live (TTL)

A value that determines how long data is retained, before it is discarded from internal cache.

Transaction

A sequence of information exchange and related work (such as data store updating) that is treated as a unit for the purposes of satisfying a request and for ensuring data store integrity.

TSDB

A time-series database is a computer system that is designed to store and retrieve data records that are part of a “time series” — a set of data points that are associated with timestamps.

Vector search

An advanced information retrieval method that allows systems to go beyond highly organized, quantitative structured data, and capture the context and semantic meaning of qualitative unstructured data that doesn’t follow conventional models, including multimedia, textual, geospatial, and Internet of Things (IoT) data.

WAN Replication

Also known as Geo Replication, a feature that replicates map and cache updates between independent clusters over wide area networks to keep them in sync across data centers. Used for high availability and disaster recovery, it supports Active-Passive and Active-Active modes, buffers and retries during connectivity issues, and allows pausing/resuming and dynamically adding targets.

WSDL

Web Services Description Language.