This is a prerelease version.

Saving the state of data pipelines

While a data pipeline runs, it saves and uses the state of its computation. This state is crucial for ensuring the accuracy and integrity of the data processing operations. Jet snapshots allow you to save and restore this state.

A snapshot captures the state of a running Jet job at a particular point in time. It allows you to take a consistent record of the in-flight computations and processed data. You can use this for various purposes, such as fault tolerance, job migration, or analysis.

When the Jet engine takes a snapshot, it records all data in transit and the internal state of the members processing the job. If the job fails or is restarted, it can be restored to the state it was in when the snapshot was taken. This helps to ensure fault-tolerant processing and data integrity.

To export a snapshot in Operator, use the JetJobSnapshot custom resource.

For a worked example, see the Save the state of a Jet job tutorial.

Configure the JetJobSnapshot resource

Configuration options for the JetJobSnapshot custom resource.

Field Description

name

Name of the exported snapshot. If empty, the name of the custom resource is used. You cannot modify this value after the object is created.

cancelJob

Whether the job is canceled after exporting a snapshot. The default value is false.

jetJobResourceName

Name of the JetJob CR from which the snapshot is exported.
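
As a minimal sketch of how the defaults described above might be used, the following resource sets only the source job. With spec.name omitted, the snapshot takes the name of the custom resource, and with no cancel setting, the job keeps running after the export. The resource name jetjobsnapshot-defaults is hypothetical, and jet-job-sample is assumed to be an existing JetJob CR.

```yaml
apiVersion: hazelcast.com/v1alpha1
kind: JetJobSnapshot
metadata:
  # Hypothetical name; it also becomes the snapshot name because spec.name is omitted.
  name: jetjobsnapshot-defaults
spec:
  # No cancel flag is set, so the default (false) applies and the job keeps running.
  # Assumed to match the name of an existing JetJob CR.
  jetJobResourceName: jet-job-sample
```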

Export a snapshot

Use the following example configuration to export a snapshot using the JetJobSnapshot custom resource.

Example of JetJobSnapshot configuration
apiVersion: hazelcast.com/v1alpha1
kind: JetJobSnapshot
metadata:
  name: jetjobsnapshot-sample
spec:
  name: snapshot-example # (1)
  cancelJob: false
  jetJobResourceName: jet-job-sample # (2)

(1) Sets the name of the exported snapshot.
(2) Specifies the name of the JetJob CR object from which the snapshot is exported.

You can only export a snapshot from a Jet job that has a status of Running.

Starting a Jet job initialized from a snapshot

Use the following example to start a new Jet job that is initialized from a snapshot.

Example of JetJob initialization from a snapshot
apiVersion: hazelcast.com/v1alpha1
kind: JetJob
metadata:
  name: jet-job-sample
spec:
  name: my-test-jet-job
  hazelcastResourceName: hazelcast
  state: Running
  initialSnapshotResourceName: jetjobsnapshot-sample # (1)
  jarName: jet-pipeline-1.0.2.jar
  bucketConfig:
    bucketURI: "gs://operator-user-code/jetJobs"
    secretName: br-secret-gcp

(1) Specifies the name of the JetJobSnapshot CR object from which the Jet job is initialized.
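
The two resources can also be combined to move a job to a new JAR, which is one possible approach to the job migration use case mentioned earlier. The sketch below is an illustration under stated assumptions, not a documented procedure: jetjobsnapshot-migrate, jet-job-sample-v2, my-test-jet-job-v2, and jet-pipeline-1.0.3.jar are hypothetical names, and it assumes that the second resource is applied only after the snapshot export has finished.

```yaml
# Step 1: export a snapshot from the running job and cancel the job afterwards.
apiVersion: hazelcast.com/v1alpha1
kind: JetJobSnapshot
metadata:
  name: jetjobsnapshot-migrate # hypothetical name
spec:
  cancelJob: true
  jetJobResourceName: jet-job-sample
---
# Step 2: apply separately, once the export has finished (assumption).
apiVersion: hazelcast.com/v1alpha1
kind: JetJob
metadata:
  name: jet-job-sample-v2 # hypothetical name
spec:
  name: my-test-jet-job-v2 # hypothetical name
  hazelcastResourceName: hazelcast
  state: Running
  # Refers to the JetJobSnapshot CR created in step 1.
  initialSnapshotResourceName: jetjobsnapshot-migrate
  jarName: jet-pipeline-1.0.3.jar # hypothetical newer JAR
  bucketConfig:
    bucketURI: "gs://operator-user-code/jetJobs"
    secretName: br-secret-gcp
```

Keeping the two steps as separate resources means that the original job is only canceled as part of a successful export request, and the new job always starts from a named, inspectable snapshot.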