Ingesting Data from Sources
In order to process data, you first need to ingest it into your pipeline, using one or more sources.
In Hazelcast, each source defines which type of processing it supports:
Stream source: Reads from an infinite source that never ends.
Batch source: Reads from a finite source.
Stream sources can have either an at-least-once or exactly-once processing guarantee in case of a member failure. But, batch sources do not support these guarantees, instead pipelines that use these sources should just be restarted in case of a member failure.
If a pipeline pulls data from multiple sources, you can use both stream sources and batch sources in the same pipeline. And, you can use the following joining transforms to merge the branches:
For example, you can merge data from two sources into one by using the
Pipeline pipeline = Pipeline.create(); BatchSource<String> leftSource = TestSources.items("the", "quick", "brown", "fox"); BatchSource<String> rightSource = TestSources.items("jumps", "over", "the", "lazy", "dog"); BatchStage<String> left = pipeline.readFrom(leftSource); BatchStage<String> right = pipeline.readFrom(rightSource); left.merge(right) .writeTo(Sinks.logger());