Mapping to MongoDB
To query MongoDB data connections, you can create a mapping to them with the Mongo connector.
What is the Mongo Connector
The Mongo connector allows you to read from/write to a MongoDB database, and to execute SQL queries on Mongo collections directly from Hazelcast.
Installing the Connector
The Mongo connector artifacts are published to Maven repositories.
Add the following lines to your pom.xml to include the connector as a dependency of your project:
<dependency>
    <groupId>com.hazelcast.jet</groupId>
    <artifactId>hazelcast-jet-mongodb</artifactId>
    <version>5.5.2</version>
</dependency>
Or, if you are using Gradle:
implementation group: 'com.hazelcast.jet', name: 'hazelcast-jet-mongodb', version: '5.5.2'
To use SQL over MongoDB, you must also include hazelcast-sql as a dependency.
Permissions
Enterprise Edition
If security is enabled, your clients may need permissions to use this connector. For details, see Securing Jobs.
Before you Begin
Before you can create a mapping to a MongoDB collection, you must have the following:
- A $jsonSchema validation in the collection (see the schema documentation), or at least one element in the collection you want to create the mapping for (for property type validation).
- An enabled operations log (oplog, see the Mongo documentation) if you want to use streaming mappings.
Creating a MongoDB Mapping
The following example creates a mapping to a MongoDB database.
- In a MongoDB database, create a people collection. For example, in Java you could run the following code:
import com.mongodb.client.model.CreateCollectionOptions;
import com.mongodb.client.model.ValidationOptions;
import org.bson.BsonDocument;

// "database" is an existing MongoDatabase; "collectionName" holds the collection name (here, "people").
CreateCollectionOptions options = new CreateCollectionOptions();
ValidationOptions validationOptions = new ValidationOptions();
// The validator declares the property types that the mapping can later rely on.
validationOptions.validator(BsonDocument.parse(
        "{\n" +
        "    $jsonSchema: {\n" +
        "        bsonType: \"object\",\n" +
        "        title: \"Object Validation\",\n" +
        "        properties: {\n" +
        "            \"personId\": { \"bsonType\": \"int\" },\n" +
        "            \"name\": { \"bsonType\": \"string\" }\n" +
        "        }\n" +
        "    }\n" +
        "}\n"
));
options.validationOptions(validationOptions);
database.createCollection(collectionName, options);
The ValidationOptions are not required, but recommended.
- Configure the data connection so that the client can be reused by multiple mappings.
XML:
<hazelcast>
    <data-connection name="myMongo">
        <type>Mongo</type>
        <properties>
            <property name="connectionString">stringForMongo</property> <!-- (1) -->
        </properties>
        <shared>false</shared> <!-- (2) -->
    </data-connection>
</hazelcast>
YAML:
data-connection:
  name: myMongo
  type: Mongo
  properties:
    connectionString: stringForMongo # (1)
  shared: false # (2)
Java:
DataConnectionConfig dataConnectionConfig = new DataConnectionConfig()
        .setName("myMongo")
        .setType("Mongo")
        .setProperty("connectionString", connectionStringToMongo) // (1)
        .setShared(false); // (2)
config.addDataConnectionConfig(dataConnectionConfig);
(1) Your connection string.
(2) Set to true if the connection is reusable.
Instead of providing a single connectionString parameter, you can provide host, username, password, and (optionally) authDb.
Instead of providing the configuration in XML, YAML, or Java, you can also run the following SQL query:
CREATE DATA CONNECTION myMongo
TYPE Mongo
SHARED
OPTIONS (
    'connectionString' = '<your connection string>'
);
- Create the mapping.
CREATE MAPPING people DATA CONNECTION myMongo; -- (1)
(1) The name of the data connection configuration on your members (see Step 2 above).
In the above case, automatic schema inference is used. You can also provide the schema explicitly, as shown below.
CREATE MAPPING people (
    firstName VARCHAR(100),
    lastName VARCHAR(100),
    age INT
)
DATA CONNECTION myMongo;
Notice that there is no mention of TYPE MONGO this time; the SQL engine assumes it automatically when you provide a MongoDB data connection. This works whether or not you provide a schema.
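Step 2 above notes that host, username, password, and optionally authDb can be supplied in place of a single connectionString. How the connector combines these properties internally is not described here, so the following is only a minimal sketch of the standard MongoDB URI that carries the same information. The class and method names are hypothetical, and mapping authDb to the authSource URI option is an assumption.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class MongoConnectionStrings {

    // Percent-encode a credential. URLEncoder targets form encoding and emits '+'
    // for spaces, so '+' is rewritten to %20 to make the value safe inside a URI.
    static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }

    // Build a mongodb:// URI from separate parts. authDb may be null or empty,
    // in which case no authSource option is appended (assumed mapping).
    static String build(String host, String username, String password, String authDb) {
        StringBuilder uri = new StringBuilder("mongodb://")
                .append(encode(username)).append(':').append(encode(password))
                .append('@').append(host).append('/');
        if (authDb != null && !authDb.isEmpty()) {
            uri.append("?authSource=").append(encode(authDb));
        }
        return uri.toString();
    }

    public static void main(String[] args) {
        // The result could be used as the connectionString property of the data connection.
        System.out.println(build("localhost:27017", "app user", "p@ss", "admin"));
    }
}
```

Encoding the credentials matters because characters such as @ or : would otherwise be read as URI delimiters.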
Specify the database name using one of the following options:
Supported Object Types
There may be one or more object types for a connector. The Mongo SQL connector has two object types:
- Collection: Represents a batch read from a given collection. This is the default object type.
- ChangeStream: Represents reading a stream of events.
To change the object type, append OBJECT TYPE X after DATA CONNECTION / TYPE. For example:
CREATE MAPPING people (
    firstName VARCHAR(100),
    lastName VARCHAR(100),
    age INT
)
DATA CONNECTION myMongo
OBJECT TYPE ChangeStream
OPTIONS (...)

CREATE MAPPING people (
    firstName VARCHAR(100),
    lastName VARCHAR(100),
    age INT
)
TYPE Mongo
OBJECT TYPE ChangeStream
OPTIONS (...)
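When mappings are created from application code, the statement text is often assembled programmatically. The following is a minimal sketch, assuming a hypothetical helper class MongoMappingSql; it only builds the statement string in the clause order shown above and omits the OPTIONS clause. It does no identifier quoting or validation, so it should only be fed trusted names.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class MongoMappingSql {

    // Builds a CREATE MAPPING statement for a data connection.
    // columns may be empty (schema inference); objectType may be null (default type).
    static String createMapping(String name, Map<String, String> columns,
                                String dataConnection, String objectType) {
        StringBuilder sql = new StringBuilder("CREATE MAPPING ").append(name);
        if (!columns.isEmpty()) {
            sql.append(columns.entrySet().stream()
                    .map(e -> e.getKey() + " " + e.getValue())
                    .collect(Collectors.joining(", ", " (", ")")));
        }
        sql.append(" DATA CONNECTION ").append(dataConnection);
        if (objectType != null) {
            sql.append(" OBJECT TYPE ").append(objectType);
        }
        return sql.toString();
    }

    // Reproduces the ChangeStream example from this section, without OPTIONS.
    static String example() {
        Map<String, String> columns = new LinkedHashMap<>(); // keeps column order
        columns.put("firstName", "VARCHAR(100)");
        columns.put("lastName", "VARCHAR(100)");
        columns.put("age", "INT");
        return createMapping("people", columns, "myMongo", "ChangeStream");
    }

    public static void main(String[] args) {
        // With a Hazelcast instance "hz", the string could be passed to hz.getSql().execute(...).
        System.out.println(example());
    }
}
```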
Field Mappings
Object type Collection resolves columns to the names and types of the properties of a Mongo collection.
For ChangeStream, it is more complicated. The ChangeStream object contains the MongoDB collection properties, prefixed with fullDocument. (including the trailing dot).
There are also some predefined, top-level columns without the fullDocument. prefix:
- operationType STRING: Operation that triggered this change event, for example insert.
- resumeToken STRING: Resume token associated with the given change stream event.
- wallTime DATE_TIME: Wall time of the event. The date and time of the database operation on the server.
- ts TIMESTAMP: Timestamp of the event. This is either equal to the wall time of the event, if provided, or to the current time on the Hazelcast member.
- clusterTime TIMESTAMP: Cluster time of the event. The timestamp associated with the Mongo oplog entry.
For further information, see the Mongo documentation.
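The naming rules above can be sketched in a few lines of Java. This is only an illustration of the rules as stated in this section, not the connector's implementation: the class name, the method names, and the column ordering are assumptions.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

public class ChangeStreamColumns {

    // Predefined top-level columns listed in this section.
    static final List<String> TOP_LEVEL =
            List.of("operationType", "resumeToken", "wallTime", "ts", "clusterTime");

    // Collection properties are exposed with the "fullDocument." prefix;
    // the predefined columns are added without a prefix.
    static List<String> columnsFor(List<String> collectionProperties) {
        List<String> columns = new ArrayList<>();
        for (String property : collectionProperties) {
            columns.add("fullDocument." + property);
        }
        columns.addAll(TOP_LEVEL);
        return columns;
    }

    // Rule for ts: use the event's wall time if it is provided,
    // otherwise fall back to the current time on the member.
    static Instant resolveTs(Instant wallTime, Instant memberNow) {
        return wallTime != null ? wallTime : memberNow;
    }

    public static void main(String[] args) {
        System.out.println(columnsFor(List.of("personId", "name")));
        System.out.println(resolveTs(null, Instant.now()));
    }
}
```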
Available options
The best way to configure MongoDB options is through a data connection. The options available when using Mongo data connections are:
Some options can only be added in CREATE MAPPING:
Name | Description
---|---
[…] | Defines the moment in time from which the event stream should be read. The option is valid only if the mapping has […]
[…] | Specifies which column should be used as the primary key in queries. Default value is […]
[…] | Forces queries to be executed on exactly one member in one thread. This can be useful if you want to use, for example, the MongoDB Atlas serverless free tier, which currently does not support […]
[…] | Configures when Hazelcast performs database and collection existence checks to warn about a non-existing database and/or collection. Possible values: […]
Type Mapping
The type systems of MongoDB and SQL are not exactly the same. This can lead to confusion and creates the need for type coercion.
BSON Type | SQL Type | Java Type |
---|---|---|
[Table rows not preserved. The only surviving cell text reads: "This represents seconds from Unix epoch in UTC timezone. Therefore, it's not mapped to pure".]
The Java Type column shows the type of the object returned by an SQL query when the value stored in the collection has the given BSON type.
Note that, while Hazelcast can convert a MongoDB type to the requested SQL type in a projection, argument binding does not always work the same way due to technical limitations. For example, you can have a field of type TIMESTAMP represented as DATE_TIME, so that a SELECT returns a LocalDateTime in the Java client. However, binding a LocalDateTime as an argument will not work, because only native MongoDB types work for arguments. The same applies, for example, to a BSON column of type STRING mapped to INTEGER in SQL.
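Because a LocalDateTime returned by a query cannot be bound back as an argument, client code may need to convert it first. The following sketch only shows the conversion between a LocalDateTime and seconds from the Unix epoch in UTC, which is the representation mentioned in the type mapping table. Which value the connector actually accepts as a bound argument is not stated here, so treat the choice of epoch seconds as an assumption; the class and method names are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;

public class MongoTemporalArguments {

    // Interpret seconds since the Unix epoch as a date-time in UTC.
    static LocalDateTime fromEpochSeconds(long epochSeconds) {
        return LocalDateTime.ofEpochSecond(epochSeconds, 0, ZoneOffset.UTC);
    }

    // Convert a UTC date-time back to seconds since the Unix epoch.
    // Any sub-second part of the value is dropped.
    static long toEpochSeconds(LocalDateTime value) {
        return value.toEpochSecond(ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        LocalDateTime selected = fromEpochSeconds(1_679_671_860L);
        System.out.println(selected);
        System.out.println(toEpochSeconds(selected));
    }
}
```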
Type Coercion
The following table shows the possible and supported type coercions. All the default mappings from the previous section are always valid.
Type of Provided Argument | Resolved Insertion Type |
---|---|