Configuring Data Connections to External Systems

A data connection contains the metadata that Hazelcast needs to connect an external system. You can define a data connection in your members' configuration files, in the Java member API, or in SQL, and reuse the same connection details in the Pipeline API, SQL mappings, and MapStores.

Quickstart Configuration

To configure a data connection to an external system, you must do the following:

Provide a unique identifier (name) for the data connection.
Choose the correct type of data connection for your external system.

See examples for other connection properties for each data connection type.

Example JDBC Data Connection

This example configuration shows a data connection to a MySQL database using a JDBC connection.

XML
YAML
Java
SQL

<hazelcast>
  <data-connection name="my-mysql-database">
    <type>JDBC</type>
    <properties>
      <property name="jdbcUrl">jdbc:mysql://mysql.example.org:3306</property> (1)
      <property name="user">my_user</property> (2)
      <property name="password">my_password</property>
    </properties>
    <shared>true</shared>
  </data-connection>
</hazelcast>

1	(Required) JDBC URL for establishing a connection to the MySQL database
2	(Optional) Separate user credentials for authentication. Alternatively, you can include the user credentials in the JDBC URL.

hazelcast:
  data-connection:
    my-mysql-database:
      type: JDBC
      properties:
        jdbcUrl: jdbc:mysql://mysql.example.org:3306 (1)
        user: my_user (2)
        password: my_password
      shared: true

1	(Required) JDBC URL for establishing a connection to the MySQL database
2	(Optional) Separate user credentials for authentication. Alternatively, you can include the user credentials in the JDBC URL.

config
  .addDataConnectionConfig(
    new DataConnectionConfig("my-mysql-database")
      .setType("JDBC")
      .setProperty("jdbcUrl", "jdbc:mysql://mysql.example.org:3306") (1)
      .setProperty("user", "my_user") (2)
      .setProperty("password", "my_password")
      .setShared(true)
  );

1	(Required) JDBC URL for establishing a connection to the MySQL database
2	(Optional) Separate user credentials for authentication. Alternatively, you can include the user credentials in the JDBC URL.

Data connections created in SQL behave differently to those defined in members' configuration files or in Java.

To retain SQL-defined data connections after a cluster restart, you must enable SQL metadata persistence. This feature is available in the Enterprise Edition.
You can create or drop a data connection using SQL commands. To update a data connection, you need to drop and then recreate it.

CREATE DATA CONNECTION my_mysql_database
TYPE JDBC
SHARED
OPTIONS (
    'jdbcUrl'='jdbc:mysql://mysql.example.org:3306', (1)
    'user'='my_user', (2)
    'password'='my_password');

1	(Required) JDBC URL for establishing a connection to the MySQL database
2	(Optional) Separate user credentials for authentication. Alternatively, you can include the user credentials in the JDBC URL.

For shared connections, the following optional properties allow you to control how the JDBC connection pool behaves. All other properties are passed directly to the JDBC driver.

Default values are in milliseconds except for where indicated.

Property Default Value Description

Property	Default Value	Description
`connectionTimeout`	`30000` (30 seconds)	Maximum time that Hazelcast waits for a connection from the pool when attempting to connect to the external system, typically a database. A SQL exception is thrown when the `connectionTimeout` is exceeded before a connection becomes available.
`idleTimeout`	`600000` (10 minutes)	Maximum time that the JDBC connection component allows a connection to sit idle in the connection pool. After the `idleTimeout` is exceeded, there may be a time lag of up to +30 seconds before the connection is retired. Although, the average time lag is +15 seconds. Only applies when `minimumIdle` is less than `maximumPoolSize`. A value of `0` means that idle connections are never removed from the pool. The minimum allowed value is `1000` ms (10 seconds)
`keepaliveTime`	`0` (Disabled)	How frequently the JDBC connection component attempts to keep an idle connection alive. When the `keepaliveTime` is reached, the idle connection is removed from the connection pool, pinged, and then returned to the pool. The minimum allowed `keepaliveTime` value is 30000 ms (30 seconds), but a value in the range of minutes is recommended.
`maxLifetime`	`1800000` (30 minutes)	Maximum lifetime of a connection in the pool. Only closed connections are removed. A value of `0` indicates an infinite lifetime, unless an `idleTimeout` value is set and reached. The minimum allowed `maxLifetime` value is 30000 ms (30 seconds). We recommend setting this value, and it should be several seconds shorter than any database or infrastructure imposed connection time limit.
`minimumIdle`	Same as `maximumPoolSize`	Minimum number of idle connections that the JDBC connection component attempts to maintain in the connection pool. When the number of idle connections dips below the `minimumIdle`, and the total connections are less than `maximumPoolSize`, the connection component attempts to add more connections. For maximum performance and responsive we do not recommend setting this value, instead use a fixed size connection pool.
`maximumPoolSize`	`10` (connections)	Maximum size of the connection pool, which includes both idle and active connections. The `maximumPoolSize` sets the maximum number of connections to the external system. When the pool reaches this size, and no more idle connections are available, calls to get a new connection are blocked for up to the `connectionTimeout` before timing out.

connectionTimeout

30000 (30 seconds)

Maximum time that Hazelcast waits for a connection from the pool when attempting to connect to the external system, typically a database. A SQL exception is thrown when the connectionTimeout is exceeded before a connection becomes available.

idleTimeout

600000 (10 minutes)

Maximum time that the JDBC connection component allows a connection to sit idle in the connection pool. After the idleTimeout is exceeded, there may be a time lag of up to +30 seconds before the connection is retired. Although, the average time lag is +15 seconds.

Only applies when minimumIdle is less than maximumPoolSize.
A value of 0 means that idle connections are never removed from the pool.
The minimum allowed value is 1000 ms (10 seconds)

keepaliveTime

0 (Disabled)

How frequently the JDBC connection component attempts to keep an idle connection alive. When the keepaliveTime is reached, the idle connection is removed from the connection pool, pinged, and then returned to the pool. The minimum allowed keepaliveTime value is 30000 ms (30 seconds), but a value in the range of minutes is recommended.

maxLifetime

1800000 (30 minutes)

Maximum lifetime of a connection in the pool. Only closed connections are removed. A value of 0 indicates an infinite lifetime, unless an idleTimeout value is set and reached. The minimum allowed maxLifetime value is 30000 ms (30 seconds).

We recommend setting this value, and it should be several seconds shorter than any database or infrastructure imposed connection time limit.

minimumIdle

Same as maximumPoolSize

Minimum number of idle connections that the JDBC connection component attempts to maintain in the connection pool. When the number of idle connections dips below the minimumIdle, and the total connections are less than maximumPoolSize, the connection component attempts to add more connections.

For maximum performance and responsive we do not recommend setting this value, instead use a fixed size connection pool.

maximumPoolSize

10 (connections)

Maximum size of the connection pool, which includes both idle and active connections. The maximumPoolSize sets the maximum number of connections to the external system. When the pool reaches this size, and no more idle connections are available, calls to get a new connection are blocked for up to the connectionTimeout before timing out.

Example Kafka Data Connection

This example shows the configuration of a data connection to a single Kafka broker.

XML
YAML
Java
SQL

<hazelcast>
  <data-connection name="my-kafka">
    <type>Kafka</type>
    <properties>
      <property name="bootstrap.servers">127.0.0.1:9092</property> (1)
      <property name="key.deserializer">org.apache.kafka.common.serialization.IntegerDeserializer</property> (2)
      <property name="key.serializer">org.apache.kafka.common.serialization.IntegerSerializer</property>
      <property name="value.serializer">org.apache.kafka.common.serialization.StringSerializer</property>
      <property name="value.deserializer">org.apache.kafka.common.serialization.StringDeserializer</property>
      <property name="auto.offset.reset">earliest</property> (3)
    </properties>
    <shared>true</shared>
  </data-connection>
</hazelcast>

1	(Required) Address of the Kafka consumer/producer
2	(Optional) Automatic serializers/deserializers for keys and values in Kafka messages
3	(Optional) Consumer behavior if the connection is interrupted

hazelcast:
  data-connection:
    my-kafka:
      type: Kafka
      properties:
        bootstrap.servers: 127.0.0.1:9092 (1)
        key.deserializer: org.apache.kafka.common.serialization.IntegerDeserialize (2)
        key.serializer: org.apache.kafka.common.serialization.IntegerSerializer
        value.serializer: org.apache.kafka.common.serialization.StringSerializer
        auto.offset.reset: earliest (3)
      shared: true

1	(Required) Address of the Kafka consumer/producer
2	(Optional) Automatic serializers/deserializers for keys and values in Kafka messages
3	(Optional) Consumer behavior if the connection is interrupted

config
  .addDataConnectionConfig(
    new DataConnectionConfig("my-kafka")
      .setType("Kafka")
      .setProperty("bootstrap.servers", "127.0.0.1:9092") (1)
      .setProperty("key.deserializer", "org.apache.kafka.common.serialization.IntegerDeserialize") (2)
      .setProperty("key.serializer", "org.apache.kafka.common.serialization.IntegerSerializer")
      .setProperty("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
      .setProperty("auto.offset.reset", "earliest") (3)
      .setShared(true)
  );

1	(Required) Address of the Kafka consumer/producer
2	(Optional) Automatic serializers/deserializers for keys and values in Kafka messages
3	(Optional) Consumer behavior if the connection is interrupted

Data connections created in SQL behave differently to those defined in members' configuration files or in Java.

To retain SQL-defined data connections after a cluster restart, you must enable SQL metadata persistence. This feature is available in the Enterprise Edition.
You can create or drop a data connection using SQL commands. To update a data connection, you need to drop and then recreate it.

CREATE DATA CONNECTION my_kafka
TYPE Kafka
SHARED
OPTIONS (
    'bootstrap.servers'='127.0.0.1:9092', (1)
    'key.deserializer'='org.apache.kafka.common.serialization.IntegerDeserialize', (2)
    'key.serializer'='org.apache.kafka.common.serialization.IntegerSerializer',
    'value.serializer'='org.apache.kafka.common.serialization.StringSerializer',
    'auto.offset.reset'='earliest'); (3)

1	(Required) Address of the Kafka consumer/producer
2	(Optional) Automatic serializers/deserializers for keys and values in Kafka messages
3	(Optional) Consumer behavior if the connection is interrupted

Example MongoDB Data Connection

This example configuration shows data connections to two MongoDB databases.

As in the example, you can supply authentication credentials to a MongoDB instance as part of the connection string, or separately.

XML
YAML
Java
SQL

<hazelcast>
  <data-connection name="my-mongodb">
    <type>Mongo</type>
    <properties>
      <property name="connectionString">mongodb://my_user:my_password@some-host:27017</property> (1)
      <property name="database">my_database</property> (2)
    </properties>
    <shared>true</shared>
  </data-connection>
  <data-connection name="my-other-mongodb">
    <type>Mongo</type>
    <properties>
      <property name="host">some_host</property> (3)
      <property name="username">my_user</property> (4)
      <property name="password">my_password</property>
      <property name="database">my_other_database</property> (2)
    </properties>
    <shared>true</shared>
  </data-connection>
</hazelcast>

1	(Required) Connection string of the MongoDB instance, including user credentials
2	(Optional) Name of the database to connect to
3	(Optional) Host details of the MongoDB instance, excluding user credentials
4	(Optional) User credentials for the MongoDB instance

hazelcast:
  data-connection:
    my-mongodb:
      type: Mongo
      properties:
        connectionString: mongodb://my_user:my_password@some-host:27017 (1)
        database: my_database (2)
      shared: true
    my-other-mongodb:
      type: Mongo
      properties:
        host: some_host (3)
        username: my_user (4)
        password: my_password
        database: my_other_database (2)
      shared: true

1	(Required) Connection string of the MongoDB instance, including user credentials
2	(Optional) Name of the database to connect to
3	(Optional) Host details of the MongoDB instance, excluding user credentials
4	(Optional) User credentials for the MongoDB instance

config
  .addDataConnectionConfig(
    new DataConnectionConfig("my-mongodb")
      .setType("Mongo")
      .setProperty("connectionString", "mongodb://my_user:my_password@some-host:27017") (1)
      .setProperty("database", "my_database") (2)
      .setShared(true)
  )
  .addDataConnectionConfig(
    new DataConnectionConfig("my-other-mongo")
      .setType("Mongo")
      .setProperty("host", "some-host") (3)
      .setProperty("username", "my_user") (4)
      .setProperty("password", "my_password")
      .setProperty("database", "my_other_database") (2)
      .setShared(true)
  );

1	(Required) Connection string of the MongoDB instance, including user credentials
2	(Optional) Name of the database to connect to
3	(Required) Host details of the MongoDB instance, excluding user credentials
4	(Optional) User credentials for the MongoDB instance

Data connections created in SQL behave differently from those defined in members' configuration files or in Java.

To retain SQL-defined data connections after a cluster restart, you must enable SQL metadata persistence. This feature is available in the Enterprise Edition.
You can create or drop a data connection using SQL commands. To update a data connection, you need to drop and then recreate it.

CREATE DATA CONNECTION my_mongodb
TYPE Mongo
SHARED
OPTIONS (
    'connectionString'='mongodb://my_user:my_password@some-host:27017', (1)
    'database'='my_database'); (2)

1	(Required) Connection string of the MongoDB instance, including user credentials
2	(Optional) Name of the database to connect to

CREATE DATA CONNECTION my_mongodb
TYPE Mongo
SHARED
OPTIONS (
    'host'='some-host', (1)
    'username'='my_user', (2)
    'password'='my_password'
    'database'='my_other_database');

1	(Required) Host details of the MongoDB instance, excluding user credentials
2	(Optional) User credentials for the MongoDB instance

Property Default value Description When to use

Property	Default value	Description	When to use
`connectionString`		The connection string used to connect to given MongoDB instance. More in Mongo documentation.	Can be omitted if you prefer to provide `host`,
`databaseName`		Name of the database that can be accessed via this connection. If omitted, user will have access to all databases available to given Mongo user.	If you want to restrict usage of this Data Connection to a particular database. Mandatory if you want to use this Data Connection with GenericMapStore.
`username`		Username used to authenticate to Mongo.	If you want to avoid putting credentials in connection string, you can use this dedicated option.
`password`		Password used to authenticate to Mongo.	If you want to avoid putting credentials in connection string, you can use this dedicated option.
`authDb`	`admin`	Authentication database - the database that holds user’s data. It is not the same as the `databaseName` option, both options can specify other databases - one to authenticate, other to actually read.	If you want to use `username` and `password` options, the `authDb` option is the way to configure authentication database name (which can be contained in `connectionString`).
`host`	Host to which Hazelcast will connect. Exclusive with `connectionString`.	If you want to use `username` and `password` options, the `host` option is the way to configure host name.
`connectionPoolMinSize`	10	Sets the minimum size of MongoClient’s internal connection pool.	If you want to control connection pool size
`connectionPoolMaxSize`	10	Sets the maximum size of MongoClient’s internal connection pool.	If you want to control connection pool size
`enableSsl`		Enables SSL support for Mongo client. Default value is `false`.
`invalidHostNameAllowed`		Allowes invalid hostnames in SSL. Default value is `false`. See Mongo docs for more info.
`keyStore`		Location of the key store, file must be present on all members.
`keyStoreType`		Type of the used key store. Defaults to system default.
`keyStorePassword`		Password to the key store.
`trustStore`		Location of the trust store, file must be present on all members.
`trustStoreType`		Type of the used trust store. Defaults to system default.
`trustStorePassword`		Password to the trust store.

connectionString

The connection string used to connect to given MongoDB instance. More in Mongo documentation.

Can be omitted if you prefer to provide host,

databaseName

Name of the database that can be accessed via this connection. If omitted, user will have access to all databases available to given Mongo user.

If you want to restrict usage of this Data Connection to a particular database. Mandatory if you want to use this Data Connection with GenericMapStore.

username

Username used to authenticate to Mongo.

If you want to avoid putting credentials in connection string, you can use this dedicated option.

password

Password used to authenticate to Mongo.

If you want to avoid putting credentials in connection string, you can use this dedicated option.

authDb

admin

Authentication database - the database that holds user’s data. It is not the same as the databaseName option, both options can specify other databases - one to authenticate, other to actually read.

If you want to use username and password options, the authDb option is the way to configure authentication database name (which can be contained in connectionString).

host

Host to which Hazelcast will connect. Exclusive with connectionString.

If you want to use username and password options, the host option is the way to configure host name.

connectionPoolMinSize

Sets the minimum size of MongoClient’s internal connection pool.

If you want to control connection pool size

connectionPoolMaxSize

Sets the maximum size of MongoClient’s internal connection pool.

If you want to control connection pool size

enableSsl

Enables SSL support for Mongo client. Default value is false.

invalidHostNameAllowed

Allowes invalid hostnames in SSL. Default value is false. See Mongo docs for more info.

keyStore

Location of the key store, file must be present on all members.

keyStoreType

Type of the used key store. Defaults to system default.

keyStorePassword

Password to the key store.

trustStore

Location of the trust store, file must be present on all members.

trustStoreType

Type of the used trust store. Defaults to system default.

trustStorePassword

Password to the trust store.

Example Hazelcast Data Connection

This example configuration shows a data connection to a remote Hazelcast cluster. You can use a Hazelcast data connection from the Pipeline API in Sources#remoteMapJournal source.

Currently, no SQL connector is available for Hazelcast data connections. This means that although you can create a data connection in SQL, you cannot yet use it in SQL, for example, in a mapping statement.

XML
YAML
Java
SQL

<hazelcast>
  <data-connection name="my-remote-hazelcast">
    <type>Hz</type>
    <properties>
      <property name="client_xml"> (1)
        <![CDATA[ (2)
          <?xml version="1.0" encoding="UTF-8"?>
          <hazelcast-client xmlns="http://www.hazelcast.com/schema/client-config"
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xsi:schemaLocation="http://www.hazelcast.com/schema/client-config
              http://www.hazelcast.com/schema/client-config/hazelcast-client-config-5.4.xsd">

            <cluster-name>dev</cluster-name>
              <network>
                <cluster-members>
                  <address>172.17.0.2:5701</address>
                </cluster-members>
              </network>
          </hazelcast-client>
        ]]>
      </property>
    </properties>
    <shared>true</shared>
  </data-connection>
</hazelcast>

1	(Required) Specify exactly one of `client_xml`, `client_yml`, `client_xml_path`, `client_yml_path`.
2	Hazelcast client configuration to connect to a remote cluster. See Configuring Java Client. You can specify an external file with the `client_xml_path` property instead of using an embedded configuration file.

hazelcast:
  data-connection:
    my-remote-hazelcast:
      type: Hz
      properties:
        client_yml: | (1)
          hazelcast-client: (2)
            cluster-name: dev
            network:
              cluster-members:
                - 172.17.0.2:5701
      shared: true

1	(Required) Specify exactly one of `client_xml`, `client_yml`, `client_xml_path`, `client_yml_path`.
2	Hazelcast client configuration to connect to a remote cluster. See Configuring Java Client. You can specify an external file with the `client_yml_path` property instead of using an embedded configuration file.

config
  .addDataConnectionConfig(
    new DataConnectionConfig()
      .setName("my-remote-hazelcast")
      .setType("Hz")
      .setProperty(
              "client_xml", (1)
              "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" + (2)
              "<hazelcast-client xmlns=\"http://www.hazelcast.com/schema/client-config\"" +
              "      xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"" +
              "      xsi:schemaLocation=\"http://www.hazelcast.com/schema/client-config" +
              "      http://www.hazelcast.com/schema/client-config/hazelcast-client-config-5.4.xsd\">" +
              "    " +
              "    <cluster-name>dev</cluster-name>" +
              "    <network>" +
              "        <cluster-members>" +
              "            <address>172.17.0.2:5701</address>" +
              "        </cluster-members>" +
              "    </network>" +
              "</hazelcast-client>")
      .setShared(true)
  );

1	(Required) Specify exactly one of `client_xml`, `client_yml`, `client_xml_path`, `client_yml_path`.
2	Hazelcast client configuration to connect to a remote cluster. See Configuring Java Client. You can specify an external file with the `client_xml_path` or `client_yml_path` property instead of using an embedded configuration file.

Data connections created in SQL behave differently to those defined in members' configuration files or in Java.

To retain SQL-defined data connections after a cluster restart, you must enable SQL metadata persistence. This feature is available in the Enterprise Edition.
You can create or drop a data connection using SQL commands. To update a data connection, you need to drop and then recreate it.

CREATE DATA CONNECTION "my-hazelcast-cluster"
TYPE Hz
SHARED
OPTIONS (
    'client_xml'= (1)
    '<?xml version="1.0" encoding="UTF-8"?> (2)
     <hazelcast-client xmlns="http://www.hazelcast.com/schema/client-config"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.hazelcast.com/schema/client-config
         http://www.hazelcast.com/schema/client-config/hazelcast-client-config-5.4.xsd">

       <cluster-name>dev</cluster-name>
       <network>
         <cluster-members>
	   <address>172.17.0.2:5701</address>
         </cluster-members>
       </network>
      </hazelcast-client>'
);

1	(Required) Specify exactly one of `client_xml`, `client_yml`, `client_xml_path`, `client_yml_path`.
2	Hazelcast client configuration to connect to a remote cluster. See Configuring Java Client. You can specify an external file with the `client_xml_path` or `client_yml_path` property instead of using an embedded configuration file.

Configuration Options for Data Connections

Data connections have the following configuration options.

If you are using Java to configure the Mapstore, use the DataConnectionConfig object.

Table 1. Data connection configuration options
Option	Description
Default	Example
`name` (required)	The unique identifier for the data connection.
`type` (required)	The type of data connection required for your external system. The following types of connection are supported: `JDBC`, `Kafka`, `Mongo`, `Hz` (case-insensitive).
`properties`	Any configuration properties that the data connection expects to receive.
`shared`	Whether the data connection instance is reusable in different MapStores, jobs, and SQL mappings. This behavior depends on the implementation of the specific data connection. The default value is `true`. See the implementation of each data connection type for full details of reusability: `KafkaDataConnection`, `MongoDataConnection`, `HazelcastDataConnection`.

You can use the EncryptionReplacer class to hide sensitive data in configuration files.

Types of Data Connection

The following types of data connection are available for use.

Type Description Properties

Type	Description	Properties
`JDBC`	Connect to external systems that support JDBC, including MySQL and PostgreSQL.	See example. If there is more than one JDBC connection used on a single member from a single job, they will share the same data store and connection pool.
`Kafka`	Connect to a Kafka data source.	See example and Create a Kafka Mapping.
`Mongo`	Connect to a MongoDB database.	See example.
`Hz`	Connect to a remote Hazelcast cluster.	See example.

JDBC

Connect to external systems that support JDBC, including MySQL and PostgreSQL.

See example. If there is more than one JDBC connection used on a single member from a single job, they will share the same data store and connection pool.

Kafka

Connect to a Kafka data source.

See example and Create a Kafka Mapping.

Mongo

Connect to a MongoDB database.

See example.

Hz

Connect to a remote Hazelcast cluster.

See example.

If you use the slim distribution of Hazelcast with a built-in data connector, make sure that you have an appropriate driver on your cluster’s classpath.

Next Steps

Use your configured connection:

Build a data pipeline with the Pipeline API.
Query your data connection, using a SQL mapping.
Build a cache with a MapStore.

Configuring Data Connections to External Systems

Quickstart Configuration

Example JDBC Data Connection

Example Kafka Data Connection

Example MongoDB Data Connection

Example Hazelcast Data Connection

Configuration Options for Data Connections

Types of Data Connection

Next Steps

Send us your feedback

Help and support