Configure connections

Overview

The connections.conf file defines connections for systems that Flow understands how to connect to.

A connection is required for most - but not all - data sources (eg., Databases, AWS services, etc.). HTTP connections (OpenAPI / SOAP, etc) are generally self-descriptive enough not to require additional metadata.

Declare in your Taxi project

Connections are defined in a connections.conf, which lives inside a Taxi project within your workspace.

Inside your taxi.conf file in your project, declare where you’re keeping your config for Flow.

By convention, this lives at flow/config/connections.conf, but this is configurable.

name: com.myproject/demo
version: 0.1.0
sourceRoot: src/

additionalSources: {
  "@flow/config": "flow/config/*.conf"
}

Your connections config file is a HOCON file, specifying how to connect to various resources.

For example:

jdbc {
  orders {
    connectionName = orders
    connectionParameters {
      database = ordersdb
      host = localhost
      username = bradpitt
      password = ${superSecretPassword}
      port = 5432

   }
   jdbcDriver = POSTGRES
 }
}

Pass sensitive data

It may not always be desirable to specify sensitive connection information directly in the config file, especially if these are being checked into source control.

Environment variables can be used anywhere in the config file, following the HOCON standards.

For example:

jdbc {
   another-connection {
      connectionName = another-connection
      jdbcDriver = POSTGRES
      connectionParameters {
         // .. other params omitted for brevity ..
         password = ${postgres_password} // Reads the environment variable "postgres_password"
      }
   }
}
See also Manage Secrets to see how to securely manage sensitive information in your connections.

Handle substitutions correctly in URLs

Substitution of variable inside a URL using HOCON can be tricky.

In short, here’s how substitutions need to be defined:

query-server {
   // The Url has specifal characters (:), so needs to be inside of quotes.
   // However, variable substitution doesn't work inside of quotes,
   // so the variable must be outside of quotes.
   url="http://"${MY_VARIABLE}":9305"
}

For more information on substitutions, see HOCON library documentation.

Database connections

Database connections are defined under the jdbc element within the connections.conf file.

jdbc { // The root element for database connections
   another-connection { // Defines a connection called "another-connection"
      connectionName = another-connection // The name of the connection.  Must match the key used above.
      jdbcDriver = POSTGRES // Defines the driver to use.  See below for the possible options
      connectionParameters { // A list of connection parameters.  The actual values here are defined by the driver selected.
         database = transactions // The name of the database
         host = our-db-server // The host of the database
         password = super-secret // The password
         port = "2003" // The port
         username = jack // The username to connect with
      }
   }
}

Supported drivers

Postgres

To configure a Postgres connection, specify jdbcDriver = POSTGRES

Connection parameters are as follows:

Parameter name Description

host

The host address of the Postgres database

port

The port to connect to. Defaults to 5432

database

The name of the database on the Postgres server

username

Optional. The username to use when connecting

password

Optional. The password to use when connecting

Example

jdbc { // The root element for database connections
   another-connection { // Defines a connection called "another-connection"
      connectionName = another-connection // The name of the connection.  Must match the key used above.
      jdbcDriver = POSTGRES // Defines the driver to use.  See below for the possible options
      connectionParameters { // A list of connection parameters.  The actual values here are defined by the driver selected.
         database = transactions // The name of the database
         host = our-db-server // The host of the database
         password = super-secret // The password
         port = "2003" // The port
         username = jack // The username to connect with
      }
   }
}

MySQL

To configure a MySQL connection, specify jdbcDriver = MYSQL

Connection parameters are as follows:

Parameter name Description

host

The host address of the MySQL database

port

The port to connect to. Defaults to 3306

database

The name of the database on the MySQL server

username

Optional. The username to use when connecting

password

Optional. The password to use when connecting

Example

jdbc {
    mysql-db {
        connectionName=mysql-db
        connectionParameters {
            database=test
            host=localhost
            password=my-secret-pw
            port="3306"
            username=root
        }
        jdbcDriver=MYSQL
    }
}

MSSQL Server

To configure a MSSQL connection, specify jdbcDriver = MSSQL

Connection parameters are as follows:

Parameter name Description

host

The host address of the MSSQL database

port

The port to connect to. Defaults to 1443

database

The name of the database on the MS SQL server

username

Optional. The username to use when connecting

password

Optional. The password to use when connecting

schema

Optional. The schema to use - defaults to dbo

trustServerCertificate

Optional. Forces Flow to trust the certificate that’s provided by the SQL Server. Defaults to true

encrypt

Optional. Defines if the connection to MSSQL server should be encrypted. Defaults to true

Example

jdbc {
    sqlServerConnection {
        connectionName=sqlServerConnection
        connectionParameters {
            database=Northwind
            encrypt="true"
            host=localhost
            password=ChangeMe
            port="14330"
            schema=dbo
            trustServerCertificate="true"
            username=sa
        }
        jdbcDriver=MSSQL
    }
}

Redshift

To configure a Redshift connection, specify jdbcDriver = REDSHIFT

Connection parameters are as follows:

Parameter name Description

host

The host address of the Redshift database

port

The port to connect to. Defaults to 5439

database

The name of the database on the Redshift server

username

Optional. The username to use when connecting

password

Optional. The password to use when connecting

Example

jdbc { // The root element for database connections
   another-connection { // Defines a connection called "another-connection"
      connectionName = another-connection // The name of the connection.  Must match the key used above.
      jdbcDriver = REDSHIFT // Defines the driver to use.  See below for the possible options
      connectionParameters { // A list of connection parameters.  The actual values here are defined by the driver selected.
         database = transactions // The name of the database
         host = our-db-server // The host of the database
         password = super-secret // The password
         port = "2003" // The port
         username = jack // The username to connect with
      }
   }
}

Snowflake

To configure a Snowflake connection, specify jdbcDriver = SNOWFLAKE

Connection parameters are as follows:

Parameter name Description

account

The name of the Snowflake account

schema

The name of the schema to connect to

db

The name of the database to connect to

warehouse

The name of the warehouse where the snowflake db exists

username

The username to use when connecting

password

The password to use when connecting

role

The role to specify when connecting

Example

jdbc { // The root element for database connections
   another-connection { // Defines a connection called "another-connection"
      connectionName = another-connection // The name of the connection.  Must match the key used above.
      jdbcDriver = SNOWFLAKE // Defines the driver to use.  See below for the possible options
      connectionParameters { // A list of connection parameters.  The actual values here are defined by the driver selected.
        account = mySnowflakeAccount123.eu-west-1
        schema = public
        db = demo_db
        warehouse = COMPUTE_WH
        schema = public
        role = QUERY_RUNNER
      }
   }
}

Kafka connections

Read about defining Kafka connections in the dedicated documentation for Kafka.

AWS connections

AWS connections are stored under the aws element.

Flow uses AWS connections, for example, to connect to SQS for data pipelines and other services as part of query execution.

Flow will use the AWS default credentials provider by default. This means you can configure the access credentials and region with environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY & AWS_DEFAULT_REGION). When running Flow in AWS (e.g. ECS), it’ll also automatically pick up the role used to run the service and use that.

You can also configure the AWS connections manually which can be useful in cases where you need to connect to various different AWS accounts from a single installation of Flow. As with any other config file value, you can either set the value explicitly, or read from an environment variable (as shown).

aws {
    my-aws-account {
        connectionName=my-aws-account
        // Optional Parameter. When not provided Flow will use the https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/credentials.html#credentials-default[AWS default credentials provider] by default.
        accessKey=${?AWS_ACCESS_KEY_ID}
        // Optional Parameter. When not provided Flow will use the https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/credentials.html#credentials-default[AWS default credentials provider] by default.
        secretKey=${?AWS_SECRET_ACCESS_KEY}
        // Mandatory
        region=${AWS_REGION}
        // Optional parameter for development and testing purposes to point to a different endpoint such as a LocalStack installation.
        endPointOverride=${?AWS_ENDPOINT_OVERRIDE}
    }
}

Mongo connections

Read about defining Mongo connections in the dedicated documentation for Mongo.

Test with Localstack

You can point Flow at an AWS mock running on Localstack by specifying the endPointOverride property in the connection.

Continue reading

Continue learning about Flow by querying your data.