Configure Flow

Flow can be configured through a series of config files, passed at runtime.

We designed Flow this way to encourage scripted deployments (as opposed to using a config database, which is hard to script).

Flow uses HOCON format for its config files. (HOCON is a superset of JSON, so JSON is also valid.)

Configuration basics

There are several common themes in how Flow configuration works.

Create reasonable defaults

If you don’t provide a config file, Flow will create one for you on startup, and populate it with reasonable defaults.

Environment variables and sensitive data

Often configuration files need to contain sensitive data (usernames, passwords, etc).

It may not always be desirable to specify sensitive connection information directly in the config file, especially if these are being checked into source control.

Environment variables can be used anywhere in the config file, following the HOCON standards.

For example:

jdbc {
   some-database-connection {
      connectionName = some-database-connection
      # .. other params omitted for brevity ..
      connectionParameters {
         # .. other params omitted for brevity ..
         password = ${postgres_password} # Reads the environment variable "postgres_password"
      }
   }
}

Correctly handle substitutions in URLs

In some config files, you’ll need to specify URLs, and you’ll want to use variables. Using a variable inside a URL using HOCON can be tricky.

In short, here’s how substitutions need to be defined:

query-server {
   // The Url has special characters (:), so needs to be inside of quotes.
   // However, variable substitution doesn't work inside of quotes,
   // so the variable must be outside of quotes.
   url="http://"${MY_VARIABLE}":9305"
}

For more information, see this issue in the HOCON library.

Pass Flow application configuration

There are several configuration settings referenced throughout these docs, which can be used to fine tune how Flow behaves.

All config settings can be passed in a variety of ways:

Container

In a container or Docker Compose file, pass variables using the OPTIONS environment variable:

services:
   flow:
      image: flow/flow:latest
      environment:
         OPTIONS: >-
            --flow.db.username=flow
            --flow.db.password=changeme
            --flow.db.host=postgres

Set as environment variables

Alternatively, any setting can be defined as an environment variable.

Environment variables follow a different convention so, to convert, apply the following:

  • Use uppercase

  • Replace dashes and dots with underscores

For example:

Application property Environment variable

flow.db.username

FLOW_DB_USERNAME

flow.workspace.config-file

FLOW_WORKSPACE_CONFIG_FILE

Configure Git repositories

In production, it’s common to read (and write) from a Git repository, rather than a local file system.

This allows the use of standard Git workflows to promote changes to 'production'.

The schema server will periodically poll a Git repository and branch, and pull in any changes as they’re detected.

Use access tokens

GitHub and GitLab support embedding access tokens within the URL of the Git repository.

Here’s an example:

git {
   checkoutRoot="/some/path"
   repositories=[
      {
         branch=master
         name=my-taxonomy
         uri="https://jimmy:glpat-purple-tortoise@gitlab.com/acme/acme-taxonomy"
      }
   ]
}