Query with Flow

Flow focuses on query for data based on its meaning, rather than which system provides it. This allows services to change, and data to move, without requiring consumers to update their queries.

Write queries

Queries are written in TaxiQL, an open source query language for data.

TaxiQL is a powerful query language, and the Taxi documentation has details on the syntax, which we haven’t duplicated here.

TaxiQL is agnostic of where data comes from - it’s left to Flow to discover data from the various sources that have been connected.

Here are some sample queries:

// Find all the movies
find { Movie[] }

// Find all the movies, enriching and projecting them to a different structure
find { Movie[] } as {
   title : MovieTitle
   director : DirectorName
   rating : RottenTomatoesScore
}[]

Projections

Projections are a way of taking data from one place, then transforming and combining it with other data sources.

Flow uses the information present on the object being projected in order to call services and find other information.

For example, you can project from a pre-defined type to another predefined type - such as:

model Purchase {
   transactionId : TransactionId
   customerId : CustomerId
}

model CustomerTransaction {
  transactionId : TransactionId
  name : CustomerName
  price : Price
}

find { Purchases[] } as CustomerTransaction[]

It’s also very common to project from a predefined type to a type defined inline within your query (called 'anonymous types'). For example:

model Purchase {
   transactionId : TransactionId
   customerId : CustomerId
}

find { Purchases[] }
as {
  // Projections let you change field names, and reshape objects as required
  txn: TransactionId
  // Not present on the original Purchase object, so try to
  // find it using something we already know (in this case, the CustomerId)
  customerName: CustomerName
}

Projection scopes

Creating a projection requires the syntax something as { …​ }, which defines a projection scope:

find { Customer } as { // start a projection scope, containing a customer
   // ... omitted ...
}

If you’re projecting an array, then each item within the array is projected separately:

find { Customer[] } as { // start a projection scope.
   // Within the scope, each item is an individual Customer instance
   firstName : FirstName
}[] // Be sure to include the array marker at the end, as the object is an array.

If you need to, you can assign a name to the item within the scope. This can be useful for nested scopes, or controlling inputs into a function. For example:

find { Customer[] } as (customer:Customer) -> {
   // referencing a field by it's name on Customer
   firstName : upperCase(customer.firstName)
   // referencing as field by it's type on Customer
   lastName : upperCase(customer::LastName)

   // using a variable within a constraint:
   purchases: Purchase[](CustomerId == customer::CustomerId)
   // without referring to a variable, Age is resolved against all variables in scope
   age : Age
}[]

Or, when using nested scopes:

find { Customer[] } as (customer:Customer) -> {
  name : CustomerName
  orders : Order[] as (order:Order) -> {
     ageAppropriate : Boolean = customer.age >= order.recommendedAge
  }
}

Or, to provide inputs into functions:

model Film {
   title : FilmTitle inherits String
   headliner : ActorId
   cast: Actor[]
}

find { Film[] } as (film:Film) -> {
   title : FilmTitle
   // singleBy selects a single item from
   // an array (film.cast, an array of Actor), that matches a predicate
   // (in this case, there the actor ID is the same as the headliner id)
   star : singleBy(film.cast, (Actor) -> Actor::ActorId, film.headliner) as (actor:Actor) -> {
      name : actor.name
      title : film.title
   }
}[]

Use variables to modify what’s in scope

By default, the scope contains the entire source object. For example:

model Film {
  title : Title
  cast : Actor[]
}

find { Film } as {
   // this scope contains an entire film record
}

You can modify this by specifying the type of the variable in scope:

find { Film } as (Actor[]) -> { // note that film has been removed from the scope...
  title : Title //... therefore title isn't knowable -- this field will return null
  actorName : ActorName
}[]

You can also use functions to further reduce the scope:

find { Film } as (first(Actor[])) -> {
   // Now, the scope only contains a single actor
   headliner : ActorName
} //  We're not projecting an array anymore, so no aray marker here

Finally, if the data defined in the scope isn’t available on the source, Flow triggers a query to find it. For example:

// Define a few models
model Film {
   id : FilmId inherits Int
   title : Title inherits String
}

model Actor {
   name : ActorName inherits String
}

model Cast {
   actors : Actor[]
}

// And some services that return them
service Films {
   operation getFilm():Film
   operation getCast(FilmId):Cast
}

// Here's a query:
find { Film } as (Actor[]) -> {
  actorName : Name
  filmTitle : Title // should be null, as it's out-of-scope on Actor
}[]

In the above query:

  • getFilm() is called, to fetch the Film

  • The projection requests an Actor[] in the scope, which isn’t available, so…​

    • A call to getCast() is made, passing the FilmId to fetch the Actor[]

    • Because it’s an array, each Actor within the array of Actor[] is projected individually

    • actorName is read from the name field on Actor, because the requested field asks for the type Name

    • filmTitle is out-of-scope, so returned as null

Declare multiple variables in scope

In the previous example, we saw that filmTitle was returned as null, because the Film was removed from scope.

To run the same query with Film in scope, simply add it to the projection:

find { Film } as (Film, Actor[]) -> {
  actorName : Name
  filmTitle : Title // Title is now discoverable, as Film is in scope
}[]

Data discovery rules

When projecting, Flow will use information present on the source object to discover data on the target object.

Data can be fetched from a single operation that returns the value, or by invoking a chain of operations to return the value.

Operations with @Id fields on return types

If the result of an operation is an object that exposes an @Id field, then only operations which accept that @Id field as an input will be called. For example:

model Customer {
  @Id
  customerId: CustomerId
  name: CustomerName
}

service CustomerService {
  // Can be called when projecting, because
  // Customer has an @Id of type CustomerId
  findCustomer(CustomerId): Customer

  // Cannot be called when projecting, because
  // Customer has an @Id, and it isn't CustomerName
  findCustomerByName(CustomerName): Customer
}

Operations without @Id fields on return types

If the result of an operation is an object that does not expose an @Id field, then it can be called with any information available.

Fill in nulls

By default, if a service returns a null value, Flow will accept it 'as is'.

However, if a query annotates a field on a projection type with @FirstNotEmpty, Flow will attempt to populate values by invoking operations to populate the missing values.

Flow will execute a search using the other values present on the entity being projected as potential inputs to operations, and build a path to populate the missing values.

Operations are invoked following the standard Data Discovery Rules.

Understand caching in Flow

By default, Flow does not maintain a long-lived cache between operations, but you can add one by configuring an external cache, such as Hazelcast.

Without an external cache, Flow caches operation calls for the lifetime of a query. This prevents the same operation being invoked repeatedly while projecting multiple rows in a result.

When caching, responses are cached for a given operation and set of inputs. If an operation is invoked with different parameters, the cache is not used.

Operations that return an array of results and which return more than 10 values, will not have their responses cached.

Recover from failure

If an operation returns an error while Flow is attempting to execute a query, then it is excluded from being invoked with the same parameters again. This exclusion is scoped to the query only, and expires at the end of the query.

After excluding the operation, Flow will attempt to find another path to return the value being discovered.

Expressions in queries

Taxi allows the definition of expressions on both types and fields, but doesn’t provide an evaluation engine - that’s where Flow comes in.

Typically, expressions are used in a projection within a query.

You can also use them on a model to expose derived information when a model is parsed by Flow (e.g., when returned from a service), but that’s less common. So, while this documentation focuses on query projections, you can do everything here on a model too.

Write an expression in a projection

Expressions can be defined in the fields of a projected result from a query:

find { Flights[] }
as {
  flightNumber : FlightNumber
  totalSeatsAvailable : TotalSeats
  soldSeats : SoldSeats
  remainingSeats : Int = (this.totalSeatsAvailable - this.soldSeats)
}

Expressions can be defined in two ways: on a field, or on a type.

Expressions on a field

// Expression types on a field:
find { Flights[] }
as {
  flightNumber : FlightNumber
  totalSeatsAvailable : TotalSeats
  soldSeats : SoldSeats
  // field expressions can be defined EITHER using field references...
  remainingSeats : Int = (this.totalSeatsAvailable - this.soldSeats)
  // ...or type references...
  remainingSeats : Int = (TotalSeats - SoldSeats)
}

Expressions on a type

To encapsulate common expressions, you can define a type with the expression:

// Expression type:
type RemainingSeats = TotalSeats - SoldSeats

// Which is then used on a projection:
find { Flights[] }
as {
  flightNumber : FlightNumber
  totalSeatsAvailable : TotalSeats
  soldSeats : SoldSeats
  remainingSeats : RemainingSeats
}

Unlike field expressions, type expressions cannot use field names, and can only reference other types.

How Flow discovers values to evaluate expressions

When Flow is evaluating an expression, it first looks on the source object being projected for the input values into the expression.

If any inputs are not available, then Flow will perform a search using the current data available on the source object in an attempt to look up the value.

Submit queries

Generally, developers will use the UI to write and test their queries, then integrate using Flow’s REST API.

REST API

Queries to Flow are submitted to the /api/taxiql endpoint:

curl 'http://localhost:9021/api/taxiql' \
  -H 'Content-Type: application/taxiql' \
  --data-raw 'find { Movie[] }'

A word about content type

Strictly speaking, the content type for TaxiQL queries is application/taxiql. However, the Flow server will accept TaxiQL queries with any of the following content types headers:

  • Content-Type: application/json

  • Content-Type: application/taxiql

  • Content-Type: text/plain

This is to allow broad compatibility with clients.

Large queries with server-sent events

Running large queries can result in out-of-memory errors if Flow is holding the result set in memory.

To address this, Flow supports pushing results over server-sent events. To consume a query as a server-sent event, set the Accept header to text/event-stream:

curl 'http://localhost:9021/api/taxiql' \
  -H 'Accept: text/event-stream' \
  -H 'Content-Type: application/taxiql' \
  --data-raw 'find { Movie[] }'

Results are pushed out from Flow as they are available.

Include type metadata in responses

Flow can include type metadata in the responses being sent back.

To enable this, append ?resultMode=TYPED to the API call:

curl 'http://localhost:9021/api/taxiql?resultMode=TYPED' \
  -H 'Accept: text/event-stream' \
  -H 'Content-Type: application/taxiql' \
  --data-raw 'find { Movie[] }'

Define output formats

By default, Flow serves results to queries as JSON.

This can be configured to customize the result format.

With Accept headers

The following Accept headers are supported:

Header Result type

application/json

JSON

application/csv

CSV

text/event-stream

JSON with server-sent events

Control output formats

By default, data is written in JSON format. However, this can be controlled by placing an annotation on the model defining the output of a query.

For example:

import flow.formats.Csv

@Csv(delimiter = "|", nullValue = "NULL")
model Person {
   firstName : FirstName inherits String
   lastName : LastName inherits String
   age : Age inherits Int
}

// Query:
// Response type (Person) contains a CSV format defined,
// which will be considered when writing responses.
find { Customer[] }
as { Person[] }

The following formats are supported:

Continue reading

Continue learning about Flow by setting up your production environment.