This is a prerelease version.

View latest

User-Defined Types in SQL

User-Defined Types (also known as UDTs, nested types, or nested fields) allow you to create custom data types that can be referenced in the CREATE MAPPING statement. UDTs are also useful for creating and accessing hierarchical data structures.

Feature Overview

  • UDTs can only be incorporated into mappings that have the portable, compact, java or avro format, although UDTs themselves are not tied to a specific format.

  • Type options override mapping options. Where both the mapping and type define a schema (Portable class definition, Compact schema, Java class or Avro schema) and the corresponding mapping field is __key or this, the schema defined by the type is used. This means that the mapping does not need to define a schema if the type defines one.

  • If the type does not specify a schema, it is resolved from the parent structure, which may be a mapping or another type.

  • Type fields are optional and if not specified, they are resolved from the class/schema when the type is used in a CREATE MAPPING statement. You can check the resolved fields using a GET_DDL query. The field resolution feature has the following limitations:

    • Once the fields are resolved, they are not updated when the underlying class/schema changes or when the type is used in another mapping.

    • For Portable, Compact, or Avro formats, schema fields cannot be complex (PORTABLE, COMPACT, or RECORD). Otherwise, an exception will be thrown once the type is used in a mapping.

    • For Portable format, portable IDs (portableFactoryId, portableClassId and portableVersion) must be unique within the cluster. Otherwise, deserialization issues can occur.

    • For Portable format, the field resolution relies on the Portable class being already registered (by internal means, such as adding a configuration for Portable serialization) on the member executing the CREATE MAPPING command.

    • For Java format, complex class fields are allowed and mapped to OBJECT.

    • For Java format, only the fields declared by the type class are considered. The fields inherited from superclasses are ignored.

  • Cyclic definitions are only supported for Java-serialized structures. Portable and Compact formats do not support cyclic schemas, and Avro support is currently limited to acyclic schemas.

Creating Types

To create a new type, use the CREATE TYPE statement:

CREATE [OR REPLACE] TYPE [IF NOT EXISTS] MyTypeName [(
    colName colType,
    ...
)] [OPTIONS (
    'javaClass'='com.mypackage.MyJavaClass' (1)
  | 'compactTypeName'='MyCompactRecordName' (2)
  | 'portableFactoryId'='123', 'portableClassId'='456', ['portableVersion'='789'] (3)
  | 'avroSchema'='{"type":"record","name":"myType","fields":[{"name":"colName","type":"colType"},...]}' (4)
)]
1 In java format, you can use the javaClass option to override the type class with a subclass. This is necessary if the original type class is abstract or interface, and you want to use INSERT and UPDATE statements.
2 In portable format, you can use the portableFactoryId, portableClassId and portableVersion options to specify a portable ID for the type. This is only effective when the type is used for __key or this fields. portableVersion defaults to 0 if not specified. Required portable ID components must be defined together; incomplete definitions are ignored.
3 In compact format, you can use the compactTypeName option to specify the Compact record name. If unspecified, it defaults to <TypeName>CompactType.
4 In avro format, you can use the avroSchema option to specify an inline Avro schema. This is only effective when the type is used for __key or this fields.
  1. EXTERNAL NAME aliases are not supported for UDTs; column names must have the same name as their corresponding Java class field or Portable/Compact/Avro schema field.

  2. You can mix options that belong to different formats. When you create a mapping that references your UDT, the relevant options are used and the others are ignored, which makes it possible to use a UDT in multiple mappings having different formats.

Replacing Types and Type Consistency

Currently, there is a limitation on the replacement of existing types: if the replaced type was already used in a mapping, you need to recreate that mapping to update its data type information using the DROP MAPPING and CREATE MAPPING statements. However, if the type hierarchy was not used in a mapping, any type in that hierarchy can be safely changed, and these changes will appear in the new mapping. This is because the links between types are symbolic (based on the name only), and they are materialized only when used in a mapping.

Examples

The following classes are used as a reference in the sections below to create types and mappings:

package com.example;

class User implements Serializable {
    public Long id;
    public String name;
    public Organization organization;
}

class Organization implements Serializable {
    public Long id;
    public String name;
    public Office office;
}

class Office implements Serializable {
    public Long id;
    public String name;
}
The name of a type can differ from the one specified in the Java class or Portable/Compact/Avro schema. However, types must have distinct names within the set of names across all mappings and views as they share the same namespace.
CREATE TYPE Organization (
    id BIGINT
    name VARCHAR,
    office Office
) OPTIONS (
    'javaClass'='com.example.Organization'
);

CREATE TYPE Office (
    id BIGINT
    name VARCHAR
) OPTIONS (
    'javaClass'='com.example.Office'
);

Creating Mappings

The organization column is explicitly specified as Organization to prevent it being auto-resolved as a generic OBJECT, and therefore unable to query its sub-columns.
CREATE MAPPING users (
    __key BIGINT,
    id BIGINT,
    name VARCHAR,
    organization Organization
) TYPE IMap OPTIONS (
    'keyFormat'='bigint',
    'valueFormat'='java',
    'valueJavaClass'='com.example.User'
);

Support for Cycles

When creating a UDT, the existence of referenced types is only verified when the type is used in a CREATE MAPPING statement. This makes it possible to create cyclic types.

Cyclic types are only supported for Java format. However, the support is limited only to querying. Inserting or updating with cyclic types is currently not supported.

Enabling Cycling Type Support

You can enable cyclic types by setting the hazelcast.sql.experimental.custom.cyclic.types.enabled property to true in the member configuration. It is disabled by default.

  • XML

  • YAML

  • Java

<hazelcast>
    <properties>
        <property name="hazelcast.sql.experimental.custom.cyclic.types.enabled">true</property>
    </properties>
</hazelcast>
hazelcast:
  properties:
    hazelcast.sql.experimental.custom.cyclic.types.enabled: true
final Config config = new Config();
config.setProperty("hazelcast.sql.experimental.custom.cyclic.types.enabled", "true");

Creating Cyclic Types

Java classes for reference:

package com.example;

class A implements Serializable {
    public String name;
    public B b;
}
class B implements Serializable {
    public String name;
    public C c;
}
class C implements Serializable {
    public String name;
    public A a;
}

The following commands will create an interlinked type hierarchy:

Order of execution of these commands doesn’t matter.
CREATE TYPE AType (
    name VARCHAR,
    b BType
) OPTIONS (
    'javaClass'='com.example.A'
);

CREATE TYPE BType (
    name VARCHAR,
    c CType
) OPTIONS (
    'javaClass'='com.example.B'
);

CREATE TYPE CType (
    name VARCHAR,
    a AType
) OPTIONS (
    'javaClass'='com.example.C'
);

Using Cyclic Types

CREATE MAPPING tableA (
    __key BIGINT,
    name VARCHAR,
    b BType
) OPTIONS (
    'keyFormat'='bigint',
    'valueFormat'='java',
    'valueJavaClass'='com.example.A'
);

CREATE MAPPING tableB (
    __key BIGINT,
    name VARCHAR,
    c CType
) OPTIONS (
    'keyFormat'='bigint',
    'valueFormat'='java',
    'valueJavaClass'='com.example.B'
);

CREATE MAPPING tableC (
    __key BIGINT,
    name VARCHAR,
    a AType
) OPTIONS (
    'keyFormat'='bigint',
    'valueFormat'='java',
    'valueJavaClass'='com.example.C'
);

Querying Support

Querying is provided with the field access operator, which has the following syntax:

(<mappingColumn>).typeAColumn.typeBColumn.typeCColumn

mappingColumn must be the top-level column inside a mapping that has a UDT as its type, whereas typeAColumn,typeBColumn and typeCColumn are all columns within the UDTs.

  1. The parentheses around mappingColumn are required.

  2. typeAColumn, typeBColumn and typeCColumn must be defined in their corresponding UDTs. Otherwise, the query will fail even if the underlying object contains fields with these names.

Examples

Querying Acyclic Types

Following examples use users mapping, and Organization and Office types.

Basic querying:

SELECT (organization).office.name FROM users

Selecting whole sub-object:

SELECT (organization).office FROM users
  1. When selecting the entire object, the query will always try to return the underlying object verbatim. For Java-serialized types, this means returning an underlying Java class instance, which can fail with a ClassNotFoundException if the class is not in the classpath of the client (or embedded server) JVM. To avoid this, you can select individual fields instead. This issue does not apply to Portable- or Compact-serialized types, as sub-objects in these mappings and types are GenericRecord subclasses; PortableGenericRecord and CompactGenericRecord are present in the base distribution of Hazelcast.

  2. For Avro-serialized types, the returned objects are subclasses of org.apache.avro.generic.GenericRecord, whose (de)serialization is supported by Java clients only.

Using projections:

SELECT (organization).id * 1000, ABS((organization).office.id) FROM users

Projections work as usual since field access expressions have virtually the same semantics and possible usage contexts as normal column projections.

Querying Cyclic Types

The following examples use AType.

package com.example;

class Wrapper {
    public A root;
}
CREATE MAPPING test (
    __key BIGINT,
    root AType
) TYPE IMap OPTIONS (
    'keyFormat'='bigint',
    'valueFormat'='java',
    'valueJavaClass'='com.example.Wrapper'
)

Assuming the following data is present in the table:

test table content

__key BIGINT root AType

1

<a1>

2

<a2>

A class instances

A1

// Cyclic structure where C1 references the root - A1.
// A1 -> B1 -> C1 -> [A1]
final A a1 = new A();
a1.b = new B();
a1.b.c = new C();
// loop back to A1
a1.b.c.a = a1;

a1.name = "A1";
a1.b.name = "B1";
a1.b.c.name = "C1";

A2

// Cyclic structure with additional chain with loop back to A2.
// A2 -> B2 -> C2 -> A3 -> C3 -> [A2]
final A a2 = new A();
a2.b = new B();
a2.b.c = new C();
a2.b.c.a = new A();
a2.b.c.a.b = new B();
a2.b.c.a.b.c = new C();
// loop back to A2
a2.b.c.a.b.c.a = a2;

a2.name = "A2";
a2.b.name = "B2";
a2.b.c.name = "C2";
a2.b.c.a.name = "A3";
a2.b.c.a.b.name = "B3"
a2.b.c.a.b.c.name = "C3"

Basic query:

SELECT
    (root).name AS v1,
    (root).b.name AS v2,
    (root).b.c.name AS v3,
    (root).b.c.a.name AS v4
FROM test
WHERE __key = 1

Result:

v1 VARCHAR v2 VARCHAR v3 VARCHAR v4 VARCHAR

'A1'

'B1'

'C1'

'A1'

Cyclic chain:

SELECT
    (root).b.c.a.b.c.a.b.c.a.b AS v1,
FROM test
WHERE __key = 1

Result:

v1 VARCHAR

'B1'

Accessing additional cyclic chain:

SELECT
    (root).b.c.a.name AS v1,
    (root).b.c.a.b.name AS v2,
    (root).b.c.a.b.c.name AS v3,
    (root).b.c.a.b.c.a.name AS v4
FROM test
WHERE __key = 2

Result:

v1 VARCHAR v2 VARCHAR v3 VARCHAR v4 VARCHAR

'A3'

'B3'

'C3'

'A2'

INSERT and UPDATE Support

INSERT and UPDATE queries are supported in a limited way, specifically:

  • INSERT and UPDATE queries are disabled for mappings that reference cyclic UDTs anywhere in the type hierarchy.

  • INSERT queries require specifying the full list of columns even if the column of a nested type needs to be set to NULL.

  • UPDATE queries only work on the root column and also require the full list of columns and sub-columns to work. Updating sub-columns is technically possible by specifying column projections in place of sub-columns that shouldn’t be changed.

  • Both UPDATE and INSERT queries use the Row Value expression, which is similar to the VALUES clause of an INSERT query.

Examples

The following examples use users mapping, and Organization and Office types.

The order of column values must be the same as the order of columns specified when executing the CREATE MAPPING and CREATE TYPE statements.

Basic insertion of UDT column:

INSERT INTO users VALUES (1, 'testUser', (1, 'organization1', (1, 'office1')))

Skipping initialization of certain columns:

(organization).name and (organization).office.id are null in this example.

INSERT INTO users VALUES (1, 'testUser', (1, null, (null, 'office1')))

Replacing whole column:

UPDATE users SET organization = (2, 'organization2', (2, 'office2'))

Replacing nested column value:

UPDATE users SET organization = ((organization).id, (organization).name, ((organization).office.id, 'new-office-name'))
When updating UDT columns, a value must be provided for every column in the UDT and its child UDTs unless it needs to be set to null. If a full list of columns is not provided, a query validation error occurs.

Inserting with query parameter:

final Office office = new Office();
office.id = 1L;
office.name = "office1";

final Organization organization = new Organization();
organization.id = 1L;
organization.name = "organization1";
organization.office = office;

hz.getSql().execute("INSERT INTO users VALUES (1, 'user1', ?)", organization);
For Avro-serialized types, the query parameters must be subclasses of org.apache.avro.generic.GenericRecord, whose (de)serialization is supported by Java clients only.

Updating with query parameter:

Using organization from the example above.

hz.getSql().execute("UPDATE users SET organization = ?", organization);

Updating nested UDT column with query parameter:

hz.getSql().execute("UPDATE users SET organization = ((organization).id, (organization).name, ?)", office);

Upgrade Notes

Enterprise Edition

When performing a normal or rolling upgrade from version 5.3 to 5.4, you must drop all user-defined types and mappings with UDTs before the upgrade, and recreate them with the new semantics after upgrading.