ClickHouse
Time-travel support
Time-travel is not supported by ClickHouse. This means that updating applications based on this Database are subject to data desynchronization in case of changes in the Database.
Read this page to learn more about the visible effects of this desynchronization.
Vectors support
ClickHouse handles vectors natively, via the Array
type in the model and thanks to special functions at query time.
Thus, in DirectQuery, this is the only supported mode for dealing with this data type.
Table engines support
Among the MergeTree family, Atoti supports only the simple MergeTree
table engine (as well as it's Replicated or Distributed variant).
The reason behind is that more complex table engines (such as AggregatingMergeTree
) require more complex SQL queries
to return accurate results.
See more information on table engines.
Supported types
The following Clickhouse data types are supported by Atoti, i.e. they are automatically read and converted to their Java equivalent:
- UInt8, UInt16, UInt32, UInt64, UInt128, UInt256
- Int8, Int16, Int32, Int64, Int128, Int256
- Float32, Float64
- Decimal, Decimal32, Decimal64, Decimal128, Decimal256
- Bool
- String, FixedString
- Date, Date32
- DateTime, DateTime32, DateTime64
- Array (only arrays of scalar are supported, not arrays of arrays)
- Enum, Enum8, Enum16, UUID, IPv4, IPv6 (they are converted to String)
Other types can only be used with a custom type reader for the specific columns using an unsupported type.
Gotchas
Null values as a join result (join_use_nulls=1)
Our Clickhouse connector uses the SQL standard regarding the handling of null values as a result of a join.
For example on a left join some missing keys in the right table will result in null value in returned rows.
However, the default behavior in ClickHouse is to fill empty cells with default values (based on the type) instead of null.
If querying outside our java connector, in a standard sql client for instance, you can set this property adding settings join_use_nulls=1
after your select query.
Nullable fields
ClickHouse can define fields with nullable types.
DirectQuery is capable of detecting this and defines its internal model accordingly.
While this is not a problem in itself, it can conflict with the rule in Atoti cubes that all levels must be based on non-nullable values.
This does not create an issue of any sort as the local model is updated behind the scene, assigning a default value based on the type.
It has a side effect on the query performance, as DirectQuery must convert on-the-fly null values to their assigned default values, as well as adding extra conditions during joins to handle null values on both sides.
Flatten the data model
Atoti Datastore technology has highly optimized joins. This is not the case in ClickHouse, that has general purpose joins.
As ClickHouse is very fast at handling wide tables (since it's stored by columns), it is advised to use as few tables as possible and flatten the data model.
Dictionaries are an alternative to joined tables for small cardinalities. They are supported by DirectQuery and presented as standard tables.
One notable drawback of using a flat model is that building one hierarchy requires a full scan of a large table. As a consequence, the more non-virtual hierarchies a project has, the longer it takes to have a project ready to use.
Foreign keys
Joins between tables perform faster when the keys are of an integer type (such as UInt64
). It's advised to avoid other types, such as String
.