DirectQuery properties

Overview

These are the properties used to configure Atoti Market Risk when using DirectQuery.

Module properties

Key Description Default value
directquery.clustering.clustering-per-table Map to define the clustering fields per table. If for a given table no fields are defined via this property, the list of fields defined in the fields property is used.
directquery.clustering.enabled If clustering is enabled or not false
directquery.clustering.fields Fields to specify the Clustering on
directquery.database.databricks.connection-string The JDBC connection string
directquery.database.databricks.database The used remote database name
directquery.database.databricks.emulated-time-travel-description.column-type The type used for the {@link #emptyTableVersion} and {@link #validRowToColumnValue} values as {@link com.activeviam.database.api.types.ILiteralType} (string, double, …)
directquery.database.databricks.emulated-time-travel-description.empty-table-version

The version value for an empty table.

It is not possible to determine the version of a currently empty table with a discovery as there is no value. However, we still need to use time travel filters because new data can be inserted in future versions, and we don’t want it to impact the versions before insertion. To guarantee consistency, this value must be set to a value that is lower than all the future version numbers or timestamps, such as 0 or a timestamp in the past.

directquery.database.databricks.emulated-time-travel-description.time-travel-map Map of the tables with emulated time-travel.
directquery.database.databricks.emulated-time-travel-description.valid-row-to-column-value The value of the to column while the row is valid. Typical values would be null or a max value.
directquery.database.databricks.external-database-query-timeout Set the external database query timeout and validates user input validation. 1h
directquery.database.databricks.feeding-query-timeout

Timeout (in seconds) which can be set on a query running on the external database during the feeding.

Feeding queries are the ones run during the initial loading to feed the aggregate providers and hierarchies, and also during the refresh operations.

It defaults to 1 hour and cannot be less than a second.

This limit is only for external databases and does not impact queries on datastore.

Warning: increasing this value might result in longer queries and higher costs.

directquery.database.databricks.get-by-key-behavior

While Get-by-key queries performed on the datastore are very fast, they can take a long time on external databases, solely due to the networking time. When performing a Get-by-key query through DirectQuery, the possible behaviors are denying, warning, or allowing the query to run.

Defaults to {@link IDatabaseSettings#DEFAULT_GET_BY_KEY_QUERY_BEHAVIOR}.

directquery.database.databricks.gross-sum-native-array-udaf-name

Optional User Defined Aggregate Function name registered on Databricks to perform an element-wise gross sum on vectors (i.e. the sum of absolute values)

Check the Databricks Connector Atoti documentation to see how to define this function.

directquery.database.databricks.heavy-load-connection-string Optional special JDBC connection string used for feeding queries. See the Databricks Connector Atoti documentation to see how to use this feature.
directquery.database.databricks.jdbc-properties Additional properties used by the connection
directquery.database.databricks.long-sum-native-array-udaf-name

Optional User Defined Aggregate Function name registered on Databricks to perform an element-wise long sum on vectors (i.e. the sum of positive values)

Check the Databricks Connector Atoti documentation to see how to define this function.

directquery.database.databricks.max-sub-queries-allowed-in-multi-step-query

A query clustered by too many fields or by fields with a high cardinality can be divided in lots of sub queries. To avoid creating too many queries on the external database, a limit is set on the number of those sub queries. This property sets the limit.

It is recommended to be conservative when increasing the limit. See documentation for more details.

Defaults to {@value IDatabaseSettings#DEFAULT_MAX_SUB_QUERIES_ALLOWED_IN_MULTI_STEP_QUERY}.

directquery.database.databricks.password The password used by the connection
directquery.database.databricks.schema The used remote schema name
directquery.database.databricks.short-sum-native-array-udaf-name

Optional User Defined Aggregate Function name registered on Databricks to perform an element-wise short sum on vectors (i.e. the sum of negative values)

Check the Databricks Connector Atoti documentation to see how to define this function.

directquery.database.databricks.sum-native-array-udaf-name

Optional User Defined Aggregate Function name registered on Databricks to perform an element-wise sum on vectors.

Check the Databricks Connector Atoti documentation to see how to register this function.

directquery.database.databricks.sum-scalar-product-native-array-udaf-name

Optional User Defined Aggregate Function name registered on Databricks to perform an element-wise vector sum multiplied by a scalar product on each row.

Check the Databricks Connector Atoti documentation to see how to define this function.

directquery.database.databricks.time-travel-discovery-query-timeout

The timeout for the querying time-travel metadata during the discovery process.

Metadata querying is slower than regular querying, so this allows to have a different timeout.

directquery.database.databricks.time-travel-policy

Databricks view do not support time-travel queries. By default, if some tables in the schema are views, time-travel is disabled, but can be activated for mixed data model containing both tables and views, at the risk of data desynchronization as queries on views might return newly inserted/updated data.

Default value is {@code TimeTravelPolicy#STRICT}

directquery.database.databricks.use-clustering-fields

Returns whether clustering fields should be used for all queries or only for {@link QueryTags#CATEGORY_FEEDING feeding queries}.

Defaults to {@link IDatabaseSettings#DEFAULT_USE_CLUSTERING_FIELDS}.

directquery.database.mssql.connect-retry-count Number of connection retries 5
directquery.database.mssql.database The database name
directquery.database.mssql.emulated-time-travel-description.column-type The type used for the {@link #emptyTableVersion} and {@link #validRowToColumnValue} values as {@link com.activeviam.database.api.types.ILiteralType} (string, double, …)
directquery.database.mssql.emulated-time-travel-description.empty-table-version

The version value for an empty table.

It is not possible to determine the version of a currently empty table with a discovery as there is no value. However, we still need to use time travel filters because new data can be inserted in future versions, and we don’t want it to impact the versions before insertion. To guarantee consistency, this value must be set to a value that is lower than all the future version numbers or timestamps, such as 0 or a timestamp in the past.

directquery.database.mssql.emulated-time-travel-description.time-travel-map Map of the tables with emulated time-travel.
directquery.database.mssql.emulated-time-travel-description.valid-row-to-column-value The value of the to column while the row is valid. Typical values would be null or a max value.
directquery.database.mssql.encrypt Is the connection encrypted false
directquery.database.mssql.external-database-query-timeout Set the external database query timeout and validates user input validation. 1h
directquery.database.mssql.feeding-query-timeout

Timeout (in seconds) which can be set on a query running on the external database during the feeding.

Feeding queries are the ones run during the initial loading to feed the aggregate providers and hierarchies, and also during the refresh operations.

It defaults to 1 hour and cannot be less than a second.

This limit is only for external databases and does not impact queries on datastore.

Warning: increasing this value might result in longer queries and higher costs.

directquery.database.mssql.get-by-key-behavior

While Get-by-key queries performed on the datastore are very fast, they can take a long time on external databases, solely due to the networking time. When performing a Get-by-key query through DirectQuery, the possible behaviors are denying, warning, or allowing the query to run.

Defaults to {@link IDatabaseSettings#DEFAULT_GET_BY_KEY_QUERY_BEHAVIOR}.

directquery.database.mssql.host-name The database hostname
directquery.database.mssql.max-sub-queries-allowed-in-multi-step-query

A query clustered by too many fields or by fields with a high cardinality can be divided in lots of sub queries. To avoid creating too many queries on the external database, a limit is set on the number of those sub queries. This property sets the limit.

It is recommended to be conservative when increasing the limit. See documentation for more details.

Defaults to {@value IDatabaseSettings#DEFAULT_MAX_SUB_QUERIES_ALLOWED_IN_MULTI_STEP_QUERY}.

directquery.database.mssql.password The user password for this connection
directquery.database.mssql.port The database connection port 0
directquery.database.mssql.schema The schema name dbo
directquery.database.mssql.time-travel-discovery-query-timeout

The timeout for the querying time-travel metadata during the discovery process.

Metadata querying is slower than regular querying, so this allows to have a different timeout.

directquery.database.mssql.trust-server-certificate Does SSL use a truststore for certificates true
directquery.database.mssql.use-clustering-fields

Returns whether clustering fields should be used for all queries or only for {@link QueryTags#CATEGORY_FEEDING feeding queries}.

Defaults to {@link IDatabaseSettings#DEFAULT_USE_CLUSTERING_FIELDS}.

directquery.database.mssql.username The userName used by this connection
directquery.database.snowflake.additional-options Any additional properties that need t be set
directquery.database.snowflake.connection-string The JDBC connection string
directquery.database.snowflake.database The database where the schema is located, it will also feed SFSessionProperty.DATABASE
directquery.database.snowflake.emulated-time-travel-description.column-type The type used for the {@link #emptyTableVersion} and {@link #validRowToColumnValue} values as {@link com.activeviam.database.api.types.ILiteralType} (string, double, …)
directquery.database.snowflake.emulated-time-travel-description.empty-table-version

The version value for an empty table.

It is not possible to determine the version of a currently empty table with a discovery as there is no value. However, we still need to use time travel filters because new data can be inserted in future versions, and we don’t want it to impact the versions before insertion. To guarantee consistency, this value must be set to a value that is lower than all the future version numbers or timestamps, such as 0 or a timestamp in the past.

directquery.database.snowflake.emulated-time-travel-description.time-travel-map Map of the tables with emulated time-travel.
directquery.database.snowflake.emulated-time-travel-description.valid-row-to-column-value The value of the to column while the row is valid. Typical values would be null or a max value.
directquery.database.snowflake.external-database-query-timeout Set the external database query timeout and validates user input validation. 1h
directquery.database.snowflake.feeding-query-timeout

Timeout (in seconds) which can be set on a query running on the external database during the feeding.

Feeding queries are the ones run during the initial loading to feed the aggregate providers and hierarchies, and also during the refresh operations.

It defaults to 1 hour and cannot be less than a second.

This limit is only for external databases and does not impact queries on datastore.

Warning: increasing this value might result in longer queries and higher costs.

directquery.database.snowflake.feeding-warehouse The feeding warehouse
directquery.database.snowflake.get-by-key-behavior

While Get-by-key queries performed on the datastore are very fast, they can take a long time on external databases, solely due to the networking time. When performing a Get-by-key query through DirectQuery, the possible behaviors are denying, warning, or allowing the query to run.

Defaults to {@link IDatabaseSettings#DEFAULT_GET_BY_KEY_QUERY_BEHAVIOR}.

directquery.database.snowflake.gross-sum-native-array-udaf-name

Optional User Defined Aggregate Function name registered on Snowflake to perform an element-wise gross sum on vectors (i.e. the sum of absolute values)

Check the Snowflake Connector Atoti documentation to see how to define this function.

N-B: Defining UDAFs on Snowflake is a feature on private preview. It might not be available on your Snowflake account.

directquery.database.snowflake.long-sum-native-array-udaf-name

Optional User Defined Aggregate Function name registered on Snowflake to perform an element-wise long sum on vectors (i.e. the sum of positive values)

Check the Snowflake Connector Atoti documentation to see how to define this function.

N-B: Defining UDAFs on Snowflake is a feature on private preview. It might not be available on your Snowflake account.

directquery.database.snowflake.max-result-set-size The max size of the snowflake result set when split by Snowflake (in Bytes). 0
directquery.database.snowflake.max-sub-queries-allowed-in-multi-step-query

A query clustered by too many fields or by fields with a high cardinality can be divided in lots of sub queries. To avoid creating too many queries on the external database, a limit is set on the number of those sub queries. This property sets the limit.

It is recommended to be conservative when increasing the limit. See documentation for more details.

Defaults to {@value IDatabaseSettings#DEFAULT_MAX_SUB_QUERIES_ALLOWED_IN_MULTI_STEP_QUERY}.

directquery.database.snowflake.password The password for this connection, it will fill the SFSessionProperty.PASSWORD entry
directquery.database.snowflake.role The user role used for the database access, it will fill SFSessionProperty.ROLE
directquery.database.snowflake.schema The schema used by direct query, it will also feed SFSessionProperty.SCHEMA
directquery.database.snowflake.short-sum-native-array-udaf-name

Optional User Defined Aggregate Function name registered on Snowflake to perform an element-wise short sum on vectors (i.e. the sum of negative values)

Check the Snowflake Connector Atoti documentation to see how to define this function.

N-B: Defining UDAFs on Snowflake is a feature on private preview. It might not be available on your Snowflake account.

directquery.database.snowflake.sum-native-array-udaf-name

Optional User Defined Aggregate Function name registered on Snowflake to perform an element-wise sum on vectors.

Check the Snowflake Connector Atoti documentation to see how to define this function.

N-B: Defining UDAFs on Snowflake is a feature on private preview. It might not be available on your Snowflake account.

directquery.database.snowflake.sum-scalar-product-native-array-udaf-name

Optional User Defined Aggregate Function name registered on Snowflake to perform an element-wise vector sum multiplied by a scalar product on each row.

Check the Snowflake Connector Atoti documentation to see how to define this function.

N-B: Defining UDAFs on Snowflake is a feature on private preview. It might not be available on your Snowflake account.

directquery.database.snowflake.time-travel-discovery-query-timeout

The timeout for the querying time-travel metadata during the discovery process.

Metadata querying is slower than regular querying, so this allows to have a different timeout.

directquery.database.snowflake.use-cache USE_CACHED_RESULT_PROPERTY true
directquery.database.snowflake.use-clustering-fields

Returns whether clustering fields should be used for all queries or only for {@link QueryTags#CATEGORY_FEEDING feeding queries}.

Defaults to {@link IDatabaseSettings#DEFAULT_USE_CLUSTERING_FIELDS}.

directquery.database.snowflake.use-native-time-travel

Use the native Snowflake time travel

Defaults to {@link SnowflakeDatabaseSettings#DEFAULT_USE_NATIVE_TIME_TRAVEL}.

directquery.database.snowflake.username The username for this connection, it will fill the SFSessionProperty.USER entry
directquery.database.snowflake.warehouse The used warehouse for computation
directquery.database.type The type of the used database, for instance clickhouse, databricks, mssql or snowflake
directquery.enabled Set to true to enable DirectQuery feature false
directquery.schema.validation.fail-startup-on-bad-schema Does any schema validation error will halt the server. true
directquery.schema.validation.table-validation-query-timeout Duration that defines the timeout for the validation queries. 1m
directquery.schema.validation.validate-on-startup Should the schema be validated upon startup of the sever. true
directquery.with-adjustments

The Database will also hold the adjustments. {{< notice note >}} Sign-off is not supported with DirectQuery, but you may want to deploy a cluster in which there is:

  • one data node that contains in-memory data and for which Sign-Off features are enabled
  • one DirectQuery data node. If the DirectQuery database is used with additional fields in the tables where adjustments can be present, that flag needs to be set to true.

See Adjustments in the DirectQuery database for more details.{{< /notice >}}

false