atoti_directquery_databricks.ConnectionConfig

final class atoti_directquery_databricks.ConnectionConfig

Config to connect to a Databricks database.

Example

>>> import os
>>> from atoti_directquery_databricks import ConnectionConfig
>>> connection_config = ConnectionConfig(
...     url="jdbc:databricks://"
...     + os.environ["DATABRICKS_SERVER_HOSTNAME"]
...     + "/default;"
...     + "transportMode=http;"
...     + "ssl=1;"
...     + "httpPath="
...     + os.environ["DATABRICKS_HTTP_PATH_SQL_WAREHOUSE"]
...     + ";"
...     + "AuthMech=3;"
...     + "UID=token;",
...     password=os.environ["DATABRICKS_AUTH_TOKEN"],
... )
>>> external_database = session.connect_to_external_database(connection_config)
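The chain of string concatenations above can also be written as a single f-string, which makes the individual JDBC parameters easier to read. A minimal sketch, using hypothetical placeholder values in place of the real environment variables:

```python
import os

# Hypothetical placeholder values standing in for the real environment
# variables; substitute your workspace's hostname and SQL warehouse HTTP path.
os.environ.setdefault("DATABRICKS_SERVER_HOSTNAME", "dbc-example.cloud.databricks.com")
os.environ.setdefault("DATABRICKS_HTTP_PATH_SQL_WAREHOUSE", "/sql/1.0/warehouses/abc123")

# The same URL as in the example above, assembled with an f-string.
url = (
    "jdbc:databricks://"
    f"{os.environ['DATABRICKS_SERVER_HOSTNAME']}/default;"
    "transportMode=http;"
    "ssl=1;"
    f"httpPath={os.environ['DATABRICKS_HTTP_PATH_SQL_WAREHOUSE']};"
    "AuthMech=3;"
    "UID=token;"
)
print(url)
```

Either form produces the same connection string; the f-string variant simply keeps each `key=value;` parameter on its own line.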
array_long_agg_function_name: str | None = None

The name (if different from the default) of the UDAF performing atoti.agg.long() on native arrays.

Deprecated since version 0.9.14: Databricks does not recommend the Spark UDAF API; use array conversion instead.

array_short_agg_function_name: str | None = None

The name (if different from the default) of the UDAF performing atoti.agg.short() on native arrays.

Deprecated since version 0.9.14: Databricks does not recommend the Spark UDAF API; use array conversion instead.

array_sum_agg_function_name: str | None = None

The name (if different from the default) of the UDAF performing atoti.agg.sum() on native arrays.

Deprecated since version 0.9.14: Databricks does not recommend the Spark UDAF API; use array conversion instead.

array_sum_product_agg_function_name: str | None = None

The name (if different from the default) of the UDAF performing atoti.agg.sum_product() on native arrays.

Deprecated since version 0.9.14: Databricks does not recommend the Spark UDAF API; use array conversion instead.

auto_multi_column_array_conversion: AutoMultiColumnArrayConversion | None = None

When not None, multi-column array conversion will be performed automatically.

column_clustered_queries: 'all' | 'feeding' = 'feeding'

Controls which queries will use clustering columns.

feeding_query_timeout: Duration = datetime.timedelta(seconds=3600)

Timeout for queries performed on the external database during feeding phases.

The feeding phases are the initial load performed when connecting to the external database and any subsequent data refreshes.

feeding_url: str | None = None

When not None, this JDBC connection string will be used instead of url for the feeding phases.
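For instance, feeding queries could be routed to a larger SQL warehouse with a longer timeout. A sketch in the style of the example above; the second HTTP path environment variable is hypothetical, and the elided URL parameters are the same as in the connection example:

>>> from datetime import timedelta
>>> tuned_config = ConnectionConfig(
...     url="jdbc:databricks://...",  # regular warehouse
...     feeding_url="jdbc:databricks://...",  # larger warehouse, feeding only
...     feeding_query_timeout=timedelta(hours=2),
...     password=os.environ["DATABRICKS_AUTH_TOKEN"],
... )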

lookup_mode: 'allow' | 'warn' | 'deny' = 'warn'

Whether lookup queries on the external database are allowed.

Lookup can be very slow and expensive as the database may not enforce primary keys.

max_sub_queries: Annotated[int, Field(gt=0)] = 500

Maximum number of sub-queries performed when splitting a query into multi-step queries.

password: str | None = None

The password to connect to the database.

Passing it in this separate attribute prevents it from being logged alongside the connection string.

If None, a password is expected to be present in url.

query_timeout: Duration = datetime.timedelta(seconds=300)

Timeout for queries performed on the external database outside feeding phases.

time_travel: Literal[False, 'lax', 'strict'] = 'strict'

How to use Databricks’ time travel feature.

Databricks does not support time travel with views, so the options are:

  • False: tables and views are queried on the latest state of the database.

  • "lax": tables are queried with time travel but views are queried without it.

  • "strict": tables are queried with time travel and querying a view raises an error.
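These behaviors are all regular ConnectionConfig fields, so a config that queries views without time travel and forbids lookup queries could be built as follows (a sketch reusing the url and token from the connection example above):

>>> relaxed_config = ConnectionConfig(
...     url=connection_config.url,
...     time_travel="lax",
...     lookup_mode="deny",
...     password=os.environ["DATABRICKS_AUTH_TOKEN"],
... )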

url: str

The JDBC connection string.