Data loading is the process of inserting external data into Atoti’s in-memory datastore.
The datastore enables fast, real-time analytics by keeping data readily accessible for queries and computations.
Atoti DirectQuery is an alternative approach that connects to external databases without loading all the data into memory first.

What is data loading in Atoti?

Data loading is the process of inserting external data into Atoti’s in-memory datastore.
The datastore keeps data in memory to support fast analytical queries and computations.
Loading happens at the end of an extraction and transformation process.

Which data sources can connect to Atoti?

Atoti supports a wide range of external data sources:
  • Flat files (e.g., CSV, Parquet)
  • Relational databases via JDBC (e.g., PostgreSQL, Oracle, SQL Server)
  • Data warehouses (e.g., BigQuery, Snowflake)
  • Messaging systems (e.g., Kafka, JMS)
  • Custom systems: bespoke APIs or data platforms

How is data loaded?

Loading is the final step in the data pipeline where transformed data is inserted into the datastore. Atoti uses a transactional model to ensure consistency and isolation during this process.

What are datastore transactions?

A datastore transaction is a sequence of operations on the datastore that is executed as a single, atomic unit: either all the operations in the transaction succeed, or none of them do. This ensures data consistency and integrity. Queries to Atoti only see committed transactions.
Find out more about how to load data from these sources with Atoti Java SDK.
Find out more about how to load data with Atoti Python SDK.
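The all-or-nothing behavior described above can be illustrated with a small toy model. This is a sketch in plain Python, not the Atoti API: the `ToyDatastore` class, the `load` helper, and the `amount` field are hypothetical names chosen for the illustration, and the validation rule is an assumption made only so that one batch can fail.

```python
class ToyDatastore:
    """Toy in-memory store with all-or-nothing transactions (not the Atoti API)."""

    def __init__(self):
        self._committed = {}  # rows visible to queries, keyed by primary key
        self._pending = None  # working copy while a transaction is open

    def begin(self):
        if self._pending is not None:
            raise RuntimeError("a transaction is already open")
        # Work on a copy so committed data stays untouched until commit.
        self._pending = dict(self._committed)

    def insert(self, key, row):
        if self._pending is None:
            raise RuntimeError("no open transaction")
        # Hypothetical validation rule, used here to make a batch fail.
        if not isinstance(row.get("amount"), (int, float)):
            raise ValueError(f"invalid row for key {key!r}")
        self._pending[key] = row

    def commit(self):
        if self._pending is None:
            raise RuntimeError("no open transaction")
        self._committed, self._pending = self._pending, None

    def rollback(self):
        self._pending = None

    def query_total(self):
        # Queries read committed data only.
        return sum(row["amount"] for row in self._committed.values())


def load(store, rows):
    """Load rows atomically: commit all of them or none."""
    store.begin()
    try:
        for key, row in rows:
            store.insert(key, row)
    except Exception:
        store.rollback()
        raise
    store.commit()


store = ToyDatastore()
load(store, [("t1", {"amount": 10.0}), ("t2", {"amount": 5.0})])

try:
    # The second row is invalid, so the whole batch is rolled back.
    load(store, [("t3", {"amount": 7.0}), ("t4", {"amount": "bad"})])
except ValueError:
    pass

total = store.query_total()  # only the first batch contributes
```

In this sketch the failed batch leaves no trace: `t3` was valid on its own, but it is discarded together with `t4` because both belong to the same transaction.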

What is DirectQuery?

DirectQuery connects Atoti to an external database without loading the data into the in-memory datastore.
Atoti delegates queries to the external system.
This reduces memory use and enables access to large or frequently updated datasets.
DirectQuery is suited to enterprise databases such as Snowflake, BigQuery, or ClickHouse.

How does DirectQuery manage data updates and versioning?

DirectQuery relies on the external database for versioning.
When the database supports native time travel, DirectQuery uses it automatically.
When it does not, time travel can be emulated through versioning columns.
Two refresh strategies are available:
  • Incremental refresh
  • Full refresh
These strategies ensure that the cube stays aligned with external changes.
Find out more about DirectQuery.
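One common way to emulate time travel with versioning columns can be sketched as follows. This is an illustration under stated assumptions, not a description of how DirectQuery implements it: SQLite stands in for the external database, and the `trades` table, the `valid_from` and `valid_to` columns, and the `total_as_of` helper are hypothetical names. Each row records the version at which it became valid and the version at which it was superseded (NULL while it is still current).

```python
import sqlite3

# SQLite stands in for the external database in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trades ("
    "id TEXT, amount REAL, valid_from INTEGER, valid_to INTEGER)"
)
conn.executemany(
    "INSERT INTO trades VALUES (?, ?, ?, ?)",
    [
        ("t1", 10.0, 1, 3),     # t1 as it existed for versions 1 and 2
        ("t1", 12.0, 3, None),  # t1 updated at version 3, still current
        ("t2", 5.0, 2, None),   # t2 added at version 2, still current
    ],
)


def total_as_of(conn, version):
    """Sum the amounts of the rows that were valid at the given version."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM trades "
        "WHERE valid_from <= ? AND (valid_to IS NULL OR valid_to > ?)",
        (version, version),
    ).fetchone()
    return row[0]


totals = {version: total_as_of(conn, version) for version in (1, 2, 3)}
```

Because old rows are closed rather than overwritten, a query pinned to version 2 keeps returning the same result after the version 3 update lands, which is the property that time travel provides.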