Aggregate Cache

Introduction

The aggregate cache stores the results of computed aggregates so that subsequent queries that need the same aggregate can be served without recomputing it. It sits at the top of the query engine: when a query is being planned, the cache is the very first thing consulted, before the aggregate provider and before any other retrieval source. This is what makes it different from, and complementary to, the aggregate provider:

The aggregate provider is a retrieval source. It pre-computes aggregates at feeding time at a chosen granularity, and it is queried during the retrieval phase as one possible source of data.
The aggregate cache is a higher-level layer that short-circuits the entire retrieval graph for an aggregate that has already been computed. It is populated lazily as queries arrive, and entries are evicted according to a least-recently-used (LRU) policy.

When both are configured, the cache is checked first; on a miss, the regular retrieval graph runs.

How it works

Cache key

Each cache entry is uniquely identified by:

The query location.
The measure being computed.
The active cube filter.
The relevant context values for the query.

Cache size unit

While the key has several components, the size of the cache is counted in (location, measure) pairs, not in number of entries and not in bytes. The cache stores one entry per (location, measure) request, but a location may be a wildcard or range pattern that expands to several point locations. The weight of an entry is the number of point locations its expansion covers, not 1. The cache capacity therefore bounds the total number of point (location, measure) pairs cached across all entries.

For example, with a cache configured at withSize(1000):

A query computing one measure at a single point location (such as Year = 2025) stores one entry weighting 1 slot.
A query returning TotalSales and TotalQuantity for Year = 2025 stores two entries weighting 1 slot each, for a total of 2 slots.
A query expanding Day = * under Year = 2025 for one measure stores a single entry whose expansion covers 365 point locations, so it weights 365 slots, close to half the cache from one query.

Once the 1000 slots are full, the least recently used entries are evicted to make room for new ones.

Cache hits and misses

During the planning phase of a query, Atoti looks up each required (location, measure) pair in the cache:

On a hit, the cached value is returned and the corresponding leg of the retrieval graph is short-circuited. In the query execution plan, this appears as a PrimitiveAggregatesCacheRetrieval (or PostProcessedCacheRetrieval for post-processors).
On a miss, the aggregate is computed by the regular retrieval chain (aggregate provider, JIT provider, datastore, …) and the result is inserted into the cache. When the cache is full, the least recently used entry is evicted to make room.

When several queries running at the same time need the same (location, measure) pair, the cache ensures the aggregate is computed only once: the first query performs the computation and the others wait for its result. This avoids duplicate work when the same dashboard is opened by many users at once, or when a query fans out into sub-queries that overlap.

Cache lifecycle

The cache is built when the cube starts, from the withAggregatesCache() description. It then goes through the following states:

Active: the internal map is initialized with the configured capacity. Stream listeners are attached lazily, the first time a measure that depends on a given stream is queried.
Invalidated: on every transaction commit (or any other event that changes the data visible to queries), the cache is invalidated: the internal map is replaced by a fresh one and an internal version counter is incremented. The version is part of the cache key, so any entry computed against a previous version is automatically ignored.
Discarded: if the capacity is set to a strictly negative value (via disabled(), withSize(-1) or setCapacity(-1) at runtime), the cache stops storing anything, releases its internal structures, and unsubscribes from all streams. Calls to the cache become no-ops.

Capacity changes at runtime go through the same path: a new capacity triggers an invalidation of the existing entries, and a negative capacity triggers a discard.

Enabling and configuring the cache

The aggregate cache is configured per cube using the withAggregatesCache() section of the StartBuilding.cube() fluent builder. The following example configures a cache with a fixed capacity and restricts caching to two specific measures.

StartBuilding.cube()
    .withName("MyCube")
    .withMeasures(measures)
    .withDimensions(dimensions)
    .withAggregateProvider()
    .jit()
    .withAggregatesCache()
    .withSize(10_000)
    .cachingOnlyMeasures("contributors.COUNT", "pnl.SUM")
    .build();

Available builder options

Option	Effect
`.withSize(int)`	Sets the maximum number of `(location, measure)` pairs the cache can hold.
`.cachingOnlyMeasures(String...)`	Restricts caching to the listed measures (include mode).
`.cachingAllMeasuresBut(String...)`	Caches every measure except the listed ones (exclude mode).
`.cachingOnlyBranches(String...)`	Restricts caching to the listed data versioning branches.
`.disabled()`	Shorthand for `withSize(-1)`. Discards the cache entirely.

Size semantics

The value passed to withSize(int) controls both the capacity and the operating mode of the cache:

Value of `size`	Storage of results	Concurrent computation sharing
Strictly negative	disabled (discarded)	disabled
`0`	disabled	enabled
Strictly positive	enabled with that capacity	enabled

A size of 0 is the way to keep computation sharing between concurrent identical queries without retaining any result for later queries.

note

Because the size counts (location, measure) pairs, the actual memory footprint of the cache depends on the size of each cached aggregate. Measures returning vectors or arrays consume significantly more memory per entry than scalar measures.

Monitoring

The aggregate cache is exposed as a monitored component, so its state and effectiveness can be inspected through the standard JMX/management interfaces.

Attributes (current state):

NbOfEntries: the current number of (location, measure) pairs stored in the cache.
Capacity: the configured maximum capacity.

Operations (callable at runtime):

invalidate: clears all entries and bumps the cache version.
setCapacity(int): resizes the cache; passing a negative value discards it.

Statistics (cumulative since last reset):

Cache hits: total number of hits across all keys.
Cache misses: total number of misses.

Combined with the query execution plan, where PrimitiveAggregatesCacheRetrieval and PostProcessedCacheRetrieval indicate cache hits, these counters provide visibility into how effective the cache is for a given workload.

Performance considerations

When the cache helps

The aggregate cache delivers the largest gains when:

Queries repeatedly request the same (location, measure) pairs, for example dashboards refreshed by many users, drill paths users navigate up and down, or scheduled reports.
The retrieval source for those aggregates is expensive: a JIT provider scanning large amounts of data, a DirectQuery call to an external database, or a complex post-processor.
Data changes infrequently relative to query volume, so cache entries live long enough to be reused before being invalidated.

When the cache helps less

The cache has limited or no benefit when:

Queries hit a different (location, measure) pair every time, so entries are evicted before being reused.
The application performs many transactions, each of which invalidates the cache.

Sizing the cache

The cache size should reflect the working set of (location, measure) pairs expected to be reused. Useful inputs:

The number of distinct queries observed during a typical user session or dashboard refresh.
The number of measures listed in cachingOnlyMeasures(...) (or all measures if no list is set).
The available heap budget, taking into account that vector measures consume disproportionately more memory per entry.

A reasonable starting point is a size large enough to hold the working set of one or two representative workloads. Adjust based on observed hit ratios.

Distributed cubes

In a distributed cluster, each data cube maintains its own aggregate cache. As described in Distributed Post-Processors, distributed prefetchers can be used to pre-compute aggregates on the data cubes before sending them to the query cube. This way, the most popular intermediate results are served from the data cubes' caches instead of being recomputed on every query.

The aggregate cache can also be configured on the query cube itself, in which case it caches the results assembled from the data cubes' contributions.

Best practices

Enable the cache only when measurements show repeated (location, measure) accesses. A disabled cache (the default) costs nothing; an oversized cache wastes heap.
Restrict caching to expensive measures with cachingOnlyMeasures(...). Caching cheap aggregates pollutes the cache and pushes useful entries out via LRU eviction.
Be especially careful with vector measures. Their entries are large; either exclude them with cachingAllMeasuresBut(...), or size the cache conservatively.
Use withSize(0) to share computation between concurrent identical queries without retaining the result. For example, in write-heavy workloads where any stored result would be invalidated almost immediately.
Restrict to active branches with cachingOnlyBranches(...) when only a subset of branches is queried interactively.

Introduction​

How it works​

Cache key​

Cache size unit​

Cache hits and misses​

Concurrent computation sharing​

Cache lifecycle​

Enabling and configuring the cache​

Available builder options​

Size semantics​

Monitoring​

Performance considerations​

When the cache helps​

When the cache helps less​

Sizing the cache​

Distributed cubes​

Best practices​