Aggregate Cache
Introduction
The aggregate cache stores the results of computed aggregates so that subsequent queries that need the same aggregate can be served without recomputing it. It sits at the top of the query engine: when a query is being planned, the cache is the very first thing consulted, before the aggregate provider and before any other retrieval source. This is what makes it different from, and complementary to, the aggregate provider:
- The aggregate provider is a retrieval source. It pre-computes aggregates at feeding time at a chosen granularity, and it is queried during the retrieval phase as one possible source of data.
- The aggregate cache is a higher-level layer that short-circuits the entire retrieval graph for an aggregate that has already been computed. It is populated lazily as queries arrive, and entries are evicted according to a least-recently-used (LRU) policy.
When both are configured, the cache is checked first; on a miss, the regular retrieval graph runs.
How it works
Cache key
Each cache entry is uniquely identified by:
- The query location.
- The measure being computed.
- The active cube filter.
- The relevant context values for the query.
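To make the composite key concrete, here is a minimal sketch of a value type with the four components listed above. The `CacheKeySketch.Key` record and its field types are hypothetical and chosen for illustration; they are not Atoti's internal key class.

```java
import java.util.List;
import java.util.Map;

final class CacheKeySketch {

    // Hypothetical key type: the four fields mirror the components listed above.
    // It is NOT Atoti's internal key class.
    record Key(List<String> location, String measure, String cubeFilter,
               Map<String, String> contextValues) {
        Key {
            // Immutable copies keep equals/hashCode stable once the key is stored.
            location = List.copyOf(location);
            contextValues = Map.copyOf(contextValues);
        }
    }

    public static void main(String[] args) {
        Key a = new Key(List.of("Year=2025"), "pnl.SUM", "no filter", Map.of("currency", "EUR"));
        Key b = new Key(List.of("Year=2025"), "pnl.SUM", "no filter", Map.of("currency", "EUR"));
        Key c = new Key(List.of("Year=2025"), "pnl.SUM", "no filter", Map.of("currency", "USD"));

        System.out.println(a.equals(b)); // true: all four components match
        System.out.println(a.equals(c)); // false: one context value differs
    }
}
```

Two requests share a cache entry only when every component is equal, which is why the same location and measure queried under a different filter or context value is a separate entry.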
Cache size unit
While the key has several components, the size of the cache is counted in (location, measure)
pairs, not in number of entries and not in bytes. The cache stores one entry per
(location, measure) request, but a location may be a wildcard or range pattern that expands to
several point locations. The weight of an entry is the number of point locations its expansion
covers, not 1. The cache capacity therefore bounds the total number of point
(location, measure) pairs cached across all entries.
For example, with a cache configured at withSize(1000):
- A query computing one measure at a single point location (such as Year = 2025) stores one entry weighing 1 slot.
- A query returning TotalSales and TotalQuantity for Year = 2025 stores two entries weighing 1 slot each, for a total of 2 slots.
- A query expanding Day = * under Year = 2025 for one measure stores a single entry whose expansion covers 365 point locations, so it weighs 365 slots: more than a third of the cache from one query.
Once the 1000 slots are full, the least recently used entries are evicted to make room for new ones.
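The slot accounting above can be sketched with a small weighted LRU structure. This is an illustration of the counting rule only, built on `java.util.LinkedHashMap` in access order; the class `WeightedLruSketch`, its method names, the string keys, and the choice to skip entries heavier than the whole cache are all assumptions, not Atoti's implementation.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

final class WeightedLruSketch {
    private final int capacity; // total number of point (location, measure) slots
    private int usedSlots;
    // accessOrder = true: iteration goes from least to most recently used.
    private final LinkedHashMap<String, Integer> weights = new LinkedHashMap<>(16, 0.75f, true);

    WeightedLruSketch(int capacity) {
        this.capacity = capacity;
    }

    /** Stores an entry of the given weight, evicting LRU entries until it fits. */
    void put(String entry, int weight) {
        Integer previous = weights.remove(entry);
        if (previous != null) {
            usedSlots -= previous;
        }
        if (weight > capacity) {
            return; // assumption: an entry heavier than the whole cache is not stored
        }
        Iterator<Map.Entry<String, Integer>> lru = weights.entrySet().iterator();
        while (usedSlots + weight > capacity && lru.hasNext()) {
            usedSlots -= lru.next().getValue();
            lru.remove();
        }
        weights.put(entry, weight);
        usedSlots += weight;
    }

    /** Looks an entry up and marks it as most recently used. */
    boolean hit(String entry) {
        return weights.get(entry) != null;
    }

    /** Checks presence without changing the LRU order. */
    boolean contains(String entry) {
        return weights.containsKey(entry);
    }

    int usedSlots() {
        return usedSlots;
    }

    public static void main(String[] args) {
        WeightedLruSketch cache = new WeightedLruSketch(1000);
        cache.put("Year=2025 | contributors.COUNT", 1); // one point location
        cache.put("Year=2025 | TotalSales", 1);         // two measures at one location:
        cache.put("Year=2025 | TotalQuantity", 1);      // two entries of 1 slot each
        cache.put("Year=2025, Day=* | TotalSales", 365); // expands to 365 point locations
        System.out.println(cache.usedSlots()); // 1 + 2 + 365 = 368
    }
}
```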
Cache hits and misses
During the planning phase of a query, Atoti looks up each required (location, measure) pair in
the cache:
- On a hit, the cached value is returned and the corresponding leg of the retrieval graph is short-circuited. In the query execution plan, this appears as a PrimitiveAggregatesCacheRetrieval (or PostProcessedCacheRetrieval for post-processors).
- On a miss, the aggregate is computed by the regular retrieval chain (aggregate provider, JIT provider, datastore, …) and the result is inserted into the cache. When the cache is full, the least recently used entry is evicted to make room.
Concurrent computation sharing
When several queries running at the same time need the same (location, measure) pair, the cache
ensures the aggregate is computed only once: the first query performs the computation and the
others wait for its result. This avoids duplicate work when the same dashboard is opened by many
users at once, or when a query fans out into sub-queries that overlap.
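One common way to obtain this "compute once, let the others wait" behavior is to register an in-flight future per key. The sketch below uses only `java.util.concurrent`; the class `ComputationSharingSketch`, the string key, and the `demo` helper are hypothetical, and the actual mechanism inside Atoti may differ. The sketch does not retain results once the computation finishes, so it illustrates sharing alone.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class ComputationSharingSketch {
    // One future per (location, measure) key that is currently being computed.
    private final ConcurrentHashMap<String, CompletableFuture<Double>> inFlight =
            new ConcurrentHashMap<>();

    double getOrCompute(String key, Supplier<Double> computation) {
        CompletableFuture<Double> mine = new CompletableFuture<>();
        CompletableFuture<Double> existing = inFlight.putIfAbsent(key, mine);
        if (existing != null) {
            return existing.join(); // another query is computing: wait for its result
        }
        try {
            double value = computation.get(); // this query won the race and computes
            mine.complete(value);
            return value;
        } catch (RuntimeException | Error e) {
            mine.completeExceptionally(e); // waiters fail too instead of hanging
            throw e;
        } finally {
            inFlight.remove(key, mine); // nothing is retained after completion
        }
    }

    /** Runs the given number of simultaneous identical queries; returns how many computed. */
    static int demo(int queries) {
        ComputationSharingSketch sharing = new ComputationSharingSketch();
        AtomicInteger computations = new AtomicInteger();
        CountDownLatch ready = new CountDownLatch(queries);
        CountDownLatch start = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(queries);
        try {
            List<Future<Double>> results = new ArrayList<>();
            for (int i = 0; i < queries; i++) {
                results.add(pool.submit(() -> {
                    ready.countDown();
                    start.await(); // release all queries at the same instant
                    return sharing.getOrCompute("Year=2025 | pnl.SUM", () -> {
                        computations.incrementAndGet();
                        pause(300); // simulate an expensive aggregation
                        return 42.0;
                    });
                }));
            }
            ready.await();
            start.countDown();
            for (Future<Double> result : results) {
                result.get();
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
        return computations.get();
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("computations performed: " + demo(8));
    }
}
```

In this sketch, queries that arrive while the computation is in flight reuse its result, while a query arriving after completion recomputes.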
Cache lifecycle
The cache is built when the cube starts, from the withAggregatesCache() description. It then goes
through the following states:
- Active: the internal map is initialized with the configured capacity. Stream listeners are attached lazily, the first time a measure that depends on a given stream is queried.
- Invalidated: on every transaction commit (or any other event that changes the data visible to queries), the cache is invalidated: the internal map is replaced by a fresh one and an internal version counter is incremented. The version is part of the cache key, so any entry computed against a previous version is automatically ignored.
- Discarded: if the capacity is set to a strictly negative value (via disabled(), withSize(-1) or setCapacity(-1) at runtime), the cache stops storing anything, releases its internal structures, and unsubscribes from all streams. Calls to the cache become no-ops.
Capacity changes at runtime go through the same path: a new capacity triggers an invalidation of the existing entries, and a negative capacity triggers a discard.
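The three states can be modeled with a version counter and a replaceable map. The following is a simplified sketch of the transitions described above, with hypothetical names (`CacheLifecycleSketch`, string keys, `Double` values); it omits LRU eviction and stream listeners and is not Atoti's implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

final class CacheLifecycleSketch {
    private final AtomicLong version = new AtomicLong();
    private volatile Map<String, Double> entries; // null once the cache is discarded
    private volatile int capacity;

    CacheLifecycleSketch(int capacity) {
        setCapacity(capacity);
    }

    /** Invalidated: bump the version and replace the map with a fresh one. */
    void invalidate() {
        version.incrementAndGet();
        if (entries != null) {
            entries = new ConcurrentHashMap<>();
        }
    }

    /** A new capacity invalidates existing entries; a negative one discards the cache. */
    void setCapacity(int newCapacity) {
        capacity = newCapacity;
        if (newCapacity < 0) {
            entries = null; // Discarded: release the internal structure
            return;
        }
        version.incrementAndGet();
        entries = new ConcurrentHashMap<>();
    }

    void put(String key, double value) {
        Map<String, Double> map = entries;
        if (map != null && capacity > 0) { // no-op when discarded or when size is 0
            map.put(version.get() + "|" + key, value); // version is part of the key
        }
    }

    Double get(String key) {
        Map<String, Double> map = entries;
        return map == null ? null : map.get(version.get() + "|" + key);
    }

    boolean isDiscarded() {
        return entries == null;
    }

    public static void main(String[] args) {
        CacheLifecycleSketch cache = new CacheLifecycleSketch(100);
        cache.put("Year=2025 | pnl.SUM", 1.5);
        System.out.println(cache.get("Year=2025 | pnl.SUM")); // 1.5

        cache.invalidate(); // e.g. after a transaction commit
        System.out.println(cache.get("Year=2025 | pnl.SUM")); // null

        cache.setCapacity(-1); // discard at runtime
        cache.put("Year=2025 | pnl.SUM", 1.5); // no-op
        System.out.println(cache.isDiscarded()); // true
    }
}
```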
Enabling and configuring the cache
The aggregate cache is configured per cube using the withAggregatesCache() section of the
StartBuilding.cube() fluent builder. The following example configures a cache with a fixed
capacity and restricts caching to two specific measures.
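```java
StartBuilding.cube()
    .withName("MyCube")
    .withMeasures(measures)
    .withDimensions(dimensions)
    .withAggregateProvider()
        .jit()
    // Aggregate cache section: capacity is counted in (location, measure) pairs.
    .withAggregatesCache()
        .withSize(10_000)
        // Include mode: only these two measures are cached.
        .cachingOnlyMeasures("contributors.COUNT", "pnl.SUM")
    .build();
```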
Available builder options
| Option | Effect |
|---|---|
| .withSize(int) | Sets the maximum number of (location, measure) pairs the cache can hold. |
| .cachingOnlyMeasures(String...) | Restricts caching to the listed measures (include mode). |
| .cachingAllMeasuresBut(String...) | Caches every measure except the listed ones (exclude mode). |
| .cachingOnlyBranches(String...) | Restricts caching to the listed data versioning branches. |
| .disabled() | Shorthand for withSize(-1). Discards the cache entirely. |
Size semantics
The value passed to withSize(int) controls both the capacity and the operating mode of the cache:
| Value of size | Storage of results | Concurrent computation sharing |
|---|---|---|
| Strictly negative | disabled (discarded) | disabled |
| 0 | disabled | enabled |
| Strictly positive | enabled with that capacity | enabled |
A size of 0 is the way to keep computation sharing between concurrent identical queries without
retaining any result for later queries.
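The table above reduces to a three-way decision on the sign of the size. A minimal sketch, using a hypothetical `Mode` enum and helper names that are not part of the Atoti API:

```java
final class SizeSemanticsSketch {

    // Hypothetical names for the three rows of the table above.
    enum Mode { DISCARDED, SHARING_ONLY, CACHING }

    static Mode modeFor(int size) {
        if (size < 0) {
            return Mode.DISCARDED;    // no storage, no computation sharing
        }
        if (size == 0) {
            return Mode.SHARING_ONLY; // computation sharing without storage
        }
        return Mode.CACHING;          // storage up to `size` slots, plus sharing
    }

    static boolean storesResults(int size) {
        return modeFor(size) == Mode.CACHING;
    }

    static boolean sharesComputation(int size) {
        return modeFor(size) != Mode.DISCARDED;
    }

    public static void main(String[] args) {
        for (int size : new int[] {-1, 0, 10_000}) {
            System.out.println(size + " -> " + modeFor(size)
                    + ", stores=" + storesResults(size)
                    + ", shares=" + sharesComputation(size));
        }
    }
}
```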
Because the size counts (location, measure) pairs, the actual memory footprint of the cache
depends on the size of each cached aggregate. Measures returning vectors or arrays consume
significantly more memory per entry than scalar measures.
Monitoring
The aggregate cache is exposed as a monitored component, so its state and effectiveness can be inspected through the standard JMX/management interfaces.
Attributes (current state):
- NbOfEntries: the current number of (location, measure) pairs stored in the cache.
- Capacity: the configured maximum capacity.
Operations (callable at runtime):
- invalidate: clears all entries and bumps the cache version.
- setCapacity(int): resizes the cache; passing a negative value discards it.
Statistics (cumulative since last reset):
- Cache hits: total number of hits across all keys.
- Cache misses: total number of misses.
Combined with the query execution plan, where
PrimitiveAggregatesCacheRetrieval and PostProcessedCacheRetrieval indicate cache hits, these
counters provide visibility into how effective the cache is for a given workload.
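A simple way to summarize the two counters is the hit ratio, hits / (hits + misses). The helper below is a sketch; `HitRatioSketch` is a hypothetical name, and the counter values in `main` are made-up sample inputs standing in for values read from the monitoring interface.

```java
final class HitRatioSketch {

    /** Returns hits / (hits + misses), or 0.0 when nothing has been looked up yet. */
    static double hitRatio(long hits, long misses) {
        long lookups = hits + misses;
        return lookups == 0 ? 0.0 : (double) hits / lookups;
    }

    public static void main(String[] args) {
        // Sample values for illustration only, not measurements.
        long hits = 750;
        long misses = 250;
        System.out.println(hitRatio(hits, misses)); // 0.75
    }
}
```

Because the statistics are cumulative since the last reset, comparing the ratio over two sampling points (difference of hits over difference of lookups) gives a better picture of the current workload than the lifetime value.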
Performance considerations
When the cache helps
The aggregate cache delivers the largest gains when:
- Queries repeatedly request the same (location, measure) pairs, for example dashboards refreshed by many users, drill paths users navigate up and down, or scheduled reports.
- The retrieval source for those aggregates is expensive: a JIT provider scanning large amounts of data, a DirectQuery call to an external database, or a complex post-processor.
- Data changes infrequently relative to query volume, so cache entries live long enough to be reused before being invalidated.
When the cache helps less
The cache has limited or no benefit when:
- Queries hit a different (location, measure) pair every time, so entries are evicted before being reused.
- The application performs many transactions, each of which invalidates the cache.
Sizing the cache
The cache size should reflect the working set of (location, measure) pairs expected to be reused.
Useful inputs:
- The number of distinct queries observed during a typical user session or dashboard refresh.
- The number of measures listed in cachingOnlyMeasures(...) (or all measures if no list is set).
- The available heap budget, taking into account that vector measures consume disproportionately more memory per entry.
A reasonable starting point is a size large enough to hold the working set of one or two representative workloads. Adjust based on observed hit ratios.
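As a rough starting estimate, the inputs above can be multiplied together. The formula below is an assumption made for illustration, not an official sizing rule: distinct queries, times the average number of point locations each query expands to, times the number of cached measures, times a headroom factor. All names and sample numbers are hypothetical.

```java
final class CacheSizingSketch {

    /**
     * Rough slot estimate (an assumption, not an official formula):
     * queries x point locations per query x cached measures x headroom.
     */
    static long estimateSlots(long distinctQueries, long avgPointLocationsPerQuery,
                              long cachedMeasures, double headroom) {
        return (long) Math.ceil(
                distinctQueries * avgPointLocationsPerQuery * cachedMeasures * headroom);
    }

    public static void main(String[] args) {
        // Sample inputs: 40 dashboard queries, about 50 point locations each,
        // 2 cached measures, 50% headroom.
        long slots = estimateSlots(40, 50, 2, 1.5);
        System.out.println(slots); // 40 * 50 * 2 * 1.5 = 6000
    }
}
```

The result is a candidate value for withSize(...); the observed hit ratio and heap usage remain the signals to adjust it.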
Distributed cubes
In a distributed cluster, each data cube maintains its own aggregate cache. As described in Distributed Post-Processors, distributed prefetchers can be used to pre-compute aggregates on the data cubes before sending them to the query cube. This way, the most popular intermediate results are served from the data cubes' caches instead of being recomputed on every query.
The aggregate cache can also be configured on the query cube itself, in which case it caches the results assembled from the data cubes' contributions.
Best practices
- Enable the cache only when measurements show repeated (location, measure) accesses. A disabled cache (the default) costs nothing; an oversized cache wastes heap.
- Restrict caching to expensive measures with cachingOnlyMeasures(...). Caching cheap aggregates pollutes the cache and pushes useful entries out via LRU eviction.
- Be especially careful with vector measures. Their entries are large; either exclude them with cachingAllMeasuresBut(...), or size the cache conservatively.
- Use withSize(0) to share computation between concurrent identical queries without retaining the result, for example in write-heavy workloads where any stored result would be invalidated almost immediately.
- Restrict to active branches with cachingOnlyBranches(...) when only a subset of branches is queried interactively.