> ## Documentation Index
> Fetch the complete documentation index at: https://docs.activeviam.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Aggregate Cache

> The aggregate cache in Atoti Server stores computed (location, measure) pairs above the retrieval graph, short-circuiting repeated queries without recomputation; configured per cube with withAggregatesCache() with options for measure filtering, branch filtering, and runtime capacity adjustment via JMX

## Introduction

The **aggregate cache** stores the results of computed aggregates so that subsequent queries that
need the same aggregate can be served without recomputing it. It sits at the **top of the query
engine**: when a query is being planned, the cache is the very first thing consulted, before the
[aggregate provider](./providers/aggregate_provider) and before any other retrieval source. This is what
makes it different from, and complementary to, the aggregate provider:

* The **aggregate provider** is a retrieval source. It pre-computes aggregates at feeding time at a
  chosen granularity, and it is queried during the retrieval phase as one possible source of data.
* The **aggregate cache** is a higher-level layer that short-circuits the entire retrieval graph
  for an aggregate that has already been computed. It is populated lazily as queries arrive, and
  entries are evicted according to a least-recently-used (LRU) policy.

When both are configured, the cache is checked first; on a miss, the regular retrieval graph runs.

## How it works

### Cache key

Each cache entry is uniquely identified by:

1. The query **location**.
2. The **measure** being computed.
3. The active [cube filter](./filtering).
4. The relevant [context values](./context_values) for the query.

### Cache size unit

While the key has several components, the **size of the cache is counted in `(location, measure)`
pairs**, not in number of entries and not in bytes. The cache stores one entry per
`(location, measure)` request, but a location may be a wildcard or range pattern that expands to
several **point locations**. The weight of an entry is the number of point locations its expansion
covers, not 1. The cache capacity therefore bounds the total number of point
`(location, measure)` pairs cached across all entries.

For example, with a cache configured at `withSize(1000)`:

* A query computing one measure at a single point location (such as `Year = 2025`) stores one entry
  weighting 1 slot.
* A query returning `TotalSales` and `TotalQuantity` for `Year = 2025` stores two entries
  weighting 1 slot each, for a total of 2 slots.
* A query expanding `Day = *` under `Year = 2025` for one measure stores a single entry whose
  expansion covers 365 point locations, so it weights 365 slots, close to half the cache from one
  query.

Once the 1000 slots are full, the least recently used entries are evicted to make room for new
ones.

### Cache hits and misses

During the planning phase of a query, Atoti looks up each required `(location, measure)` pair in
the cache:

* On a **hit**, the cached value is returned and the corresponding leg of the retrieval graph is
  short-circuited. In the [query execution plan](../monitoring/query_execution_plan), this
  appears as a `PrimitiveAggregatesCacheRetrieval` (or `PostProcessedCacheRetrieval` for
  post-processors).
* On a **miss**, the aggregate is computed by the regular retrieval chain (aggregate provider, JIT
  provider, datastore, …) and the result is inserted into the cache. When the cache is full, the
  least recently used entry is evicted to make room.

### Concurrent computation sharing

When several queries running at the same time need the same `(location, measure)` pair, the cache
ensures the aggregate is computed only once: the first query performs the computation and the
others wait for its result. This avoids duplicate work when the same dashboard is opened by many
users at once, or when a query fans out into sub-queries that overlap.

### Cache lifecycle

The cache is built when the cube starts, from the `withAggregatesCache()` description. It then goes
through the following states:

* **Active**: the internal map is initialized with the configured capacity. Stream listeners
  are attached lazily, the first time a measure that depends on a given stream is queried.
* **Invalidated**: on every transaction commit (or any other event that changes the data
  visible to queries), the cache is invalidated: the internal map is replaced by a fresh one and an
  internal **version counter** is incremented. The version is part of the cache key, so any entry
  computed against a previous version is automatically ignored.
* **Discarded**: if the capacity is set to a strictly negative value (via `disabled()`,
  `withSize(-1)` or `setCapacity(-1)` at runtime), the cache stops storing anything, releases its
  internal structures, and unsubscribes from all streams. Calls to the cache become no-ops.

Capacity changes at runtime go through the same path: a new capacity triggers an invalidation of
the existing entries, and a negative capacity triggers a discard.

## Enabling and configuring the cache

The aggregate cache is configured per cube using the `withAggregatesCache()` section of the
`StartBuilding.cube()` fluent builder. The following example configures a cache with a fixed
capacity and restricts caching to two specific measures.

```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
StartBuilding.cube()
    .withName("MyCube")
    .withMeasures(measures)
    .withDimensions(dimensions)
    .withAggregateProvider()
    .jit()
    .withAggregatesCache()
    .withSize(10_000)
    .cachingOnlyMeasures("contributors.COUNT", "pnl.SUM")
    .build();
```

### Available builder options

| Option                              | Effect                                                                                   |
| ----------------------------------- | ---------------------------------------------------------------------------------------- |
| `.withSize(int)`                    | Sets the maximum number of `(location, measure)` pairs the cache can hold.               |
| `.cachingOnlyMeasures(String...)`   | Restricts caching to the listed measures (include mode).                                 |
| `.cachingAllMeasuresBut(String...)` | Caches every measure except the listed ones (exclude mode).                              |
| `.cachingOnlyBranches(String...)`   | Restricts caching to the listed [data versioning](../concepts/data_versioning) branches. |
| `.disabled()`                       | Shorthand for `withSize(-1)`. Discards the cache entirely.                               |

### Size semantics

The value passed to `withSize(int)` controls both the capacity and the operating mode of the cache:

| Value of `size`   | Storage of results         | Concurrent computation sharing |
| ----------------- | -------------------------- | ------------------------------ |
| Strictly negative | disabled (discarded)       | disabled                       |
| `0`               | disabled                   | enabled                        |
| Strictly positive | enabled with that capacity | enabled                        |

A size of `0` is the way to keep computation sharing between concurrent identical queries without
retaining any result for later queries.

<Note>
  Because the size counts `(location, measure)` pairs, the actual memory footprint of the cache
  depends on the size of each cached aggregate. Measures returning vectors or arrays consume
  significantly more memory per entry than scalar measures.
</Note>

## Monitoring

The aggregate cache is exposed as a monitored component, so its state and effectiveness can be
inspected through the standard JMX/management interfaces.

**Attributes** (current state):

* `NbOfEntries`: the current number of `(location, measure)` pairs stored in the cache.
* `Capacity`: the configured maximum capacity.

**Operations** (callable at runtime):

* `invalidate`: clears all entries and bumps the cache version.
* `setCapacity(int)`: resizes the cache; passing a negative value discards it.

**Statistics** (cumulative since last reset):

* `Cache hits`: total number of hits across all keys.
* `Cache misses`: total number of misses.

Combined with the [query execution plan](../monitoring/query_execution_plan), where
`PrimitiveAggregatesCacheRetrieval` and `PostProcessedCacheRetrieval` indicate cache hits, these
counters provide visibility into how effective the cache is for a given workload.

## Performance considerations

### When the cache helps

The aggregate cache delivers the largest gains when:

* Queries repeatedly request the same `(location, measure)` pairs, for example dashboards
  refreshed by many users, drill paths users navigate up and down, or scheduled reports.
* The retrieval source for those aggregates is expensive: a JIT provider scanning large amounts of
  data, a DirectQuery call to an external database, or a complex post-processor.
* Data changes infrequently relative to query volume, so cache entries live long enough to be
  reused before being invalidated.

### When the cache helps less

The cache has limited or no benefit when:

* Queries hit a different `(location, measure)` pair every time, so entries are evicted before
  being reused.
* The application performs many transactions, each of which invalidates the cache.

### Sizing the cache

The cache size should reflect the working set of `(location, measure)` pairs expected to be reused.
Useful inputs:

* The number of distinct queries observed during a typical user session or dashboard refresh.
* The number of measures listed in `cachingOnlyMeasures(...)` (or all measures if no list is set).
* The available heap budget, taking into account that vector measures consume disproportionately
  more memory per entry.

A reasonable starting point is a size large enough to hold the working set of one or two
representative workloads. Adjust based on observed hit ratios.

## Distributed cubes

In a [distributed cluster](../distributed/distributed_intro), each data cube maintains its
own aggregate cache. As described in
[Distributed Post-Processors](../distributed/distributed_post_processors), distributed
prefetchers can be used to pre-compute aggregates on the data cubes before sending them to the
query cube. This way, the most popular intermediate results are served from the data cubes' caches
instead of being recomputed on every query.

The aggregate cache can also be configured on the query cube itself, in which case it caches the
results assembled from the data cubes' contributions.

## Best practices

* **Enable the cache only when measurements show repeated `(location, measure)` accesses.** A
  disabled cache (the default) costs nothing; an oversized cache wastes heap.
* **Restrict caching to expensive measures** with `cachingOnlyMeasures(...)`. Caching cheap
  aggregates pollutes the cache and pushes useful entries out via LRU eviction.
* **Be especially careful with vector measures.** Their entries are large; either exclude them
  with `cachingAllMeasuresBut(...)`, or size the cache conservatively.
* **Use `withSize(0)`** to share computation between concurrent identical queries without retaining
  the result. For example, in write-heavy workloads where any stored result would be
  invalidated almost immediately.
* **Restrict to active branches** with `cachingOnlyBranches(...)` when only a subset of branches
  is queried interactively.