Skip to main content

Create and use a DirectQuery local cache

This page describes how to create and use a DirectQuery local cache.

To learn what is a DirectQuery local cache, check the DirectQuery Local Cache page.

warning

This feature is available since Atoti 6.1.9 and is an experimental feature. To enable the feature, set the System Property activeviam.feature.experimental.directquery_local_cache.enabled to true.

How to define a local cache

Let's first define a schema with 2 tables: the sales table and a side table made of products.

final TableDescription salesTable =
discoverer.discoverTable(new ClickhouseTableId("TUTORIAL", "SALES"));
final TableDescription productsTable =
discoverer.discoverTable(new ClickhouseTableId("TUTORIAL", "PRODUCTS")).toBuilder()
.keyFieldNames(List.of("PRODUCT_ID", "CATEGORY"))
.build();
SchemaDescription schema =
SchemaDescription.builder().externalTables(List.of(salesTable, productsTable)).build();

In this example we will define a local cache on the products table. Cache definition is done per table.

final SingleTableCacheDescription productsCache =
SingleTableCacheDescription.builder()
.cacheName("MyCache")
.tableName("PRODUCTS")
.cachePartitioningFields(List.of("CATEGORY"))
.build();

Note that to define a cache on one table we must define a "partitioning". This defines which partition of the external can be downloaded at a time. See the doc about cache partition for more details. Remember that the fields used to partition the table must be part of the key fields of the table.

Also note that we define a name for the cache. This is not required (it defaults to the name of the table) but this is how we will interact with the cache later.

Once we have defined the cache for all the tables we want, we can define the global cache:

final CacheDescription cacheDescription =
CacheDescription.builder().singleTableCacheDescriptions(List.of(productsCache)).build();

and then use this cache in the application:

final Application app =
Application.builder(connector)
.schema(schema)
.managerDescription(managerDescription)
.cacheDescription(cacheDescription)
.build();

That's it, now get-by-key queries, list queries and distinct queries will use the local cache to speed up the queries when possible.

How to customize a local cache

Eviction policy

Each individual cache on a table can have its own eviction policy. This is defined when creating the cache by setting its capacity. Currently there are two eviction policies available:

  • One based on the maximum number of partitions in the cache;

final SingleTableCacheDescription productsCache =
SingleTableCacheDescription.builder()
.tableName("PRODUCTS")
.cachePartitioningFields(List.of("DATE"))
.capacity(MaxPartitionsCount.of(20))
.build();

  • One based on the maximum number of rows.

final SingleTableCacheDescription productsCache =
SingleTableCacheDescription.builder()
.tableName("PRODUCTS")
.cachePartitioningFields(List.of("DATE"))
.capacity(MaxLinesCount.of(100))
.build();

When the cache reaches this limit, the oldest information is evicted to make room for new data.

By default, the cache policy is set on the number of partitions, with a capacity of 10.

Select only some fields

By default, all the fields of the table are added to the cache. It is possible to select only some fields to be cached for a smaller cache.

final SingleTableCacheDescription productsCache =
SingleTableCacheDescription.builder()
.tableName("PRODUCTS")
.cachePartitioningFields(List.of("DATE"))
.selectedFields(List.of("PRODUCT_ID", "DATE", "UNIT_PRICE"))
.build();

Add secondary indexes to improve performance

Like for an in-memory table, it is possible to add secondary indexes on the cache to improve the performance of queries. Indexes are added on list of fields and several indexes can be added.

Note that the fields included in the indexes must be dictionarized.

Here is an example of how to add 2 secondary indexes:

final SingleTableCacheDescription productsCache =
SingleTableCacheDescription.builder()
.tableName("PRODUCTS")
.cachePartitioningFields(List.of("DATE"))
.secondaryIndexes(List.of(List.of("DATE", "PRODUCT_ID"), List.of("DATE", "CATEGORY")))
.build();

How to use the cache manager

The cache will be used automatically by Atoti when possible. The cache manager lets you manipulate the cache itself. It can be retrieved on the database:

final IDatabaseCacheManager cacheManager = app.getDatabase().getCacheManager();

When using the cache manager, you will need to use the name of the cache you defined earlier.

The cache manager can be used to prefetch a cache partition. This will download the partition in the cache without waiting for a query to trigger it. This is useful when we know that a partition will be needed in the near future and we want to prefetch it to speed up the next queries.

For instance, here we have defined our cache partitioned per category and we want to prefetch the partition corresponding to the category "Furniture":

final CompletionStage<Void> prefetchOperation =
cacheManager.prefetchCacheAsync(
app.getDatabase().getMasterHead(), "MyCache", CachePartition.of("Furniture"));

Note that the prefetch is asynchronous, so if you need the prefetch to complete before continuing, you must wait for the completion stage to complete:

prefetchOperation.toCompletableFuture().get();

The manager can also be used to flush the cache and remove all the partitions from the cache.

cacheManager.flush("MyCache");

Prefetch a cache partition in a Post-Processor (advanced)

warning

This is an advanced feature and should be used with care!

Get-by-key queries and list queries on side tables generally happen in a post-processor to retrieve additional data for the calculation. As the computation phase of the post-processor is done after the main retrieval on the external database, it can be slow to wait until this moment to trigger the query for the cache feeding. To avoid this, it is possible to prefetch the cache partition earlier so the prefetching happens concurrently with the main retrieval and the overall query is faster.

To do that, it is possible to use the custom prefetcher DatabaseCachePrefetcher. This prefetcher can be used in the post-processor to prefetch the cache partition corresponding to the coordinates of the query. To convert the coordinates of the query to the cache partition, you can use an ILocationToCachePartitionConverter. A core example called BestEffortLocationToCachePartitionConverter is available to do it.

Here is an example of how to use these in a post-processor:

@Override
protected List<IPrefetcher<?>> initializePrefetchers(final Properties properties) {
final BestEffortLocationToCachePartitionConverter converter =
new BestEffortLocationToCachePartitionConverter(
List.of(LevelIdentifier.simple(CACHE_PARTITIONING_COLUMN)));
final DatabaseCachePrefetcher prefetcher =
new DatabaseCachePrefetcher(this.cacheManager, getActivePivot(), CACHE_NAME, converter);
final List<IPrefetcher<?>> allPrefetchers =
new ArrayList<>(super.initializePrefetchers(properties));
allPrefetchers.add(prefetcher);
return allPrefetchers;
}