Analysis Hierarchies
This page covers an advanced concept of Atoti and should only be tackled once the reader is familiar with basic OLAP concepts and the associated terminology.
The reader is notably expected to be familiar with the notions of Hierarchies and Locations.
Analysis Hierarchies allow an Atoti cube to define analytic attributes and levels with a larger degree of freedom: attributes (later used in the hierarchical structure of the cube) can be defined in a functional way, without having to be based on one of the database fields.
Concept definition and specificities
We call an Analysis Hierarchy any hierarchy of an Atoti cube containing at least one level whose members are not created from the cube's schema fact selection fields.
This naming contrasts with "standard Hierarchies", which are used for binding the database's records to the cube's aggregates. The members of a level in a standard hierarchy are, by definition, the values of the field associated with it.
A hierarchy is generally assumed to be a standard Hierarchy unless stated otherwise. Structurally, however, both Analysis Hierarchies and standard Hierarchies are hierarchies of the cube.
The values of the members of an Analysis Hierarchy are programmatically defined, and can be either static or dynamic. The lack of constraints on the definition allows its levels to be defined functionally, for example based on business requirements rather than on the data existing in the database.
Analysis Hierarchies are therefore a powerful tool for integrating programmatically defined members into the hierarchical structure of the cube, reinforcing its analytic capabilities through the addition of functionally defined attributes as hierarchy levels.
Usage Examples
Here are two use cases for Analysis Hierarchies:
Introduction of a parameter for a subset of measures
If a measure definition is parameterized by a static parameter, and that measure is to be visualized for multiple parameter values at the same time, then the attribute corresponding to the parameter can be added as a level of an analysis hierarchy, with the parameter values as the level members, as sketched below.
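As an illustration, here is a minimal sketch of such a hierarchy, assuming a hypothetical "Confidence Level" parameter for a VaR measure. The class name, plugin key and parameter values are illustrative; the pattern mirrors the implementations shown later on this page.
@AtotiExtendedPluginValue(intf = IAnalysisHierarchy.class, key = ConfidenceLevelHierarchy.PLUGIN_KEY)
public static class ConfidenceLevelHierarchy extends AAnalysisHierarchy {
  /** Hypothetical plugin key. */
  public static final String PLUGIN_KEY = "confidenceLevelHierarchy";
  /** The parameter values exposed as level members (illustrative). */
  protected static final List<String> CONFIDENCE_LEVELS = List.of("95%", "97.5%", "99%");

  public ConfidenceLevelHierarchy(
      final IAnalysisHierarchyInfo info,
      final List<ILevelInfo> levelInfos,
      final IDictionary<Object>[] levelDictionaries) {
    super(info, levelInfos, levelDictionaries);
  }

  @Override
  public String getType() {
    return PLUGIN_KEY;
  }

  @Override
  protected Iterator<Object[]> buildDiscriminatorPathsIterator(final IDatabaseVersion database) {
    // One member per parameter value, below the ALL member (hence the leading null).
    return CONFIDENCE_LEVELS.stream().map(level -> new Object[] {null, level}).iterator();
  }
}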
Ex: PnL exploration for a VaR breach scenario
When estimating market risk through the computation of VaR metrics, one of the methods is to use scenario-backed Monte-Carlo simulations. In that case, risk simulations are stored in the database inside scenario-indexed vectors to improve the computation efficiency of the VaR metrics, so the scenario attribute is not explicitly available in the data and thus cannot be used as the level of a standard Hierarchy. However, there are various business cases where some metrics must be analyzed on a per-scenario basis, for example when analyzing why a given scenario leads to a VaR increase, or when backtesting the VaR model, since this allows inspecting which scenario was the closest to an actual VaR breach.
Adding an Analysis Hierarchy with the "Scenario" attribute to a VaR cube therefore retains the computational optimization of the VaR metrics through vectorization, keeps the database model intact, and still allows specific metrics to be analyzed per scenario.
Introduction of an additional analysis level
A recurrent need when analyzing complex data is to create, use, and manipulate attributes corresponding to groupings of the facts that follow a functional logic not available in the database records.
Introducing an Analysis Hierarchy along with a procedure defining the functional relationships between the facts and the groupings (called an Aggregation Procedure) allows for the definition of a fully customizable hierarchy whose members correspond to the desired functional logic, without affecting the database at all.
Ex: Sliding temporal buckets in Liquidity
In liquidity, trade consolidation dates are often grouped in buckets, on which some computations may depend.
Adding the matching bucket as a pre-computed field of the base through the ETL is not always a satisfying solution: the buckets are then statically stored, whereas the user may expect them to be computed relative to the time of the analysis rather than the time the data was inserted. In some cases, individual trades may move to other buckets between those two times, making the analysis incorrect.
Adding the "Time Bucket" attribute to the cube through an Analysis Hierarchy allows the bucket matching each trade to be computed at query time, aligning the query output with the user's expectations.
Note: defining a field through the ETL or the sources' calculated-column creation methods and using that field as part of a standard hierarchy is the preferred method for defining bucketing attributes.
Analysis Hierarchies should be considered for bucketing when the functional definition of the buckets requires information that cannot be inferred when the data is inserted into the database (here, we need the query time in order to compute the temporal bucket), as the sketch below illustrates.
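Here is a minimal plain-Java sketch of the underlying issue, outside of any Atoti API (the helper and dates are hypothetical): the bucket of a given maturity date changes with the day the query runs, so it cannot be precomputed at insertion time.
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

public final class SlidingBucketExample {

  // Hypothetical helper: the bucket depends on the analysis date, not only on the trade.
  static String bucketOf(final LocalDate maturity, final LocalDate queryDate) {
    if (!maturity.isAfter(queryDate.plus(1, ChronoUnit.WEEKS))) {
      return "1W";
    } else if (!maturity.isAfter(queryDate.plus(1, ChronoUnit.MONTHS))) {
      return "1M";
    } else if (!maturity.isAfter(queryDate.plus(1, ChronoUnit.YEARS))) {
      return "1Y";
    }
    return "1+Y";
  }

  public static void main(final String[] args) {
    final LocalDate maturity = LocalDate.of(2024, 6, 10);
    // The same trade falls into different buckets depending on the analysis date:
    System.out.println(bucketOf(maturity, LocalDate.of(2024, 6, 5))); // 1W
    System.out.println(bucketOf(maturity, LocalDate.of(2024, 1, 5))); // 1Y
  }
}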
Using Analysis Hierarchies
From the perspective of the cube's user, an Analysis Hierarchy behaves the same way as a standard Hierarchy, with the following notable exceptions when the cube is queried:
- Expressing the Location on the first level of an Analysis Hierarchy leads, for each non-filtered member of that first level, to the replication of the values obtained through the aggregation of the cube facts and the execution of the required post-processors.
- Expressing the Location on an Analysis Hierarchy beyond its first level always returns an empty result for aggregated measures.
These limitations are the direct consequence of the fact that it is no longer possible to infer the relationship between the records and the aggregates at the queried location if the expressed member on the analysis hierarchy is not a top-level member.
While this may seem to reduce the usability of Analysis Hierarchies, there are two ways to work around these limitations and leverage analysis hierarchies as a powerful analysis tool:
- Define post-processed measures explicitly handling the expressed Analysis Hierarchy member: by prefetching the location without any expressed analysis hierarchy and by defining the behavior regarding the expressed Analysis Hierarchy members, it is possible to specify the values returned by the post-processor.
- Define a global behavior regarding facts: if, for any record of the database, there is one and only one member of the Analysis Hierarchy to match the facts to, then it is possible to redefine the logical relationship between facts and aggregates at any aggregation level of the cube. This can be done by implementing an "Aggregation Procedure" and associating it with the analysis hierarchy when defining the cube structure.
Both approaches are demonstrated in the examples below.
Analysis Hierarchies and Copper API
The Copper API allows for the functional declaration of join-based hierarchies or bucketing hierarchies. Both are analysis hierarchies, and come out of the box with their matching aggregation procedures.
Introspecting analysis hierarchies
Analysis hierarchies can have their top levels corresponding to selection fields (like in a standard hierarchy) and their deepest levels defined programmatically (like in an analysis hierarchy).
Such a hierarchy is called an "introspecting" analysis hierarchy.
Although the members of such hierarchies partially match records, this type of hierarchy is still an Analysis Hierarchy and the above-mentioned specificities still apply.
Defining an analysis hierarchy in a cube
The addition of an analysis hierarchy to the structure of a cube is done during the cube's definition, just like standard hierarchies.
The Atoti Java API provides cube definition fluent builders; adding an Analysis Hierarchy through that API is done via the `withAnalysisHierarchy(String hierarchyName, String pluginKey)` method.
All implementations of analysis hierarchies must extend the `AAnalysisHierarchy` class. To register one in the registry, the interface to use is `IAnalysisHierarchy`.
Note: in order to properly add an analysis hierarchy, one must ensure that an `IAnalysisHierarchy` plugin value (and optionally an `IAnalysisHierarchyDescriptionProvider`) with a matching plugin key was added to the Atoti Server Registry.
Example:
final IActivePivotInstanceDescription cubeDescription =
StartBuilding.cube("Cube")
.withSingleLevelDimension("id")
.withDimension("dimension")
.withAnalysisHierarchy("analysisHierarchy_example", StringAnalysisHierarchy.PLUGIN_KEY)
.build();
This example defines a cube named "Cube" along with:
- The "id" dimension containing a single standard hierarchy named "id" whose only level, also named "id", contains members matching the values of the "id" field in the cube's selection of the database.
- The "dimension" dimension containing an analysis hierarchy named "analysisHierarchy_example" for which:
- The member data and logic is defined by
IAnalysisHierarchy
implementation with theStringAnalysisHierarchy.PLUGIN_KEY
plugin key. - The structure description is defined by
IAnalysisHierarchyDescriptionProvider
implementation with theStringAnalysisHierarchy.PLUGIN_KEY
plugin key.
- The member data and logic is defined by
Customizing an analysis hierarchy during its definition
By default, the properties and the structure of an Analysis Hierarchy are defined by the matching `IAnalysisHierarchyDescriptionProvider` implementation. It is however possible to greatly customize an analysis hierarchy description by using the `withCustomizations(Consumer<IHierarchyDescription>)` method when adding it to the cube description.
Reusing the previous example, we now have:
final IActivePivotInstanceDescription cubeTwoDescription =
StartBuilding.cube("Cube2")
.withSingleLevelDimension("id")
.withDimension("dimension")
.withAnalysisHierarchy("stringAH", StringAnalysisHierarchy.PLUGIN_KEY)
.withCustomizations(
desc -> {
desc.setName("Customized Hierarchy"); // Renames the AH
desc.setAllMembersEnabled(false); // Makes the AH slicing
desc.putProperty("PropertyName", "PropertyValue");
// Adds the PropertyName:PropertyValue pair to the AH's properties
})
.build();
This example defines a cube named "Cube2" along with:
- The "id" dimension containing a single standard hierarchy named "id" whose only level, also named "id", contains members matching the values of the "id" field in the cube's selection of the database.
- The "dimension" dimension containing an analysis hierarchy effectively named "Customized Hierarchy" for which:
- The member data and logic is defined by
IAnalysisHierarchy
implementation with theStringAnalysisHierarchy.PLUGIN_KEY
plugin key. - The structure description is defined by
IAnalysisHierarchyDescriptionProvider
implementation with theStringAnalysisHierarchy.PLUGIN_KEY
plugin key, then altered by the customization method call.
- The member data and logic is defined by
Note: when altering the structure of the hierarchy, make sure that the resulting description is supported by the `IAnalysisHierarchy` implementation.
Implementing Analysis Hierarchies
Before implementing and using a custom Analysis Hierarchy, make sure that your use case actually requires this solution:
- In some cases, modifying or replicating data in order to introduce an attribute in the database is perfectly acceptable.
- The Copper API already provides a functional API for defining Analysis Hierarchies for join and bucketing use cases. The generated hierarchies are backed by aggregation procedures, so there is no need to customize measures and post-processors in order to use these hierarchies for data exploration.
When defining an analysis hierarchy, one must implement the following `IPluginValue`s:
- `IAnalysisHierarchy`, responsible for the definition of the hierarchy's members and the related logic.
- `IAnalysisHierarchyDescriptionProvider`, responsible for providing a default description of the Analysis Hierarchy.
Example I - Scenario-based PnL extraction for parametric VaR
As presented before, in this use case the user wishes to view scenario-based metrics for one or several scenarios.
Here we consider the example of an extreme case where each scenario brings a constant bias to the output of the simulated data, with each scenario being "compensated" by an opposite scenario.
However, since the VaR is not a linear metric, the worst-case scenario may impact the value of the aggregated VaR as well as the metric value at a granular level.
This showcases the need to be able to drill down into the output of the different scenarios simulated by the risk engine.
We therefore wish to introduce a level corresponding to the "scenario" attribute, without modifying either the database or the vectorized data provided by the Risk Engine.
Analyzing scenario-based metrics is only a subset of the actions performed by the risk analysts, who will likely still want to analyze VaR metrics at any business-related aggregated level.
The "scenario" analysis hierarchy must therefore only affect the analysis when the user drills the location down to the scenario level. Thus, the hierarchy will not be slicing and will have an "ALL" top level, and its second level, "Scenario", will contain the names of the scenarios used in the VaR simulation.
The following listing is the implementation of the Analysis Hierarchy, with the appropriate members added when the hierarchy is instantiated. In this example the member list is statically defined, but this is not a requirement.
Note that the listing extends `AAnalysisHierarchy` rather than the base interface `IAnalysisHierarchy`, as this abstract class is the recommended entry point for writing custom analysis hierarchies.
@AtotiExtendedPluginValue(intf = IAnalysisHierarchy.class, key = AHForPnLExtract.PLUGIN_KEY)
public static class AHForPnLExtract extends AAnalysisHierarchy {
protected static final Collection<String> SCENARIOS =
List.of("SCENARIO 1", "SCENARIO 2", "SCENARIO 3", "SCENARIO 1 EDITED");
public static final String PLUGIN_KEY = "scenarioHierarchy";
/**
* Constructor.
*
* @param info the info about the hierarchy
* @param levelInfos the info about the levels
* @param levelDictionaries the levels' dictionaries
*/
public AHForPnLExtract(
final IAnalysisHierarchyInfo info,
final List<ILevelInfo> levelInfos,
final IDictionary<Object>[] levelDictionaries) {
super(info, levelInfos, levelDictionaries);
}
@Override
public String getType() {
return PLUGIN_KEY;
}
@Override
protected Iterator<Object[]> buildDiscriminatorPathsIterator(final IDatabaseVersion database) {
return SCENARIOS.stream().map(scenarioName -> new Object[] {null, scenarioName}).iterator();
}
}
Then, the default structure of the hierarchy is provided by the following `IAnalysisHierarchyDescriptionProvider` implementation:
@AtotiPluginValue(intf = IAnalysisHierarchyDescriptionProvider.class)
public static class AHDescProviderForPnlExtract extends AAnalysisHierarchyDescriptionProvider {
public AHDescProviderForPnlExtract() {
super();
}
@Override
public String key() {
return AHForPnLExtract.PLUGIN_KEY;
}
@Override
public IAnalysisHierarchyDescription getDescription() {
final IAnalysisHierarchyDescription description =
new AnalysisHierarchyDescription(AHForPnLExtract.PLUGIN_KEY, "Scenario", true);
description.addLevel(new AxisLevelDescription("Scenario", null));
return description;
}
}
Let's consider a simple computation of the averaged PnL, defined by the following snippet:
Copper.sum("SimulatedPnLs")
.map(
(v, cell) ->
cell.writeDouble(v.readVector(0).sumDouble() / v.readVector(0).size()),
ILiteralType.DOUBLE)
.as("averaged simulated pnL")
Here is the output when querying the `averaged simulated pnL` measure on the Grand Total of the cube as well as on the Total drilled down per scenario:
Trade | Desks | Trader | Scenario | Averaged simulated pnL |
---|---|---|---|---|
AllMember | AllMember/AllMember | AllMember | | 0.437 |
AllMember | AllMember/AllMember | AllMember | SCENARIO 1 | null |
AllMember | AllMember/AllMember | AllMember | SCENARIO 1 EDITED | null |
AllMember | AllMember/AllMember | AllMember | SCENARIO 2 | null |
AllMember | AllMember/AllMember | AllMember | SCENARIO 3 | null |
Although the correct scenario members are shown in the output cellSet, the measure is not computed, as the proper way to drill down on the scenario has not yet been defined.
Add post-processed measures handling the Analysis Hierarchy drill-down
In order to properly use the analysis hierarchy as a data exploration tool, scenario-based metrics must be defined and implemented.
The user simply wishes to obtain the aggregated PnL figures for a given expressed scenario, and the consolidated sum over all scenarios if none is specified.
Since the sum is a linear operator, the extraction can be done at any level of aggregation. We can therefore compute the extracted aggregates directly from the values of the complete PnL vector at the queried location, as the short check after this list illustrates:
- If a scenario is expressed in the query, extract the sub-vector of PnL simulations matching the scenario.
- Otherwise, keep the complete PnL simulation vector.
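The linearity argument can be verified on plain arrays, outside of any Atoti API (the 2-element slices stand in for the 500-simulation scenario blocks used below): extracting a scenario slice from the sum of per-trade vectors yields the same values as summing the per-trade slices.
import java.util.Arrays;

public final class LinearityCheck {
  public static void main(final String[] args) {
    final double[] trade1 = {1, 2, 3, 4}; // two scenarios, two simulations each
    final double[] trade2 = {10, 20, 30, 40};

    // Sum first, then extract scenario 2 (indices 2..3)
    final double[] sum = new double[4];
    for (int i = 0; i < 4; i++) {
      sum[i] = trade1[i] + trade2[i];
    }
    final double[] extractedFromSum = Arrays.copyOfRange(sum, 2, 4);

    // Extract first, then sum
    final double[] slice1 = Arrays.copyOfRange(trade1, 2, 4);
    final double[] slice2 = Arrays.copyOfRange(trade2, 2, 4);
    final double[] sumOfExtracted = {slice1[0] + slice2[0], slice1[1] + slice2[1]};

    System.out.println(Arrays.equals(extractedFromSum, sumOfExtracted)); // true
  }
}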
Here is the implementation of a post-processor behaving as presented:
@AtotiExtendedPluginValue(intf = IPostProcessor.class, key = pnlExtractorPP.PLUGIN_KEY)
public static class pnlExtractorPP extends ADynamicAggregationPostProcessor {
public static final String PLUGIN_KEY = "pnlExtractor";
/** Constructor. */
public pnlExtractorPP(final String name, final IPostProcessorCreationContext creationContext) {
super(name, creationContext);
}
@Override
protected void evaluateLeaf(
final ILocation leafLocation,
final IRecordReader underlyingValues,
final IWritableCell resultCell) {
// Coordinate of the "Scenario" level in the cube's hierarchical mapping (hierarchy 4, level 1)
final int scenarioLevelCoordinate = pivot.getHierarchicalMapping().getCoordinate(4, 1);
final ILevelInfo levelInfo =
pivot.getHierarchicalMapping().getLevelInfo(scenarioLevelCoordinate);
if (LocationUtil.isAtOrBelowLevel(leafLocation, levelInfo)) {
// The scenario level is expressed
final String scenarioValue = (String) LocationUtil.getCoordinate(leafLocation, levelInfo);
switch (scenarioValue) {
case "SCENARIO 1":
resultCell.write(((IVector) underlyingValues.read(0)).subVector(0, 500));
break;
case "SCENARIO 2":
resultCell.write(((IVector) underlyingValues.read(0)).subVector(500, 500));
break;
case "SCENARIO 3":
resultCell.write(((IVector) underlyingValues.read(0)).subVector(1_000, 500));
break;
case "SCENARIO 1 EDITED":
resultCell.write(((IVector) underlyingValues.read(0)).subVector(1_500, 500));
break;
default:
throw new IllegalStateException("Unexpected scenario.");
}
} else {
resultCell.write(underlyingValues.read(0));
}
}
@Override
public String getType() {
return PLUGIN_KEY;
}
@Override
protected boolean supportsAnalysisLevels() {
return true;
}
}
Note: once a post-processed measure handling the Analysis Hierarchy is created and defined in the cube, it can be reused via the Copper API when defining new measures. The hierarchy description must however contain the `IAnalysisHierarchy.LEVEL_TYPES_PROPERTY` property with the appropriate value following the `"levelName1:type1,levelName2:type2"` convention, as sketched below.
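For instance, the property could be set through the `withCustomizations` hook shown earlier. This is a sketch only; the `"Scenario:string"` value is an illustrative use of the convention above, not a verified type token.
// [...cube Builder started]
.withAnalysisHierarchy("analysisHierarchy_example", StringAnalysisHierarchy.PLUGIN_KEY)
.withCustomizations(
    desc ->
        // Illustrative level-type declaration following the convention above
        desc.putProperty(IAnalysisHierarchy.LEVEL_TYPES_PROPERTY, "Scenario:string"))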
We can now add measures to the cube with the following snippet:
Copper.sum("SimulatedPnLs").as("sum").publish(ctx);
Copper.newPostProcessor(pnlExtractorPP.PLUGIN_KEY)
.withProperty(AAdvancedPostProcessor.UNDERLYING_MEASURES, "sum")
.withProperty(AAdvancedPostProcessor.ANALYSIS_LEVELS_PROPERTY, "Scenario")
.withProperty(IPostProcessor.OUTPUT_TYPE_PROPERTY, ILiteralType.DOUBLE_ARRAY)
.withProperty(
ADynamicAggregationPostProcessor.LEAF_TYPE, ILiteralType.DOUBLE_ARRAY)
.as("PnLExtractedScenario")
.publish(ctx);
Copper.measure("PnLExtractedScenario")
.map(
(v, cell) ->
cell.writeDouble(v.readVector(0).sumDouble() / v.readVector(0).size()),
ILiteralType.DOUBLE)
.as("averaged simulated pnL with Scenario drillDown")
Similarly to the previous section, we can query the `averaged simulated pnL with Scenario drillDown` measure on the Grand Total and on the Total drilled down per scenario:
Trade | Desks | Trader | Scenario | Averaged simulated pnL with Scenario drillDown |
---|---|---|---|---|
AllMember | AllMember/AllMember | AllMember | | 0.437 |
AllMember | AllMember/AllMember | AllMember | SCENARIO 1 | -19999.262 |
AllMember | AllMember/AllMember | AllMember | SCENARIO 1 EDITED | 19998.173 |
AllMember | AllMember/AllMember | AllMember | SCENARIO 2 | -9997.965 |
AllMember | AllMember/AllMember | AllMember | SCENARIO 3 | 10000.803 |
In this example, the expansion on the Scenario level reveals a strong scenario-related bias that would be very hard to analyze otherwise.
Example II - Sliding temporal bucketing
This specific use case can be covered with the Copper API's bucketing capabilities; this example is included to showcase an implementation of an analysis hierarchy coupled with an aggregation procedure.
We strongly encourage developers to use the Copper API whenever possible.
We consider here the following use case:
When analyzing the ongoing cash flow (liquidity) of our assets, we wish to define metrics whose rule set depends on the time interval between the analysis time and the maturity term of our assets.
Here we need to be able to split trades between those whose maturity term is in less than a year and the others. This classification is called the "Maturity Term Category".
We also wish to further split our trades in the four following categories:
- Trades whose maturity term is in less than a week
- Trades whose maturity term is in more than a week and less than a month
- Trades whose maturity term is in more than a month and less than a year
- Trades whose maturity term is in more than a year
This attribute will be named after its functional definition: "Maturity Term Bucket".
Since the correct value of the bucket for a trade depends on the time of the analysis, it cannot be modeled by a static field of the database added on data insertion.
Using an Analysis Hierarchy, on the contrary, allows for a dynamic assignment of the records to a given member of the hierarchy, as this assignment can be handled either by post-processors or by a procedure, as we'll see in this use case.
With the functional requirements clarified, one can envision the following structure for our "Maturity" Analysis Hierarchy: an "ALL" top level, a "Maturity term category" level (with members "CURRENT_YEAR" and "OTHER"), and a "Maturity term bucket" level (with members "1W", "1M", "1Y" and "1+Y").
Once the members and the structure of the hierarchy are clearly defined, the implementation quickly follows:
@AtotiExtendedPluginValue(intf = IAnalysisHierarchy.class, key = TimeBucketHierarchy.PLUGIN_TYPE)
public static class TimeBucketHierarchy extends AAnalysisHierarchy {
private static final long serialVersionUID = 6_01_00L;
/** Plugin type. */
public static final String PLUGIN_TYPE = "TIME_BUCKET";
/** Time bucketer. */
protected final IBucketer<Long> bucketer;
public TimeBucketHierarchy(
final IAnalysisHierarchyInfo info,
final List<ILevelInfo> levelInfos,
final IDictionary<Object>[] levelDictionaries) {
super(info, levelInfos, levelDictionaries);
this.bucketer = new TimeBucketer();
}
@Override
protected Iterator<Object[]> buildDiscriminatorPathsIterator(final IDatabaseVersion database) {
// We don't generate the unused members as they don't correspond to any possible trade.
return bucketer.getAllBuckets().stream()
.map(
bucket ->
new Object[] {null, bucket.equals("1+Y") ? "OTHER" : "CURRENT_YEAR", bucket})
.iterator();
}
@Override
public String getType() {
return PLUGIN_TYPE;
}
}
@AtotiPluginValue(intf = IAnalysisHierarchyDescriptionProvider.class)
public static class TimeBucketHierarchyDescriptionProvider extends PluginValue
implements IAnalysisHierarchyDescriptionProvider {
/** Time bucketer. */
protected final IBucketer<?> bucketer;
public TimeBucketHierarchyDescriptionProvider() {
super();
this.bucketer = new TimeBucketer();
}
@Override
public String key() {
return TimeBucketHierarchy.PLUGIN_TYPE;
}
@Override
public IAnalysisHierarchyDescription getDescription() {
final var desc =
new AnalysisHierarchyDescription(TimeBucketHierarchy.PLUGIN_TYPE, "Maturity", true);
final var levelDesc = new AxisLevelDescription("Maturity term category", null);
levelDesc.putProperty(
IAxisLevelDescription.ANALYSIS_LEVEL_TYPE_PROPERTY, String.valueOf(Types.TYPE_STRING));
final var levelDescTwo = new AxisLevelDescription("Maturity term bucket", null);
levelDescTwo.setComparator(createComparatorDescription());
levelDescTwo.putProperty(
IAxisLevelDescription.ANALYSIS_LEVEL_TYPE_PROPERTY, String.valueOf(Types.TYPE_STRING));
desc.addLevel(levelDesc);
desc.addLevel(levelDescTwo);
return desc;
}
private IComparatorDescription createComparatorDescription() {
final var comparator = new ComparatorDescription(IComparator.CUSTOM_PLUGIN_KEY);
final var compOrder =
new ComparatorOrderDescription("firstObjects", bucketer.getAllBuckets());
comparator.setOrders(List.of(compOrder)); // Use the bucketer ordering rather than the
// default one (alphabetical).
return comparator;
}
}
Most of the time-handling logic is delegated to an `IBucketer` component whose clear functional purpose is to classify timestamps into a matching entry of a list of buckets.
/**
* Simple Time bucketer, that organizes time buckets separated by their time boundary. It comes
* with a predefined collection of buckets.
*
* @author ActiveViam
*/
public static class TimeBucketer extends ATimeBucketer {
/** Constructor. */
public TimeBucketer() {
createBuckets();
}
@Override
public NavigableMap<Long, Object> createBucketMap(final Long timeReference) {
return createBucketMap(timeReference, buckets);
}
/** Create default buckets. */
protected void createBuckets() {
createBucket("1W", ChronoUnit.WEEKS, 1);
createBucket("1M", ChronoUnit.MONTHS, 1);
createBucket("1Y", ChronoUnit.YEARS, 1);
createBucket("1+Y", ChronoUnit.FOREVER, 1);
}
@Override
public NavigableMap<Long, Object> createBucketMap(
final Long timeReference, final List<Object> bucketSelection) {
final NavigableMap<Long, Object> map = new TreeMap<>();
for (int m = bucketSelection.size() - 1; m >= 0; m--) {
final Object bucket = bucketSelection.get(m);
final long boundary = getInvertedBoundary(timeReference, bucket);
map.put(boundary, bucket);
}
return map;
}
}
Thanks to the data provided by the tradeDate and tradeMaturity fields, each record maps to exactly one of the current maturity buckets: for each fact, there is one and only one matching member of the Analysis Hierarchy.
We can therefore define a procedure that exposes the bucket levels as regular cube levels which the user can drill down on, without requiring any customization of the metrics.
This type of procedure is called an "Aggregation Procedure" and can be specified when defining the hierarchy in the Cube's description builders, as follows:
// [...cube Builder started]
.withDimension("Maturity")
.withAnalysisHierarchy("CurrentMaturityBucket", TimeBucketHierarchy.PLUGIN_TYPE)
.withAggregationProcedure(TimeBucketHierarchy.PLUGIN_TYPE)
.withUnderlyingLevels(LevelIdentifier.simple("TradeDate"))
.end()
Definition of Aggregation Procedures
Aggregation Procedures are a completely different concept from Aggregation Functions.
Aggregation Procedures are a complex topic and require prior knowledge of some already advanced components and concepts of the ActivePivot querying workflow, such as Prefetchers.
Aggregation procedures are classes meant to centralize the logic for aggregating measures on analysis hierarchies following a user-defined logic. When an Analysis Hierarchy is defined as using an aggregation procedure, there is no need to write one post-processor per measure to use with the hierarchy.
Aggregation procedures work by adding optional steps to an ActivePivot query:
1. Transform the underlying query locations if needed: in order to be able to compute the aggregates, the procedure can alter the queried location for measures, notably by forcing wildcard coordinates on the "underlying levels" defined in the builders by `withUnderlyingLevels(LevelIdentifier... levels)` whenever an analysis hierarchy handled by the procedure is expressed. This behavior is implemented by the `buildRequestScope([...])` method of the procedure. Here, in order to compute a bucket value, the `tradeDate` attribute must not be aggregated until we have associated a time bucket with the records. To do so, the scope of the prefetches on the aggregate provider is altered to include a range location on the `tradeDate` level. The aggregates will then be re-aggregated by the last step of the procedure.
2. Retrieve additional data from the database: aggregation procedures can asynchronously perform explicit database queries and use the output of said queries in the last step of computation. This is particularly useful when some member-related information is not available in the cube Selection. This behavior is implemented by the `getDatabasePrefetch([...])` method of the procedure. Here, there is no need for additional information, so no database query is required.
3. Map the aggregated values to Analysis Hierarchy members: this is the heart of the procedure, computing the matching member of the analysis hierarchies handled by the procedure for a given input location and a given context. This behavior is implemented by the `mapToFullLocation([...])` method of the procedure. Here, the output location is computed by comparing the `tradeDate` values in the locations to the state of the buckets in the procedure's context.
As aggregation procedures alter the query scope and steps, adding an aggregation procedure to an analysis hierarchy can impact query performance. As a rule of thumb, try to minimize the number of required levels.
For more information on the subject, please refer to the API reference of `com.quartetfs.biz.pivot.cube.hierarchy.axis.IAnalysisAggregationProcedure`.
Implementing an Aggregation Procedure
Here is a commented implementation of the aggregation procedure that generalizes the computation of the matching "Maturity Term Bucket" and "Maturity Term Category" for a given "TradeDate".
Aggregation Procedures must implement the `IAnalysisAggregationProcedure` interface. They rely on the ActiveViam Registry, but through a plugin value referring to the factory instantiating the Procedure. Factories implement the `IAnalysisAggregationProcedure.IFactory` interface, a specialization of the Plugin Value contract.
`IAnalysisAggregationProcedure` implementation:
/**
* Aggregation procedure dynamically contributing time members to the time buckets.
*
* <p>It will allow the time buckets to exist for any primitive measure, making their values
* available for post-processors based on it.
*/
public static class TimeBucketProcedure
extends ABasicAggregationProcedure<
IPair<NavigableMap<Long, Object>, Collection<ICondition>>> {
/** The ordinal of the bucket level. */
public static final int BUCKET_LEVEL_ORDINAL = 2;
/** Instance mapping a date to a bucket. */
protected final IBucketer<Long> bucketer;
/** Info about the level with values to bucket. */
protected final ILevelInfo bucketedLevel;
/**
* Function that takes a cube filter as argument and that creates a {@link ICondition} from it.
* It is used to find if a bucket member is granted or not.
*/
protected final Function<ICubeFilter, ICondition> filterCreator;
public TimeBucketProcedure(
final IAnalysisAggregationProcedureDescription description,
final IHierarchy hierarchy,
final IAggregationHierarchyCreationContext context) {
super(description, hierarchy, context);
this.bucketer = new TimeBucketer();
this.bucketedLevel = this.underlyingLevels.get(0);
this.filterCreator = filter -> TrueCondition.TRUE; // Not implemented for simplicity’s sake
}
/**
* Gets the level info of the underlying bucket level.
*
* @return level info
*/
protected ILevelInfo getBucketLevel() {
return this.getHierarchy().getLevels().get(BUCKET_LEVEL_ORDINAL);
}
/**
* Creates a request location by adding the required levels to map ActivePivot points to members
* of the handled analysis hierarchies. Example : If the location is queried on a specific
* bucket, we need to express the "TradeDates" level in the requested location in order to
* determine which trade goes in the expressed bucket.
*
* @return {@code true} if this procedure must be used to handle the location and measure.
*/
@Override
public boolean buildRequestScope(
final ILocationBuilder location,
final ICubeFilter securityAndFilters,
final IJoinMeasure joinMeasure,
final IActivePivotVersion pivot,
final Collection<IMeasureMember> underlyingMeasures) {
if (location.getLevelDepth(this.getHierarchyOrdinal()) > 1) {
expandOnUnderlyingLevels(location);
// The hierarchy is expressed in the query location, get the underlying elements for
// bucketing
return true;
} else if (expressHandledHierarchies(securityAndFilters)) {
// The hierarchy is expressed in the query filter, get the underlying elements for bucketing
expandOnUnderlyingLevels(location);
location.expandToLevel(getBucketLevel());
return true;
} else { // The hierarchy is absent, dynamic bucketing is not wanted
return false;
}
}
/**
* Creates the context that will be passed as an argument to the mapToFullLocation([...])
* method. The "context" is the information provided during the procedure execution in order to
* determine the appropriate member to choose for each point location. Here, the context
* contains the Map of buckets and the query conditions.
*/
@Override
public IPair<NavigableMap<Long, Object>, Collection<ICondition>> createContext(
final IActivePivotVersion pivot,
final IScopeLocation primitiveRetrievalScope,
final IScopeLocation targetScope,
final IScopeLocation prefetchLocation,
final IDatabaseRetrievalResult databaseResult) {
final ICondition filterCondition =
this.filterCreator.apply(CubeFilter.getFromContext(pivot.getContext()));
ICondition scopeCondition;
final Object scopeCoordinateFirstLevel =
targetScope.getCleansedLocation().getCoordinate(getHierarchyOrdinal(), 1);
if (scopeCoordinateFirstLevel == null) {
scopeCondition = TrueCondition.TRUE;
} else {
final Object scopeCoordinateSecondLevel =
targetScope.getCleansedLocation().getCoordinate(getHierarchyOrdinal(), 2);
scopeCondition = new InCondition(scopeCoordinateFirstLevel);
if (scopeCoordinateSecondLevel != null) {
scopeCondition =
new AndCondition(
List.of(scopeCondition, new InCondition((scopeCoordinateSecondLevel))));
}
}
return new Pair<>(createBucketMap(), Arrays.asList(filterCondition, scopeCondition));
}
/**
* The heart of the procedure. Builds a full Location from a provided readable Location without
* analysis levels and the provided context.
*/
@Override
public void mapToFullLocation(
final IPair<NavigableMap<Long, Object>, Collection<ICondition>> context,
final IPointLocationReader primitiveRetrievalLocation,
final IAggregatesLocationResult aggregates,
final IPointLocationBuilder targetLocationBuilder,
final ResultConsumer consumer) {
// Compute the target bucket based on the current bucketed level value
final Date bucketedMember =
((Date)
primitiveRetrievalLocation.getCoordinate(
this.bucketedLevel.getHierarchyInfo().getOrdinal() - 1,
this.bucketedLevel.getOrdinal()));
final NavigableMap<Long, Object> bucketMap = context.getLeft();
final Object[] memberArray;
// Find the correct bucket entry that matches the time
final Map.Entry<Long, Object> bucketEntry = bucketMap.floorEntry(bucketedMember.getTime());
if (bucketEntry == null) {
memberArray = null;
} else {
memberArray =
bucketEntry.getValue().equals("1+Y")
? new Object[] {"OTHER", bucketEntry.getValue()}
: new Object[] {"CURRENT_YEAR", bucketEntry.getValue()};
}
if (memberArray == null) {
return;
}
final Object category = memberArray[0];
final Object bucket = memberArray[1];
// Make sure the computed bucket member matches the query conditions
if (bucket == null || !context.getRight().stream().allMatch(c -> c.evaluate(bucket))) {
return;
}
// Keep the bucket if it is compatible with the current bucket coordinate
final ILevelInfo bucketLevel = getBucketLevel();
final Object bucketMember =
targetLocationBuilder.readCoordinate(
bucketLevel.getHierarchyInfo().getOrdinal() - 1, bucketLevel.getOrdinal());
if (bucketMember != null) {
// A bucket coordinate already exists. Only keep our bucket if it matches the current one
if (bucketMember.equals(bucket)) {
consumer.accept(targetLocationBuilder, -1);
}
} else {
// The bucket coordinate is not set. We need to set it when asked.
targetLocationBuilder.setCoordinate(getHierarchyOrdinal(), 1, category);
// The maturity category may be set but not the bucket, we check if the location can be set
if (targetLocationBuilder.getRangeCoordIndex(getHierarchyOrdinal(), 2) >= 0) {
targetLocationBuilder.setCoordinate(getHierarchyOrdinal(), 2, bucket);
}
consumer.accept(targetLocationBuilder, -1);
}
}
/**
* Dynamically organize the buckets that are visible in the bucket hierarchy, and map those into
* a navigable map so that finding the bucket associated with a date is very fast.
*
* <p>Creating the map of buckets is expensive and should be done only once per query, consider
* using the QueryCache to safely do so.
*
* @return map of buckets
*/
protected NavigableMap<Long, Object> createBucketMap() {
final IAxisHierarchy bucketHierarchy = (IAxisHierarchy) this.getHierarchy();
final List<? extends IAxisMember> visibleMembers =
bucketHierarchy.retrieveMembers(BUCKET_LEVEL_ORDINAL);
final long timeReference = System.currentTimeMillis();
final List<Object> visibleBuckets = new ArrayList<>();
for (int m = visibleMembers.size() - 1; m >= 0; m--) {
final IAxisMember member = visibleMembers.get(m);
final Object discriminator = member.getDiscriminator();
visibleBuckets.add(discriminator);
}
return this.bucketer.createBucketMap(timeReference, visibleBuckets);
}
/** Returns an optional request for rows in a store. */
@Override
public DatabasePrefetchRequest getDatabasePrefetch(
final ILocation prefetchLocation, final ICubeFilter filter) {
return null; // No additional data is required from the database to determine the correct bucket
}
}
`IAnalysisAggregationProcedure.IFactory` implementation:
@AtotiPluginValue(intf = IAnalysisAggregationProcedure.IFactory.class)
public static class TimeBucketFactory extends PluginValue
implements IAnalysisAggregationProcedure.IFactory<
IPair<NavigableMap<Long, Object>, Collection<ICondition>>> {
@Override
public String description() {
return "Factory for Time Bucketing procedures";
}
@Override
public String key() {
return TimeBucketHierarchy.PLUGIN_TYPE;
}
@Override
public IAnalysisAggregationProcedure<IPair<NavigableMap<Long, Object>, Collection<ICondition>>>
create(
final IAnalysisAggregationProcedureDescription description,
final Collection<? extends IHierarchy> hierarchies,
final IAggregationHierarchyCreationContext creationContext) {
assert hierarchies.size() == 1;
return new TimeBucketProcedure(description, hierarchies.iterator().next(), creationContext);
}
}
Misc implementation topics
Migrating to 6.0.X
Version 6.0.0 of Atoti Server introduced a new interface, `IAnalysisHierarchyDescriptionProvider`, whose purpose is explicit: providing a description of the analysis hierarchy.
That responsibility is removed from the `AAnalysisHierarchy` class contract, which used to be the default provided base for `IAnalysisHierarchy` implementations.
The following methods describing the hierarchy and its structure are now deprecated:
- `AAnalysisHierarchy.getLevelsCount()`
- `AAnalysisHierarchy.getLevelComparator(int ordinal)`
- `AAnalysisHierarchy.getLevelField(int ordinal)`
- `AAnalysisHierarchy.getLevelType(int ordinal)`
- `AAnalysisHierarchy.getLevelName(int ordinal)`
- `AAnalysisHierarchy.getLevelFormatter(int ordinal)`
The definition of these attributes of the analysis hierarchy is now expected to be provided in the `IAnalysisHierarchyDescription` returned by the `IAnalysisHierarchyDescriptionProvider#getDescription()` implementation registered with the same plugin key as the Analysis Hierarchy.
Migration Example
Let's consider a multi-level analysis hierarchy with a static set of members and a custom comparator on the second level:
@QuartetExtendedPluginValue(
intf = IMultiVersionHierarchy.class,
key = StringAnalysisHierarchyOld.PLUGIN_KEY)
public static class StringAnalysisHierarchyOld extends AAnalysisHierarchy {
private static final long serialVersionUID = 20191122_02L;
public static final String PLUGIN_KEY = "STATIC_STRING";
protected final List<Object> levelOneDiscriminators;
protected final List<Object> levelTwoDiscriminators;
/**
* Default constructor.
*
* @param hierarchyInfo information about this hierarchy
*/
public StringAnalysisHierarchyOld(
IAnalysisHierarchyInfo hierarchyInfo,
List<ILevelInfo> levelInfos,
IWritableDictionary<Object>[] dictionaries) {
super(hierarchyInfo, levelInfos, dictionaries);
this.levelOneDiscriminators = Arrays.asList("A", "B");
this.levelTwoDiscriminators = Arrays.asList("X", "Y", "Z");
}
@Override
public String getType() {
return PLUGIN_KEY;
}
@Override
public int getLevelsCount() {
return this.isAllMembersEnabled ? 3 : 2;
}
@Override
public IComparator<Object> getLevelComparator(int levelOrdinal) {
// The second level is expected to use the reverse natural order
if (levelOrdinal == 2) {
return new ReverseOrderComparator();
} else {
return new NaturalOrderComparator();
}
}
@Override
protected Iterator<Object[]> buildDiscriminatorPathsIterator(IDatabaseVersion database) {
final List<Object[]> resultList = new ArrayList<>();
for (Object lvlOneDiscriminator : levelOneDiscriminators) {
for (Object lvlTwoDiscriminator : levelTwoDiscriminators) {
final Object[] path = new Object[getLevelsCount()];
path[path.length - 2] = lvlOneDiscriminator;
path[path.length - 1] = lvlTwoDiscriminator;
resultList.add(path);
}
}
return resultList.iterator();
}
}
The implementation for this hierarchy post 6.1 will be:
For the hierarchy implementation:
@AtotiExtendedPluginValue(
intf = IAnalysisHierarchy.class,
key = StringAnalysisHierarchy.PLUGIN_KEY)
public static class StringAnalysisHierarchy extends AAnalysisHierarchy {
public static final String PLUGIN_KEY = "STATIC_STRING";
protected final List<Object> levelOneDiscriminators;
protected final List<Object> levelTwoDiscriminators;
/**
* Default constructor.
*
* @param hierarchyInfo information about this hierarchy
*/
public StringAnalysisHierarchy(
final IAnalysisHierarchyInfo hierarchyInfo,
final List<ILevelInfo> levelInfos,
final IDictionary<Object>[] levelDictionaries) {
super(hierarchyInfo, levelInfos, levelDictionaries);
this.levelOneDiscriminators = Arrays.asList("A", "B");
this.levelTwoDiscriminators = Arrays.asList("X", "Y", "Z");
}
@Override
protected Iterator<Object[]> buildDiscriminatorPathsIterator(
final IDatabaseVersion database) {
final List<Object[]> resultList = new ArrayList<>();
for (final Object lvlOneDiscriminator : levelOneDiscriminators) {
for (final Object lvlTwoDiscriminator : levelTwoDiscriminators) {
final Object[] path = new Object[base.getLevelInfos().size()];
path[path.length - 2] = lvlOneDiscriminator;
path[path.length - 1] = lvlTwoDiscriminator;
resultList.add(path);
}
}
return resultList.iterator();
}
@Override
public String getType() {
return PLUGIN_KEY;
}
}
For the description provider implementation:
@AtotiPluginValue(intf = IAnalysisHierarchyDescriptionProvider.class)
public static class StringAnalysisHierarchyDescriptionProvider
extends AAnalysisHierarchyDescriptionProvider {
@Override
public IAnalysisHierarchyDescription getDescription() {
final var desc = new AnalysisHierarchyDescription(key(), "stringHierName");
final var levelDesc = new AxisLevelDescription(STATIC_STRING, null);
final var levelDescTwo = new AxisLevelDescription(STATIC_STRING + "_1", null);
levelDescTwo.setComparator(
new ComparatorDescription(IComparator.DESCENDING_NATURAL_ORDER_PLUGIN_KEY));
desc.addLevel(levelDesc);
desc.addLevel(levelDescTwo);
return desc;
}
@Override
public String key() {
return TestAnalysisHierarchyCompatibility.StringAnalysisHierarchy.PLUGIN_KEY;
}
}
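Once both plugin values are registered, wiring the migrated hierarchy into a cube is unchanged. Here is a minimal sketch reusing the builder pattern shown earlier; the cube and dimension names are illustrative.
final IActivePivotInstanceDescription migratedCubeDescription =
    StartBuilding.cube("MigratedCube")
        .withSingleLevelDimension("id")
        .withDimension("dimension")
        .withAnalysisHierarchy("stringAH", StringAnalysisHierarchy.PLUGIN_KEY)
        .build();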