What's new

This section provides a brief overview of the new features and improvements in the latest versions of ActivePivot.

For a detailed list of all changes, see our Changelog.

5.11

New features

  • Data Export Service: The data export service allows you to download the results of MDX queries as a stream, or to export them directly to files on the server or in the cloud. Three formats are available: Arrow, csvTabular and csvPivotTable. Exports can be customized. A basic export order defines the MDX query to run with its context values, an output configuration that specifies the output format and, for a file export, the file configuration.

  • Application Performance Monitoring (APM): ActivePivot APM is a solution for monitoring the health and performance of ActivePivot instances. It provides several features that ease support work and reduce the burden of maintaining and troubleshooting ActivePivot. The APM library provides the following main features:

    • Distributed log tracing: Lets you easily identify all the logs related to an operation (e.g. query execution) across nodes.
    • Query monitoring: Monitor in-progress, successful or failed MDX queries, their associated execution logs, and their historical performance.
    • Cube size and usage monitoring: Display the trend of cube aggregates provider sizes over time.
    • Datastore monitoring: Display the trend of datastore sizes over time; Monitor the datastore transaction times; View successful and failed CSV loading, and their associated execution logs.
    • Netty monitoring: Monitor the size of data transferred between nodes, in terms of single query execution, all queries executed by a user, or all activities, etc.
    • Monitoring activities by user: Monitor activities (e.g. query execution) on a user level.
    • JVM monitoring: Overall JVM status, including CPU, heap and off-heap memory, threads, GC, etc.
  • MdxQueryBouncer: The MdxQueryBouncer lets you limit the number of concurrently running MDX queries. Query bouncing can be configured using the following properties:

    • ActiveViamProperty#MDX_CONCURRENT_QUERY_LIMIT_PROPERTY : Specifies the maximum number of running MDX queries. Note that this count includes MDX query executions and updates, as well as MDX stream registrations and updates.
    • ActiveViamProperty#MDX_CONCURRENT_QUERY_WAIT_LIMIT_PROPERTY : Specifies a timeout while waiting for a permit to execute an MDX query.

    The above limits are also configurable dynamically using the MBean MdxQueryBouncer.
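
    As an illustration of the bouncing idea described above (a cap on concurrent queries plus a bounded wait for a permit), here is a minimal sketch built on java.util.concurrent.Semaphore. The class QueryBouncerSketch and its constructor parameters are hypothetical names chosen for this example; this is not the actual MdxQueryBouncer implementation, and it does not read the ActiveViamProperty values.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration of query bouncing; NOT the ActivePivot MdxQueryBouncer. */
final class QueryBouncerSketch {

    private final Semaphore permits;
    private final long waitLimitMs;

    /**
     * @param concurrentQueryLimit assumed analogue of the concurrent query limit
     * @param waitLimitMs          assumed analogue of the wait limit, in milliseconds
     */
    QueryBouncerSketch(int concurrentQueryLimit, long waitLimitMs) {
        this.permits = new Semaphore(concurrentQueryLimit, true);
        this.waitLimitMs = waitLimitMs;
    }

    /** Runs the query if a permit is obtained within the wait limit, otherwise bounces it. */
    <T> T execute(Callable<T> query) throws Exception {
        if (!permits.tryAcquire(waitLimitMs, TimeUnit.MILLISECONDS)) {
            throw new IllegalStateException(
                    "Query bounced: no permit available within " + waitLimitMs + " ms");
        }
        try {
            return query.call();
        } finally {
            // Always give the permit back, even when the query fails.
            permits.release();
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }

    public static void main(String[] args) throws Exception {
        QueryBouncerSketch bouncer = new QueryBouncerSketch(2, 100);
        System.out.println(bouncer.<String>execute(() -> "query result"));
    }
}
```

    In this sketch a bounced query fails fast with an exception instead of queuing indefinitely, which keeps a burst of queries from piling up behind long-running ones.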

  • Content Service Migration Tool: The utility class ActivePivotContentServiceMigrationUtil provides a way to migrate the content service content from XML to JSON. Starting from 6.0, the content server stores its data in JSON. The class includes several methods called xmlToJson that transform data from XML to an equivalent JSON format. They can operate on Java strings and on instances of InputStream or Reader. The output can be returned as a Java string or written to an arbitrary OutputStream or Writer. Note that objects outside the scope of ActivePivot will cause the method to throw.

  • Source Parsing Report: The ActivePivot data source API can now generate a parsing report detailing any error encountered during the parsing process. The method ISource#fetch(...) now returns a parsing report, provided that the property PARSING_REPORT_ENABLED is set to true and the underlying source supports such reporting.

    Note that only the CSV Source and JDBC Source provide such reporting.

    For more details, see the documentation of the CSV source and of the JDBC source.

  • Excel Add-in: The Excel add-in is now part of the ActivePivot core. This Excel extension allows users to:

    • see the MDX query created by Excel
    • perform drillthroughs
    • move from slicers to slicers
    • modify context values.

    The installer of the Excel add-in can be found on Artifactory. The context service has been integrated into the ActivePivot code base as well.

  • Virtual Hierarchies: This is an experimental feature available in 5.11.0 but still in development. It is a new hierarchy implementation that reduces the memory footprint and speeds up the application. It can be used on high-cardinality hierarchies in exchange for a reduced set of features. These hierarchies are inherently unable to access their members, which restricts the operations that can be performed on them. The restrictions are:

    • The Lead, Lag, PrevMember and NextMember functions are not allowed on such hierarchies.
    • Axis expressions including a virtual hierarchy must include the NON EMPTY keyword. In addition, if the expression is a member expression, the full member path must be provided.
    • A virtual hierarchy must not be slicing.
    • Virtual hierarchy levels cannot be used as distributing fields.

    Note that although a query returns the same results whether a given hierarchy is virtual or not, the query execution may differ slightly internally.
  • Admin UI: This UI lets you browse the datastore.

Improvements

  • Performance improvement in Post Processors and Copper Measures: All abstract post processor classes have been rewritten. The new versions have the suffix V2. The goal of this rewrite is to prevent the boxing of primitive values into Objects and to reduce the memory footprint of queries. Both the initialization and the evaluation phases of the post-processors were redesigned; more details can be found in the migration notes. This also led to slight modifications of the Copper API.

  • Empty partitions automatically dropped: Empty partitions will be dropped automatically by the datastore at the end of transactions.

  • Aliases on Datastore Queries: Fields of a Datastore query can now be aliased. The String representing the expression of the field can be replaced with a SelectionField containing the alias.

5.10

New features

  • User Defined Aggregate Functions on Measures: It is now possible for user-defined aggregate functions to work on measures (instead of facts). This can be achieved using the Copper API. The documentation is available here.
  • Copper Join hierarchy with multiple levels: The Copper API now lets you define Join hierarchies with several levels.
  • Option to set the visibility of hierarchies and dimensions through context values: the MDX context now has a feature that can override the visibility of a hierarchy or a dimension. This feature can be used through MdxContext::setHierarchyVisibility() or through the MDX context builder via IMdxContextBuilder::overrideHierarchyVisibility() (analogous methods exist for dimensions).
  • Copper Store Lookups: The Copper API lets you define store lookup measures that run a get-by-key query on the selected store.

Improvements

  • Chunks gained a new compression method: Frequency compression. When the Datastore detects a value repeated many times in a chunk, the chunk is compressed to store only explicit values, while recording the implicit values repeated many times.
    The compression happens when a value occupies more than x % of the chunk. x can be specified as a ratio through the property ActiveViamProperty#CHUNK_FREQUENCY_COMPRESSION_RATIO_PROPERTY. The compression can be enabled or disabled for each of the following five data types: int, float, long, double and object. This is controlled through the property ActiveViamProperty#ENABLED_FREQUENCY_COMPRESSIONS_PROPERTY.
  • When a data node is being shut down, it sends a GoodbyeMessage to the query node, which immediately triggers an asynchronous task responsible for removing all subsequent contributions of that data node. On success, a GoodbyeMessageApplicationEvent is generated. If the removal task fails, a GoodbyeMessageApplicationFailureEvent is generated instead. This lets ActivePivot distinguish the clean removal of a data node from one that is momentarily unresponsive.
  • When executing a query, intermediary results that have been computed and that are not required by any remaining computations are now discarded and are available for garbage collection.
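
The frequency compression described in the first item above can be illustrated with a small sketch. It assumes an int chunk and a ratio threshold. The class FrequencyCompressedIntChunk and its methods are hypothetical names for this example only and do not reflect the Datastore's actual chunk implementation, which is not described here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of frequency compression on an int chunk; not the Datastore's code. */
final class FrequencyCompressedIntChunk {

    private final int size;
    private final int frequentValue;                    // the implicit, most repeated value
    private final Map<Integer, Integer> explicitValues; // row -> value, only for other values

    private FrequencyCompressedIntChunk(int size, int frequentValue, Map<Integer, Integer> explicit) {
        this.size = size;
        this.frequentValue = frequentValue;
        this.explicitValues = explicit;
    }

    /** Compresses the chunk if one value fills more than the given ratio of its rows. */
    static Optional<FrequencyCompressedIntChunk> tryCompress(int[] values, double ratio) {
        Map<Integer, Integer> counts = new HashMap<>();
        int best = 0;
        int bestCount = 0;
        for (int v : values) {
            int count = counts.merge(v, 1, Integer::sum);
            if (count > bestCount) {
                bestCount = count;
                best = v;
            }
        }
        if (values.length == 0 || bestCount <= ratio * values.length) {
            return Optional.empty(); // no value is frequent enough: keep the chunk as-is
        }
        Map<Integer, Integer> explicit = new HashMap<>();
        for (int row = 0; row < values.length; row++) {
            if (values[row] != best) {
                explicit.put(row, values[row]);
            }
        }
        return Optional.of(new FrequencyCompressedIntChunk(values.length, best, explicit));
    }

    /** Reads a row: explicit value if one was recorded, otherwise the frequent value. */
    int read(int row) {
        if (row < 0 || row >= size) {
            throw new IndexOutOfBoundsException("row " + row);
        }
        return explicitValues.getOrDefault(row, frequentValue);
    }

    int explicitCount() {
        return explicitValues.size();
    }

    public static void main(String[] args) {
        int[] values = {7, 7, 7, 3, 7, 7, 9, 7, 7, 7};
        tryCompress(values, 0.5).ifPresent(chunk ->
                System.out.println("explicit values stored: " + chunk.explicitCount()));
    }
}
```

The sketch uses a boxed map for brevity; a real implementation would more likely rely on primitive structures to avoid the boxing cost.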

Miscellaneous

  • The Azure Cloud Source has migrated its dependency on the Azure Blob Storage SDK from version 8 to version 12. This change came along with a number of changes in the module's interface. More about this can be read in the migration notes.

5.9

New features

  • New Copper API: The Copper API, allowing you to easily define measures and hierarchies, has been overhauled. Documentation for the API is available here.

  • User Defined Aggregate Functions: These new functions can access several fields and store several intermediate values. They can be defined using the Copper API or by extending AUserDefinedAggregateFunction. The documentation is available here.

  • Option to conceal hierarchies and measures from the distributed application: A concealed hierarchy is only known in the data cube where it's located. The query node has no knowledge of this hierarchy. Concealing a hierarchy improves the stability of the cluster and shortens the discovery phase, by significantly reducing its impact on the network's bandwidth. We advise concealing hierarchies that have a very high cardinality. For more information, see the distributed documentation.

    Measures can also be concealed. A concealed measure is only known in the data cube where it's located. We recommend concealing measures that are not relevant to the query node, such as intermediate measures for calculations that do not need to be visible in the User Interface.

  • Option to disable the epoch level: The epoch level of the epoch dimension can now be enabled or disabled. By default, the epoch level is now disabled; only the branch level is enabled. To enable the epoch level, use IEpochDimensionDescription.setEpochLevelEnabled(boolean).

  • Export query plans: A new REST service allows you to perform a query and export its execution plan. The plan now contains the result of each part of the query for each Data Cube (if any), and each pass of the MDX query.

  • Limit the result size of a single GetAggregatesQuery (since 5.9.3): The result size of each retrieval, as well as the transient result size across retrievals within a given GetAggregatesQuery, can be limited in terms of number of points, i.e. locations. These limits can be set in the project configuration as a shared context value in the cube description, or dynamically at query time using the QueriesResultLimit helper methods. If not specified, the default behavior is to apply no limit. Note that the transient result size also includes the result size of intermediate retrievals, which are often required to compute the top retrievals (the retrievals appearing in the query plan) in complex queries.

Improvements

  • The CSV Source is now capable of detecting line endings at the byte level. This improves the parallelism and speed of the CSV Source when reading one large file, at the cost of not supporting a few uncommon character sets.

  • The performance of the MDX Engine has been improved by using fine-grained locking instead of coarse-grained locking.

  • Indexes are now more intensively used for Datastore queries.
    In previous versions, only two cases were covered:

    • an Index was used when the conditions on the index fields were set on a single value.
      Eg: Index on fields A, B, C was used for a condition And(a = 1, b = 2, c = 3)
    • an Index was used for a single condition with multiple values.
      Eg: Index on field F can be used for the condition f IN [1, 2, 3]

    For other cases, Indexes were ignored. For example, despite an Index on fields A and B, the query with conditions And(a IN [1, 2], b = 3) could not be planned using an Index lookup. At best, it could rely on an Index lookup for A or B, provided such an Index existed, and field scans for the rest.

    The new logic adds the following support:

    • an Index on any number of fields can be used if only one field is subject to a condition with multiple values. Such an Index will be used by pre-compiled queries and ad-hoc queries.
      Eg: Index on fields A, B, C, D can be used for conditions And(a = 1, b = 2, c IN [3, 4], d = 5).
    • an Index on any number of fields can be used if no more than three of its fields are involved in conditions with multiple values. Such an Index can only be used by ad-hoc queries, because the query engine must be able to estimate the number of points generated by the conditions.
      Eg: Index on fields A, B, C, D can be used for conditions And(a IN [1, 2], b = 3, c = 4, d IN [5, 6, 7]).

    The maximum limits for cross joins are configured by the properties ActiveViamProperty#MAX_LOOKUPS_ON_PRIMARY_INDEX and ActiveViamProperty#MAX_LOOKUPS_ON_SECONDARY_INDEX.
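
    To make the idea of estimating the number of points and cross joining multi-value conditions more concrete, here is a minimal sketch. It assumes conditions are given as one list of accepted values per index field, in index order, and that planning falls back to another strategy when the estimate exceeds a limit. The class IndexLookupPlannerSketch and the parameter maxLookups are hypothetical; the actual planner and the exact semantics of the two properties above may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch of expanding multi-value conditions into index point lookups. */
final class IndexLookupPlannerSketch {

    /** Estimated number of point lookups: the product of the value counts per field. */
    static long estimateLookups(List<List<Integer>> valuesPerField) {
        long count = 1;
        for (List<Integer> values : valuesPerField) {
            count = Math.multiplyExact(count, (long) values.size());
        }
        return count;
    }

    /**
     * Cross joins the accepted values of each index field into full index keys.
     * Returns empty when the estimate exceeds maxLookups (an assumed limit), in which
     * case a caller would fall back to another plan such as field scans.
     */
    static Optional<List<int[]>> expandToLookups(List<List<Integer>> valuesPerField, long maxLookups) {
        if (estimateLookups(valuesPerField) > maxLookups) {
            return Optional.empty();
        }
        List<int[]> keys = new ArrayList<>();
        keys.add(new int[0]);
        for (List<Integer> values : valuesPerField) {
            List<int[]> next = new ArrayList<>();
            for (int[] prefix : keys) {
                for (int value : values) {
                    int[] key = Arrays.copyOf(prefix, prefix.length + 1);
                    key[prefix.length] = value;
                    next.add(key);
                }
            }
            keys = next;
        }
        return Optional.of(keys);
    }

    public static void main(String[] args) {
        // And(a IN [1, 2], b = 3, c = 4, d IN [5, 6, 7]) on an index over A, B, C, D
        List<List<Integer>> conditions =
                List.of(List.of(1, 2), List.of(3), List.of(4), List.of(5, 6, 7));
        expandToLookups(conditions, 100).ifPresent(keys ->
                keys.forEach(key -> System.out.println(Arrays.toString(key))));
    }
}
```

    With the conditions from the last example above, the estimate is 2 × 1 × 1 × 3 = 6 point lookups, which is why the engine needs the actual condition values (and therefore an ad-hoc query) to decide whether the index is worth using.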

Miscellaneous

  • Java 8 is no longer supported. ActivePivot 5.9 is compatible with Java 11.
  • The Sandbox has been converted to Spring Boot. Additionally, many features have been transformed into unit tests and are presented in the online documentation.

5.8

New features

  • JDK 11 support: we now provide two versions of ActivePivot 5.8: 5.8.x-jdk8 built with JDK8 and 5.8.x-jdk11 built with JDK11. If you plan to run your project with the latest JDK version, use 5.8.x-jdk11. For more information, see the dedicated page.

  • Cloud Csv Source: This source can fetch CSV files from major Cloud providers.

  • New Mode of our Continuous Query Engine: It is now capable of producing real-time notifications that a given query has changed, without computing the new view.

  • Centralization of ActivePivot Properties: The properties of ActivePivot are now centralized and documented in the ActiveViamProperty class. For more information, see the associated developer guide page.