Continuous GAQ REST API

What is the Continuous GAQ REST API?

The Continuous GAQ (GetAggregatesQuery) REST API is an experimental API that allows applications to run queries against the cube and receive continuous streaming updates as data changes. Results are delivered in Apache Arrow format.

Experimental API

This API is experimental and may change in future versions without notice.

Why use continuous GAQ API?

The Continuous GAQ API provides:

  • Efficient data format: Results are delivered in Apache Arrow format for optimal performance
  • Direct query execution: Execute GetAggregatesQuery operations programmatically
  • Real-time updates: Receive data changes as they occur without polling
  • Incremental updates (planned for a future version): Only affected cells will be transmitted, not the entire result set
  • Persistent connections: Maintain a single connection for multiple queries

Prerequisites

Before using this API:

  • Authentication: All endpoints require authenticated users
  • GAQ familiarity: Understanding of GetAggregatesQuery concepts (measures, levels, coordinates)
  • Apache Arrow: Client-side Arrow deserialization capability

Key features

  • Apache Arrow streaming: Results are streamed in application/vnd.apache.arrow.stream format
  • Publisher-subscriber pattern: Publishers group queries with shared lifecycle management
  • User-based authentication: All operations require authenticated users
  • Query lifecycle management: Create publisher, subscribe, unsubscribe, and stop publishers

Usage workflow

A typical workflow for using the Continuous GAQ REST API:

  1. Create a publisher using /publisher/create to obtain a publisher ID
  2. Subscribe to queries using /subscribe with the publisher ID and query request
  3. Receive streaming updates in Apache Arrow format as data changes
  4. Unsubscribe from specific queries when no longer needed using /unsubscribe
  5. Stop the publisher when all queries are complete using /publisher/stop
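Under assumed host, credentials, and JSON field names (none of which are specified by this page), the workflow above can be sketched with a streaming HTTP client:

```python
# Sketch of the continuous GAQ workflow. The host, credentials, cube name,
# and JSON field names below are illustrative assumptions, not part of the
# documented API surface.

BASE_PATH = "/activeviam/pivot/rest/v10/cube/{cube}/queries/continuous-gaq"

def endpoint(host: str, cube: str, path: str) -> str:
    """Build the full URL of a continuous GAQ endpoint."""
    return host + BASE_PATH.format(cube=cube) + path

def run_workflow(host: str, cube: str, auth, query: dict):
    import requests  # any HTTP client with response streaming works

    # 1. Create a publisher; the response JSON carries the publisher ID
    #    (field name assumed here).
    created = requests.post(endpoint(host, cube, "/publisher/create"), auth=auth)
    publisher_id = created.json()["publisherId"]

    # 2-3. Subscribe; the response body is a continuous Arrow stream.
    response = requests.post(
        endpoint(host, cube, "/subscribe"),
        json={"publisherId": publisher_id, "query": query},
        auth=auth,
        stream=True,
    )
    # ... read Arrow IPC updates from response.raw with an Arrow library ...

    # 4. Unsubscribe a single query when it is no longer needed.
    requests.delete(endpoint(host, cube, "/unsubscribe"),
                    params={"publisherId": publisher_id}, auth=auth)

    # 5. Stop the publisher, which unsubscribes all remaining queries.
    requests.delete(endpoint(host, cube, "/publisher/stop"),
                    params={"publisherId": publisher_id}, auth=auth)
```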

API endpoints

All endpoints are available under the base path:

/activeviam/pivot/rest/v10/cube/{cubeName}/queries/continuous-gaq

For detailed endpoint specifications, request/response schemas, and examples, see the Continuous GAQ Query REST API in the OpenAPI documentation.

Create publisher

Creates a new publisher for continuous GAQ streams. Clients must use the returned publisher ID to subscribe to GAQ streams.

Endpoint: POST /publisher/create

Response: JSON containing the publisher ID

Subscribe to GAQ

Subscribes to a GAQ query and returns streaming results in Apache Arrow format.

The publisher ID identifies the publisher object where the streamed query results are written. Multiple queries can share one publisher. This groups related queries under the same lifecycle.

Endpoint: POST /subscribe

Response: Streaming HTTP response body in Apache Arrow format (application/vnd.apache.arrow.stream)

Batch Size Configuration: The number of rows per batch when streaming GAQ results can be configured using the JVM property activeviam.gaq.arrow.batchsize.

Unsubscribe from GAQ

Unsubscribes from a specific query on a publisher.

Endpoint: DELETE /unsubscribe

Stop publisher

Stops a publisher and unsubscribes all queries associated with it.

Endpoint: DELETE /publisher/stop

Authentication

All endpoints require authentication. The API uses the current authenticated user's credentials to execute queries and manage subscriptions. If the current thread is not authenticated, a ForbiddenAccessException is thrown.

Streaming result format

Results are streamed using the Apache Arrow IPC (Inter-Process Communication) format. The stream consists of multiple Arrow record batches, each representing either query results or failure notifications.

Apache Arrow streaming format

The API uses Apache Arrow's streaming format (application/vnd.apache.arrow.stream).

For more information about Apache Arrow, see the official documentation.

Arrow record batch structure

The continuous GAQ stream delivers one Arrow IPC message per query update. A message may be split into multiple standard Arrow RecordBatches; this chunking is transparent when reading with an Arrow-compatible library.

Schema metadata (custom metadata)

Metadata embedded in the Arrow schema provides context about the batch:

  • publisherId: Publisher identifier
  • queryId: Unique query identifier
  • branchId: Cube branch name
  • epochId: Query execution epoch
  • version: The endpoint version
  • fullRefresh: Always true in the current version of this API
  • error: (Optional) Present when the update triggered an error, in which case the content of the message should be discarded

Data columns

The record batch contains columns for:

  • Level columns: Level members
  • Measure columns: Aggregated values

Example Arrow Record Batch:

Currency (UTF-8) | Value (Float64)
"USD"            | 44.0
"EUR"            | 52.0
"GBP"            | 18.0

Reading results

To read the continuous Arrow stream:

  1. Connect to the streaming endpoint - Establish HTTP connection to /subscribe
  2. Read Arrow IPC messages sequentially - Each message contains one data update
  3. Extract schema metadata - Read custom metadata from the schema to get:
    • publisherId: Publisher identifier
    • queryId: Query identifier for this specific query
    • epochId: The epoch of the cube associated with the update
    • version: The endpoint version
    • fullRefresh: Always true in the current version of this API
    • error: Failure event information (if present)
  4. Check for failure events - If error key exists in metadata, handle the failure
  5. Extract data columns - Read level and measure vectors from the record batch. In the current version of the API, fullRefresh will be true, and the entire view can be replaced with the one received.
  6. Continue reading - Process subsequent updates as they arrive
  7. Handle completion - Connection closes when publisher is stopped


Chunking for large results

When a single update contains many cells, it is split into multiple chunks based on the configured batch size (activeviam.gaq.arrow.batchsize JVM property). This uses Arrow's Record Batch and should be seamless to the end user.

Failure event format

When a query execution fails, a failure event is sent instead of query results. The associated record batch will be empty.

The failure event metadata contains:

  • errorClass: The type of error that occurred
  • message: Error message
  • stackTrace: Full stack trace of the error
  • queryId: The query that failed
  • streamId: The stream identifier

Example failure detection:

Schema Custom Metadata:
error: "FailureEvent{streamId=pub-123, queryId=query-456,
type=QueryExecutionException,
message=Query timeout exceeded}"
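A client can detect failures by checking the schema metadata for the error key. The parser below assumes the textual FailureEvent format shown above (and, like any naive key=value split, would mis-handle values containing commas):

```python
import re

def parse_failure(error_value: str) -> dict:
    """Split a 'FailureEvent{key=value, ...}' string into a dict.
    Naive: assumes values contain no commas or equals signs."""
    match = re.search(r"\{(.*)\}", error_value, re.S)
    fields = {}
    if match:
        for part in match.group(1).split(","):
            key, _, value = part.strip().partition("=")
            fields[key] = value
    return fields

event = parse_failure(
    "FailureEvent{streamId=pub-123, queryId=query-456, "
    "type=QueryExecutionException, message=Query timeout exceeded}"
)
print(event["queryId"])   # query-456
```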

Connection limitations

When using HTTP/1.1, browsers and HTTP clients typically limit the number of simultaneous open connections to the same host. The standard limit is 6 concurrent connections per domain.

Since each active publisher maintains a persistent streaming connection, this means:

  • Maximum 6 publishers can run simultaneously in a browser environment
  • Once the limit is reached, additional subscription requests will be queued or blocked
  • Stopping a publisher releases its connection, allowing new publishers to be created

Recommendations:

  • Reuse publishers: Share publishers across related queries instead of creating multiple publishers
  • Stop unused publishers: Call /publisher/stop when queries are no longer needed
  • Use HTTP/2: HTTP/2 supports multiplexing, allowing many concurrent streams over a single connection