Continuous GAQ REST API

What is the Continuous GAQ REST API?

The Continuous GAQ (GetAggregatesQuery) REST API is an experimental API that allows applications to run queries against a cube and receive continuous streaming updates as data changes. Results are delivered in Apache Arrow format.

Experimental API

This API is experimental and may change in future versions without notice.

Why use continuous queries?

Continuous queries provide:

  • Real-time updates: Receive data changes as they occur without polling
  • Incremental updates: Only affected cells are transmitted, not the entire result set
  • Persistent connections: Maintain a single connection for multiple queries

Prerequisites

Before using this API:

  • Authentication: All endpoints require authenticated users
  • GAQ familiarity: Understanding of GetAggregatesQuery concepts (measures, levels, coordinates)
  • Apache Arrow: Client-side Arrow deserialization capability

Overview

The Continuous GAQ REST API provides endpoints for creating publishers, subscribing to query streams, and managing continuous query subscriptions.

Key features

  • Apache Arrow streaming: Results are streamed in application/vnd.apache.arrow.stream format
  • Publisher-subscriber pattern: Publishers group queries with shared lifecycle management
  • User-based authentication: All operations require authenticated users
  • Query lifecycle management: Create publisher, subscribe, unsubscribe, and stop publishers

Usage workflow

A typical workflow for using the Continuous GAQ REST API:

  1. Create a publisher using /publisher/create to obtain a publisher ID
  2. Subscribe to queries using /subscribe with the publisher ID and query request
  3. Receive streaming updates in Apache Arrow format as data changes
  4. Unsubscribe from specific queries when no longer needed using /unsubscribe
  5. Stop the publisher when all queries are complete using /publisher/stop
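
As a rough illustration of these steps, here is a minimal Python sketch using the requests library. The host, cube name, and request bodies are placeholders; see the OpenAPI documentation for the actual request and response schemas.

import requests

# Base path from this page; host and cube name are placeholders.
BASE = "https://my-server/activeviam/pivot/rest/v10/cube/MyCube/queries/continuous-gaq"
AUTH = ("user", "password")  # every endpoint requires an authenticated user

# 1. Create a publisher and keep the returned ID.
publisher_id = requests.post(f"{BASE}/publisher/create", auth=AUTH).json()

# 2. Subscribe to a query; the response body is a continuous Arrow stream.
#    This request body is hypothetical -- the real schema is in the OpenAPI docs.
subscription = requests.post(
    f"{BASE}/subscribe",
    json={"publisherId": publisher_id, "query": {"measures": ["Value.SUM"]}},
    auth=AUTH,
    stream=True,  # keep the connection open to receive updates
)

# 3. Read Arrow IPC messages from subscription.raw (see "Reading results" below).

# 4-5. Clean up when done (request bodies again hypothetical).
requests.delete(f"{BASE}/unsubscribe", json={"publisherId": publisher_id}, auth=AUTH)
requests.delete(f"{BASE}/publisher/stop", json={"publisherId": publisher_id}, auth=AUTH)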

API endpoints

All endpoints are available under the base path:

/activeviam/pivot/rest/v10/cube/{cubeName}/queries/continuous-gaq

For detailed endpoint specifications, request/response schemas, and examples, see the Continuous GAQ Query REST API in the OpenAPI documentation.

Create publisher

Creates a new publisher for continuous GAQ streams. Clients must use the returned publisher ID to subscribe to GAQ streams.

Endpoint: POST /publisher/create

Response: JSON containing the publisher ID

Subscribe to GAQ

Subscribes to a GAQ query and returns streaming results in Apache Arrow format.

The publisher ID identifies the publisher object where the streamed query results are written. Multiple queries can share one publisher. This groups related queries under the same lifecycle.

Endpoint: POST /subscribe

Response: Streaming HTTP response body in Apache Arrow format (application/vnd.apache.arrow.stream)

Batch Size Configuration: The number of rows per batch when streaming GAQ results can be configured using the JVM property activeviam.gaq.arrow.batchsize.
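
For example, to set a batch size of 10,000 rows at JVM startup (the value is illustrative):

-Dactiveviam.gaq.arrow.batchsize=10000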

Unsubscribe from GAQ

Unsubscribes from a specific query on a publisher.

Endpoint: DELETE /unsubscribe

Stop publisher

Stops a publisher and unsubscribes all queries associated with it.

Endpoint: DELETE /publisher/stop

Authentication

All endpoints require authentication. The API uses the current authenticated user's credentials to execute queries and manage subscriptions. If the current thread is not authenticated, a ForbiddenAccessException is thrown.

Streaming result format

Results are streamed using the Apache Arrow IPC (Inter-Process Communication) format. The stream consists of multiple Arrow record batches, each representing either query results or failure notifications.

Apache Arrow streaming format

The API uses Apache Arrow's streaming format (application/vnd.apache.arrow.stream).

For more information about Apache Arrow, see the official documentation.

Stream structure

The continuous GAQ stream delivers results progressively as Arrow IPC messages.

Arrow record batch structure

Each Arrow IPC message contains a record batch made up of two parts: schema metadata and data columns.

Schema metadata (custom metadata)

Metadata embedded in the Arrow schema provides context about the batch:

Field          Description
publisherId    Publisher identifier
queryId        Unique query identifier
branchId       Cube branch name
epochId        Query execution epoch
updateId       Update sequence number (0, 1, 2...)
totalUpdates   Total number of updates in this batch
cellSetType    ADDED / REMOVED / FULL_REFRESH / EMPTY
chunkId        Chunk number within the update
totalChunks    Total chunks for this update

Cell set types:

  • ADDED: Cells added in this update (initial results or incremental additions)
  • REMOVED: Cells removed in this update
  • FULL_REFRESH: Complete result set (replaces all previous results)
  • EMPTY: Empty result set (no data)

Data columns

The record batch contains columns for:

  • Level columns: Level members
  • Measure columns: Aggregated values

Example Arrow Record Batch:

Currency (UTF-8)    Value (Float64)
"USD"               44.0
"EUR"               52.0
"GBP"               18.0

With Schema Metadata:

publisherId: "pub-1"
queryId: "query-123"
cellSetType: ADDED
updateId: 0
totalUpdates: 1

Reading results

To read the continuous Arrow stream:

  1. Connect to the streaming endpoint - Establish HTTP connection to /subscribe
  2. Read Arrow IPC messages sequentially - Each message contains one record batch
  3. Extract schema metadata - Read custom metadata from the schema to get:
    • publisherId: Publisher identifier
    • queryId: Query identifier for this specific query
    • epochId: Epoch number (increments with each update)
    • cellSetType: Type of update (ADDED, REMOVED, FULL_REFRESH, EMPTY)
    • updateId and totalUpdates: Position in multi-part updates
    • chunkId and totalChunks: Position in chunked results
    • error: Failure event information (if present)
  4. Check for failure events - If error key exists in metadata, handle the failure
  5. Extract data columns - Read level and measure vectors from the record batch
  6. Process based on cellSetType:
    • ADDED: Merge new cells into the result set or update existing cells
    • REMOVED: Remove cells from the result set
    • FULL_REFRESH: Replace entire result set with new data
    • EMPTY: No data in this batch
  7. Reassemble chunked results - Combine chunks with same updateId using chunkId
  8. Continue reading - Process subsequent messages as they arrive
  9. Handle completion - Connection closes when publisher is stopped
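
The Python sketch below, using pyarrow on the streaming response from the subscription example above, illustrates these steps. It reads the custom metadata once from the stream schema and uses the column names from the example batch ("Currency", "Value"); depending on how the server frames successive updates, a client may need to re-read metadata per message. Chunk reassembly (step 7) is sketched separately in the next section.

import pyarrow as pa

# Steps 1-2: open the Arrow IPC stream on the HTTP response body.
# `subscription` is the streaming response from the /subscribe call above.
reader = pa.ipc.open_stream(subscription.raw)

# Step 3: schema-level custom metadata (pyarrow exposes keys and values as bytes).
metadata = {k.decode(): v.decode() for k, v in (reader.schema.metadata or {}).items()}

# Step 4: failure events are signaled through an "error" metadata key.
if "error" in metadata:
    raise RuntimeError(f"Continuous GAQ query failed: {metadata['error']}")

result = {}  # current cell set: level member -> measure value

# Steps 5-8: process record batches as they arrive.
for batch in reader:
    columns = batch.to_pydict()  # column name -> list of values
    cells = dict(zip(columns["Currency"], columns["Value"]))

    cell_set_type = metadata.get("cellSetType")
    if cell_set_type == "FULL_REFRESH":
        result = cells            # replace the entire result set
    elif cell_set_type == "ADDED":
        result.update(cells)      # merge new or updated cells
    elif cell_set_type == "REMOVED":
        for key in cells:
            result.pop(key, None) # drop removed cells
    # EMPTY: nothing to apply

# Step 9: the loop ends when the publisher is stopped and the stream closes.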

For more information about Apache Arrow, see the official documentation.

Chunking for large results

When a single update contains many cells, it is split into multiple chunks based on the configured batch size (the activeviam.gaq.arrow.batchsize JVM property). Every chunk of an update carries the same updateId; clients combine chunks using chunkId and totalChunks.
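
A minimal reassembly sketch, assuming the decoded metadata dict from the reading example above:

# Buffer chunks per update and apply the update once all chunks have arrived.
pending = {}  # updateId -> {chunkId: cells}

def on_chunk(metadata, cells):
    update_id = int(metadata["updateId"])
    chunk_id = int(metadata["chunkId"])
    total_chunks = int(metadata["totalChunks"])
    pending.setdefault(update_id, {})[chunk_id] = cells
    if len(pending[update_id]) < total_chunks:
        return None  # still waiting for more chunks
    # All chunks received: merge them in chunkId order.
    merged = {}
    for _, part in sorted(pending.pop(update_id).items()):
        merged.update(part)
    return merged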

Multiple update types in one result

A single continuous GAQ result can contain multiple update types (for example, both ADDED and REMOVED cells). The parts of such an update share the same epochId and arrive in updateId order, with totalUpdates giving the number of parts.
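
A client might apply all parts of one such update together. A minimal sketch, assuming the parts have been collected in updateId order:

def apply_update(result, parts):
    # `parts` holds (cellSetType, cells) tuples in updateId order,
    # collected until totalUpdates parts have arrived.
    for cell_set_type, cells in parts:
        if cell_set_type == "REMOVED":
            for key in cells:
                result.pop(key, None)
        elif cell_set_type == "ADDED":
            result.update(cells)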

Failure event format

When a query execution fails, a failure event is sent instead of query results.

The failure event metadata contains:

  • errorClass: The type of error that occurred
  • message: Error message
  • stackTrace: Full stack trace of the error
  • queryId: The query that failed
  • streamId: The stream identifier

Example failure detection:

Schema Custom Metadata:
error: "FailureEvent{streamId=pub-123, queryId=query-456,
type=QueryExecutionException,
message=Query timeout exceeded}"

Complete example: data update flow

Step 1: Initial Query Result (subscription starts)

  • Metadata: queryId="q1", cellSetType=ADDED, epochId=1
  • Data: USD=44.0, EUR=52.0

Step 2: Data Change Event (new USD trade worth 100.0 added)

  • Metadata: queryId="q1", cellSetType=ADDED, epochId=2
  • Data: USD=144.0 (new aggregate value: 44+100)
  • Note: Only the affected cell (USD) is sent, not the entire result set

Step 3: Multiple Update (both additions and removals)

  • Batch 3a: updateId=0, totalUpdates=2, cellSetType=REMOVED, epochId=3 — EUR removed
  • Batch 3b: updateId=1, totalUpdates=2, cellSetType=ADDED, epochId=3 — GBP=30.0 added

Step 4: Full Refresh (complete result set replacement)

  • Metadata: queryId="q1", cellSetType=FULL_REFRESH, epochId=4
  • Data: USD=200.0, GBP=45.0, JPY=88.0 (complete current state)
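
Applying the per-type handling sketched under "Reading results", the client-side result set evolves as follows:

Step 1: ADDED            -> {"USD": 44.0, "EUR": 52.0}
Step 2: ADDED (USD)      -> {"USD": 144.0, "EUR": 52.0}
Step 3a: REMOVED (EUR)   -> {"USD": 144.0}
Step 3b: ADDED (GBP)     -> {"USD": 144.0, "GBP": 30.0}
Step 4: FULL_REFRESH     -> {"USD": 200.0, "GBP": 45.0, "JPY": 88.0}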

Connection limitations

When using HTTP/1.1, browsers and HTTP clients typically limit the number of simultaneous open connections to the same host; most browsers default to six concurrent connections per domain.

Since each active publisher maintains a persistent streaming connection, this means:

  • At most six publishers can run simultaneously in a browser environment
  • Once the limit is reached, additional subscription requests will be queued or blocked
  • Stopping a publisher releases its connection, allowing new publishers to be created

Recommendations:

  • Reuse publishers: Share publishers across related queries instead of creating multiple publishers
  • Stop unused publishers: Call /publisher/stop when queries are no longer needed
  • Use HTTP/2: HTTP/2 supports multiplexing, allowing many concurrent streams over a single connection
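
For example, a Python client could use the httpx library, which supports HTTP/2 when the optional h2 dependency is installed. The endpoint and request body reuse the placeholders from the earlier sketch, and HTTP/2 must also be enabled on the server.

import httpx

client = httpx.Client(http2=True)  # pip install "httpx[h2]"
with client.stream(
    "POST",
    f"{BASE}/subscribe",
    json={"publisherId": publisher_id, "query": {"measures": ["Value.SUM"]}},
    auth=AUTH,
) as response:
    for chunk in response.iter_bytes():
        ...  # feed chunks to an Arrow stream reader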