Continuous GAQ REST API
What is the Continuous GAQ REST API?
The Continuous GAQ (GetAggregatesQuery) REST API is an experimental API that allows applications to run queries against the cube and receive continuous streaming updates as data changes. Results are delivered in Apache Arrow format.
This API is experimental and may change in future versions without notice.
Why use continuous queries?
Continuous queries provide:
- Real-time updates: Receive data changes as they occur without polling
- Incremental updates: Only affected cells are transmitted, not the entire result set
- Persistent connections: Maintain a single connection for multiple queries
Prerequisites
Before using this API:
- Authentication: All endpoints require authenticated users
- GAQ familiarity: Understanding of GetAggregatesQuery concepts (measures, levels, coordinates)
- Apache Arrow: Client-side Arrow deserialization capability
Overview
The Continuous GAQ REST API provides endpoints for creating publishers, subscribing to query streams, and managing continuous query subscriptions.
Key features
- Apache Arrow streaming: Results are streamed in `application/vnd.apache.arrow.stream` format
- Publisher-subscriber pattern: Publishers group queries with shared lifecycle management
- User-based authentication: All operations require authenticated users
- Query lifecycle management: Create publisher, subscribe, unsubscribe, and stop publishers
Usage workflow
A typical workflow for using the Continuous GAQ REST API:
- Create a publisher using `/publisher/create` to obtain a publisher ID (the full workflow is sketched after this list)
- Subscribe to queries using `/subscribe` with the publisher ID and query request
- Receive streaming updates in Apache Arrow format as data changes
- Unsubscribe from specific queries when no longer needed using `/unsubscribe`
- Stop the publisher when all queries are complete using `/publisher/stop`
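A minimal end-to-end sketch of this workflow in Python, assuming the `requests` library and HTTP Basic authentication. The JSON field names (`publisherId`, `query`, `queryId`) and the query payload shape are illustrative only; consult the OpenAPI documentation for the actual request and response schemas:

```python
import requests

BASE = "https://my-server/activeviam/pivot/rest/v10/cube/MyCube/queries/continuous-gaq"
session = requests.Session()
session.auth = ("user", "password")  # all endpoints require an authenticated user

# 1. Create a publisher and keep its ID for all later calls.
publisher_id = session.post(f"{BASE}/publisher/create").json()["publisherId"]

# 2. Subscribe to a GAQ query; the response body is a continuous Arrow stream.
gaq_query = {"measures": ["Value.SUM"], "levels": ["Currency"]}  # illustrative shape only
response = session.post(
    f"{BASE}/subscribe",
    json={"publisherId": publisher_id, "query": gaq_query},
    stream=True,  # keep the connection open: updates arrive as data changes
)

# 3. Read Arrow IPC messages from response.raw (see "Reading results" below).

# 4. Unsubscribe a single query, then stop the publisher when all queries are done.
session.delete(f"{BASE}/unsubscribe", json={"publisherId": publisher_id, "queryId": "q1"})
session.delete(f"{BASE}/publisher/stop", json={"publisherId": publisher_id})
```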
API endpoints
All endpoints are available under the base path:
`/activeviam/pivot/rest/v10/cube/{cubeName}/queries/continuous-gaq`
For detailed endpoint specifications, request/response schemas, and examples, see the Continuous GAQ Query REST API in the OpenAPI documentation.
Create publisher
Creates a new publisher for continuous GAQ streams. Clients must use the returned publisher ID to subscribe to GAQ streams.
Endpoint: `POST /publisher/create`
Response: JSON containing the publisher ID
Subscribe to GAQ
Subscribes to a GAQ query and returns streaming results in Apache Arrow format.
The publisher ID identifies the publisher object where the streamed query results are written. Multiple queries can share one publisher. This groups related queries under the same lifecycle.
Endpoint: `POST /subscribe`
Response: Streaming HTTP response body in Apache Arrow format (`application/vnd.apache.arrow.stream`)
Batch Size Configuration: The number of rows per batch when streaming GAQ results can be configured using the `activeviam.gaq.arrow.batchsize` JVM property, as in the example below.
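For example, the property can be passed as a JVM argument at application startup; the value 10000 and the jar name below are illustrative only:

```
java -Dactiveviam.gaq.arrow.batchsize=10000 -jar my-application.jar
```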
Unsubscribe from GAQ
Unsubscribes from a specific query on a publisher.
Endpoint: `DELETE /unsubscribe`
Stop publisher
Stops a publisher and unsubscribes all queries associated with it.
Endpoint: `DELETE /publisher/stop`
Authentication
All endpoints require authentication. The API uses the current authenticated user's credentials to execute queries and manage subscriptions. If the current thread is not authenticated, a `ForbiddenAccessException` is thrown.
Streaming result format
Results are streamed using the Apache Arrow IPC (Inter-Process Communication) format. The stream consists of multiple Arrow record batches, each representing either query results or failure notifications.
Apache Arrow streaming format
The API uses Apache Arrow's streaming format (`application/vnd.apache.arrow.stream`).
For more information about Apache Arrow, see the official documentation.
Stream structure
The continuous GAQ stream delivers results progressively as Arrow IPC messages, each carrying one record batch.
Arrow record batch structure
Each Arrow IPC message contains a record batch with two parts, described below: schema metadata and data columns.
Schema metadata (custom metadata)
Metadata embedded in the Arrow schema provides context about the batch:
| Field | Description |
|---|---|
| `publisherId` | Publisher identifier |
| `queryId` | Unique query identifier |
| `branchId` | Cube branch name |
| `epochId` | Query execution epoch |
| `updateId` | Update sequence number (0, 1, 2...) |
| `totalUpdates` | Total number of updates in this batch |
| `cellSetType` | ADDED / REMOVED / FULL_REFRESH / EMPTY |
| `chunkId` | Chunk number within the update |
| `totalChunks` | Total chunks for this update |
Cell set types:
- `ADDED`: Cells added in this update (initial results or incremental additions)
- `REMOVED`: Cells removed in this update
- `FULL_REFRESH`: Complete result set (replaces all previous results)
- `EMPTY`: Empty result set (no data)
Data columns
The record batch contains columns for:
- Level columns: Level members
- Measure columns: Aggregated values
Example Arrow Record Batch:
| Currency (UTF-8) | Value (Float64) |
|---|---|
| "USD" | 44.0 |
| "EUR" | 52.0 |
| "GBP" | 18.0 |
With Schema Metadata:
```
publisherId: "pub-1"
queryId: "query-123"
cellSetType: ADDED
updateId: 0
totalUpdates: 1
```
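For illustration only, here is how the example batch above could be built with `pyarrow`, showing how the level column, the measure column, and the custom schema metadata fit together (the field names come from the example; everything else is an assumption of this sketch):

```python
import pyarrow as pa

# Schema: one level column, one measure column, plus custom key/value metadata.
schema = pa.schema(
    [pa.field("Currency", pa.utf8()), pa.field("Value", pa.float64())],
    metadata={"publisherId": "pub-1", "queryId": "query-123",
              "cellSetType": "ADDED", "updateId": "0", "totalUpdates": "1"},
)

# One record batch with three cells, matching the example table.
batch = pa.record_batch(
    [pa.array(["USD", "EUR", "GBP"]), pa.array([44.0, 52.0, 18.0])],
    schema=schema,
)
```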
Reading results
To read the continuous Arrow stream (a Python sketch follows these steps):
- Connect to the streaming endpoint: establish an HTTP connection to `/subscribe`
- Read Arrow IPC messages sequentially: each message contains one record batch
- Extract schema metadata: read custom metadata from the schema to get:
  - `publisherId`: Publisher identifier
  - `queryId`: Query identifier for this specific query
  - `epochId`: Epoch number (increments with each update)
  - `cellSetType`: Type of update (ADDED, REMOVED, FULL_REFRESH, EMPTY)
  - `updateId` and `totalUpdates`: Position in multi-part updates
  - `chunkId` and `totalChunks`: Position in chunked results
  - `error`: Failure event information (if present)
- Check for failure events: if the `error` key exists in the metadata, handle the failure
- Extract data columns: read level and measure vectors from the record batch
- Process based on `cellSetType`:
  - `ADDED`: Merge new cells into the result set or update existing cells
  - `REMOVED`: Remove cells from the result set
  - `FULL_REFRESH`: Replace the entire result set with new data
  - `EMPTY`: No data in this batch
- Reassemble chunked results: combine chunks with the same `updateId` using `chunkId`
- Continue reading: process subsequent messages as they arrive
- Handle completion: the connection closes when the publisher is stopped
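A minimal reading loop, assuming the `pyarrow` library and the streaming `response` obtained from `/subscribe` in the earlier workflow sketch. The metadata keys follow the table above; error handling and result assembly are deliberately simplified:

```python
import pyarrow.ipc

# Open an Arrow IPC stream reader over the raw HTTP response body.
reader = pyarrow.ipc.open_stream(response.raw)

for batch in reader:
    # Custom schema metadata is exposed as a bytes-to-bytes mapping.
    meta = {k.decode(): v.decode() for k, v in (batch.schema.metadata or {}).items()}

    if "error" in meta:  # a failure event was sent instead of query results
        raise RuntimeError(meta["error"])

    rows = batch.to_pylist()  # one dict per cell: level columns + measure columns
    apply_update(meta["cellSetType"], rows)  # see the processing sketch further down
```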
Chunking for large results
When a single update contains many cells, it is split into multiple chunks based on the configured batch size (the `activeviam.gaq.arrow.batchsize` JVM property). Each chunk arrives as its own record batch; the `chunkId` and `totalChunks` metadata fields let clients reassemble the complete update, as sketched below.
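A sketch of chunk reassembly, assuming the `meta` and `rows` values extracted per batch as in the reading loop above, and assuming `chunkId` is 0-based (as `updateId` is). Chunks belonging to the same update are buffered until the last one arrives:

```python
from collections import defaultdict

pending: dict[tuple, list] = defaultdict(list)  # (queryId, epochId, updateId) -> rows

def on_chunk(meta: dict, rows: list) -> list | None:
    """Buffer a chunk; return the complete update once all chunks have arrived."""
    key = (meta["queryId"], meta["epochId"], meta["updateId"])
    pending[key].extend(rows)
    if int(meta["chunkId"]) == int(meta["totalChunks"]) - 1:  # last chunk received
        return pending.pop(key)  # the complete update, ready to apply
    return None
```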
Multiple update types in one result
A single continuous GAQ result can contain multiple update types (for example, both ADDED and REMOVED cells). Each update type is delivered as a separate record batch, with `updateId` and `totalUpdates` indicating its position within the multi-part update.
Failure event format
When a query execution fails, a failure event is sent instead of query results.
The failure event metadata contains:
- `errorClass`: The type of error that occurred
- `message`: Error message
- `stackTrace`: Full stack trace of the error
- `queryId`: The query that failed
- `streamId`: The stream identifier
Example failure detection:
Schema Custom Metadata:
error: "FailureEvent{streamId=pub-123, queryId=query-456,
type=QueryExecutionException,
message=Query timeout exceeded}"
Complete example: data update flow
Step 1: Initial Query Result (subscription starts)
- Metadata: `queryId="q1"`, `cellSetType=ADDED`, `epochId=1`
- Data: USD=44.0, EUR=52.0

Step 2: Data Change Event (new USD trade worth 100.0 added)
- Metadata: `queryId="q1"`, `cellSetType=ADDED`, `epochId=2`
- Data: USD=144.0 (new aggregate value: 44+100)
- Note: Only the affected cell (USD) is sent, not the entire result set

Step 3: Multiple Update (both additions and removals)
- Batch 3a: `updateId=0`, `totalUpdates=2`, `cellSetType=REMOVED`, `epochId=3` (EUR removed)
- Batch 3b: `updateId=1`, `totalUpdates=2`, `cellSetType=ADDED`, `epochId=3` (GBP=30.0 added)

Step 4: Full Refresh (complete result set replacement)
- Metadata: `queryId="q1"`, `cellSetType=FULL_REFRESH`, `epochId=4`
- Data: USD=200.0, GBP=45.0, JPY=88.0 (complete current state)

A sketch of applying these updates on the client side follows.
Connection limitations
When using HTTP/1.1, browsers and HTTP clients typically limit the number of simultaneous open connections to the same host; most browsers cap this at 6 concurrent connections per origin.
Since each active publisher maintains a persistent streaming connection, this means:
- A maximum of 6 publishers can run simultaneously in a browser environment
- Once the limit is reached, additional subscription requests will be queued or blocked
- Stopping a publisher releases its connection, allowing new publishers to be created
Recommendations:
- Reuse publishers: Share publishers across related queries instead of creating multiple publishers
- Stop unused publishers: Call `/publisher/stop` when queries are no longer needed
- Use HTTP/2: HTTP/2 supports multiplexing, allowing many concurrent streams over a single connection (see the sketch below)
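A minimal sketch of an HTTP/2 subscription, assuming the `httpx` library with its `h2` extra installed; `BASE`, `publisher_id`, and `gaq_query` are the illustrative values from the workflow sketch above:

```python
import httpx

# A single HTTP/2 connection multiplexes many streams, so the 6-connection
# per-host limit no longer caps the number of concurrent publishers.
client = httpx.Client(http2=True, auth=("user", "password"))

with client.stream(
    "POST",
    f"{BASE}/subscribe",
    json={"publisherId": publisher_id, "query": gaq_query},
) as response:
    for chunk in response.iter_bytes():
        ...  # feed the bytes into an Arrow IPC stream reader
```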