> ## Documentation Index
> Fetch the complete documentation index at: https://docs.activeviam.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Database Queries

This guide shows how to build and use the database queries.

## Get-by-key queries

### Run a simple get-by-key query

The get-by-key query retrieves the record corresponding to a key.<br />
For example, given a table `trades` where rows are indexed by their trade id, the following retrieves the trader that
made the trade `1023`:

```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
final GetByKeyQuery query =
    databaseVersion
        .getQueryManager()
        .getByKeyQuery()
        .forTable("trades")
        .withTableFields("trader")
        .toQuery();
final IRecordReader recordReader =
    databaseVersion.getQueryRunner().getByKeyQuery(query).withKey(1023).run();
if (recordReader != null) {
  final String trader = (String) recordReader.read(0);
}
```

If the table has multiple key fields (e.g. `year`, `month`, and `day`), key values can be passed either as variadic
arguments or as a list:

```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
databaseVersion.getQueryRunner().getByKeyQuery(query).withKey(1998, 11, 21).run();
databaseVersion.getQueryRunner().getByKeyQuery(query).withKey(List.of(1998, 11, 21)).run();
```

The list of key fields for a given store is available via `IDataTable#getKeyFieldNames()`. Key values should be passed
to `withKey()` in this exact order.

### Compile once, run twice

If a query will be run several times, it is better to compile it once:

```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
final IPreparedGetByKeyQuery query =
    database
        .getQueryManager()
        .getByKeyQuery()
        .forTable("trader")
        .withTableFields("trader")
        .compile();
```

Then run it multiple times:

```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
for (int i = 0; i < 10; i++) {
  databaseVersion.getQueryRunner().getByKeyQuery(query).withKey(i).run();
}
```

### Use a get-by-key query

The get-by-key query returns the result as an `IRecordReader`.
For example, the following service retrieves the trader name for a trade id:

```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
private static class TraderNameService {
  IDatabase database;
  IPreparedGetByKeyQuery traderQuery;
  public TraderNameService(final IDatabase database) {
    this.database = database;
    this.traderQuery =
        database
            .getQueryManager()
            .getByKeyQuery()
            .forTable("trader")
            .withTableFields("trader")
            .compile();
  }
  public String getTraderName(final int tradeId) {
    final IRecordReader recordReader =
        database
            .getHead(IEpoch.MASTER_BRANCH_NAME)
            .getQueryRunner()
            .getByKeyQuery(this.traderQuery)
            .withKey(tradeId)
            .run();
    return (String) recordReader.read(0);
  }
}
```

### Get-by-key and DirectQuery

While Get-by-key queries performed on the datastore are very fast, they can take a long time on external databases,
solely due to the networking time. As running too many get-by-key queries through DirectQuery can be a performance
issue, there is a setting to control whether this is allowed. The possible behaviors are **denying**, **warning**, or
**allowing** the query to run, and the default is to warn.

For standard size tables, it is recommended to load this table into an in-memory datastore table, as mentioned in the
[composite database documentation](../../directquery/split-in-memory-directquery#composite-directquery-and-in-memory-database).
Get-by-key queries would then be directed to the in-memory table, instead of the external database.

Configure this behavior in the database settings when creating the application.

<Tabs>
  <Tab title="BigQuery">
    ```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
    final BigqueryDatabaseSettings databaseSettings =
        BigqueryDatabaseSettings.builder().getByKeyBehavior(GetByKeyBehavior.ALLOW).build();
    ```
  </Tab>

  <Tab title="ClickHouse">
    ```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
    final ClickhouseDatabaseSettings databaseSettings =
        ClickhouseDatabaseSettings.builder().getByKeyBehavior(GetByKeyBehavior.ALLOW).build();
    ```
  </Tab>

  <Tab title="Databricks">
    ```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
    final DatabricksDatabaseSettings databaseSettings =
        DatabricksDatabaseSettings.builder().getByKeyBehavior(GetByKeyBehavior.ALLOW).build();
    ```
  </Tab>

  <Tab title="Generic JDBC">
    ```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
    final GenericJdbcDatabaseSettings databaseSettings =
        GenericJdbcDatabaseSettings.builder().getByKeyBehavior(GetByKeyBehavior.ALLOW).build();
    ```
  </Tab>

  <Tab title="MS SQL">
    ```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
    final MsSqlDatabaseSettings databaseSettings =
        MsSqlDatabaseSettings.builder().getByKeyBehavior(GetByKeyBehavior.ALLOW).build();
    ```
  </Tab>

  <Tab title="Redshift">
    ```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
    final RedshiftDatabaseSettings databaseSettings =
        RedshiftDatabaseSettings.builder().getByKeyBehavior(GetByKeyBehavior.ALLOW).build();
    ```
  </Tab>

  <Tab title="Snowflake">
    ```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
    final SnowflakeDatabaseSettings databaseSettings =
        SnowflakeDatabaseSettings.builder().getByKeyBehavior(GetByKeyBehavior.ALLOW).build();
    ```
  </Tab>

  <Tab title="Synapse">
    ```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
    final SynapseDatabaseSettings databaseSettings =
        SynapseDatabaseSettings.builder().getByKeyBehavior(GetByKeyBehavior.ALLOW).build();
    ```
  </Tab>
</Tabs>

```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
final Application app =
    Application.builder(connector)
        .schema(schema)
        .managerDescription(managerDescription)
        .databaseSettings(databaseSettings)
        .build();
```

## List queries

### Minimal list query

The minimal list query requires a table name and retrieves all records:

```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
final IPreparedListQuery query =
    database
        .getQueryManager()
        .listQuery() // the builder guides you and impose the steps order
        .forTable("trades")
        .withoutCondition()
        .selectingAllTableFields()
        .compile();
```

To run the query, use the Query Runner of the database version:

```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
final ICursor result =
    databaseVersion.getQueryRunner().listQuery(preparedListQuery).withoutParameters().run();
```

### Add a condition

The list query can be filtered using a condition builder.

The following retrieves the trades made by John:

```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
final IPreparedListQuery preparedListQuery =
    database
        .getQueryManager()
        .listQuery() // the builder guides you and impose the steps order
        .forTable("trades")
        .withCondition(BaseConditions.equal(FieldPath.of("trader"), "John"))
        .selectingAllTableFields()
        .compile();
```

#### Dynamic conditions

The condition can be parameterized at query compilation. The parameter values are specified at query execution.

The following retrieves the trades made by Paul with that mechanism:

```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
final IPreparedListQuery preparedListQuery =
    database
        .getQueryManager()
        .listQuery() // the builder guides you and impose the steps order
        .forTable("trades")
        .withCondition(BaseConditions.equal(FieldPath.of("trader")).as("parameter"))
        .selectingAllTableFields()
        .compile();
final ICursor result =
    databaseVersion
        .getQueryRunner()
        .listQuery(preparedListQuery)
        .withParameters(Map.of("parameter", "Paul"))
        .run();
```

### Select fields

If not all fields are useful, the fields returned by the query can be selected.
There are three ways of doing so.

#### Aliased fields

This is the generic approach.
Fields returned by the query can be aliased, like in SQL queries:

```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
final IPreparedListQuery preparedListQuery =
    database
        .getQueryManager()
        .listQuery() // the builder guides you and impose the steps order
        .forTable("trades")
        .withCondition(BaseConditions.equal(FieldPath.of("trader"), "John"))
        .withAliasedFields(
            AliasedField.create("alias1", FieldPath.of("tradeId")),
            AliasedField.create("alias2", FieldPath.of("ToProduct", "currency")))
        .compile();
final int alias1Index =
    preparedListQuery.getQueryResultFormat().getPlainRecordFormat().getFieldIndex("alias1");
```

#### All fields belong to the queried table

If all the fields come from the queried table, there is a shorter method that will use the field name as an alias:

```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
.withAliasedFields(
    AliasedField.create("tradeId", FieldPath.of("tradeId")),
    AliasedField.create("date", FieldPath.of("date")))
```

This can be replaced by:

```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
.withTableFields("tradeId", "date")
```

#### Fields without an alias

If only the field position in the results matters, and not the alias, the following is also possible:

```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
final IPreparedListQuery preparedListQuery =
    database
        .getQueryManager()
        .listQuery() // the builder guides you and impose the steps order
        .forTable("trades")
        .withCondition(BaseConditions.equal(FieldPath.of("trader"), "John"))
        .withFieldsWithoutAlias(FieldPath.of("tradeId"), FieldPath.of("ToProduct", "currency"))
        .compile();
```

### Add a timeout or a result limit

For performance reasons, the query can be restricted by adding a result limit and/or a timeout.
The timeout will cancel the query if reached and an exception will be thrown.
The result limit sets the maximum number of rows that will be returned.
Note that the rows returned when using a result limit are not deterministic: successive executions of the same query may
return different rows, even if no transaction occurred in the database.

```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
final IPreparedListQuery preparedListQuery =
    database
        .getQueryManager()
        .listQuery() // the builder guides you and impose the steps order
        .forTable("trades")
        .withCondition(BaseConditions.equal(FieldPath.of("trader"), "John"))
        .withTableFields("tradeId", "date")
        .withResultsLimit(1000)
        .compile();
final ICursor result =
    databaseVersion
        .getQueryRunner()
        .listQuery(preparedListQuery)
        .withoutParameters()
        .withTimeout(Duration.ofSeconds(10L))
        .run();
```

### Add query tags

Each query can have a map of tags to help monitor the application.
For example, the following adds the name of the operation:

```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
final IPreparedListQuery preparedListQuery =
    database
        .getQueryManager()
        .listQuery() // the builder guides you and impose the steps order
        .forTable("trades")
        .withCondition(BaseConditions.equal(FieldPath.of("trader"), "Paul"))
        .withTableFields("tradeId", "date")
        .withQueryTags(Map.of("operation", "trade-inspection"))
        .compile();
final ICursor result =
    databaseVersion.getQueryRunner().listQuery(preparedListQuery).withoutParameters().run();
```

### Use a list query

The list query returns an `ICursor` that can be used to manipulate every record.
It is important to read through the `ICursor` within a try-with-resources statement to ensure that all related resources
are freed at the end of the manipulation.

For example, the following retrieves the sum of "value" by date:

```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
public Map<Object, Double> getValueByDates() {
  final IPreparedListQuery query =
      database
          .getQueryManager()
          .listQuery()
          .forTable("trades")
          .withCondition(BaseConditions.in(FieldPath.of("City")).as("City"))
          .withTableFields("date", "value")
          .compile();
  final ICursor result =
      databaseVersion
          .getQueryRunner()
          .listQuery(query)
          .withParameters(Map.of("City", Set.of("Paris", "London")))
          .run();
  final Map<Object, Double> valuesByDates = new HashMap<>();
  try (result) {
    for (final IRecordReader r : result) {
      final Object date = r.read(0);
      final double value = r.readDouble(1);
      valuesByDates.merge(date, value, Double::sum);
    }
  }
  return valuesByDates;
}
```

### Run with an acceptor

By default, calling `.run()` blocks the calling thread until the full result is available as a single `ICursor`.
An **acceptor** provides a way to process results as they are produced, partition by partition, without waiting for the
entire result set.

This is useful when:

* The result processing should happen in parallel.
* Fine-grained control over how partition results are merged or consumed is needed.

The acceptor implements `IPartitionedResultAcceptor`. Its `onResult(int partitionId, ICursor result)` method is called
once per partition as soon as that partition's results are ready.

<Warning>
  `onResult` may be called concurrently from different threads (one per partition). Implementations must be thread-safe.

  The `ICursor` passed to `onResult` is closed after the call returns. To retain records beyond the scope of `onResult`,
  materialize them using one of these methods:

  * `IRecordReader#copy()` or `IRecordReader#toTuple()` for individual records.
  * `ICursor#toRecords()` or `ICursor#toTuples()` for all records at once.
</Warning>

```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
final Map<Integer, List<Object[]>> resultsByPartition = new ConcurrentHashMap<>();
final AtomicReference<Throwable> error = new AtomicReference<>();
final ICompletionSync sync =
    databaseVersion
        .getQueryRunner()
        .listQuery(query)
        .withoutParameters()
        .withAcceptor(
            new IPartitionedResultAcceptor() {
              @Override
              public void onResult(final int partitionId, final ICursor result) {
                resultsByPartition.put(partitionId, result.toTuples());
              }
              @Override
              public void complete() {
                // Called once all partitions have been processed
              }
              @Override
              public void completeExceptionally(final Throwable ex) {
                error.compareAndSet(null, ex);
              }
            })
        .run();
sync.awaitCompletion(Duration.ofSeconds(10));
if (error.get() != null) {
  throw new ActiveViamRuntimeException("An error occurred in the acceptor.", error.get());
}
```

The `run()` method returns an `ICompletionSync` that can be used to wait for the query to finish.

## Distinct queries

The distinct query retrieves all distinct values for a field or a field combination from a record list.
The record list can be filtered with specified conditions.

Its characteristics are very similar to list query ones. It returns an `ICursor`, and provides the following options:

* Field aliasing
* Limit
* Timeout
* Query tags

This sample query extracts all the counterparties John has traded with:

```java theme={"languages":{"custom":["/engine/python-sdk/0.9/languages/pycon.tmLanguage.json"]}}
final IPreparedDistinctQuery preparedDistinctQuery =
    database
        .getQueryManager()
        .distinctQuery() // the builder guides you and impose the steps order
        .forTable("trades")
        .withCondition(BaseConditions.equal(FieldPath.of("trader"), "John"))
        .withTableFields("counterParty")
        .withResultsLimit(1000)
        .withQueryTags(Map.of("operation", "trade-inspection"))
        .compile();
final ICursor result =
    databaseVersion
        .getQueryRunner()
        .distinctQuery(preparedDistinctQuery)
        .withoutParameters()
        .withTimeout(Duration.ofSeconds(10L))
        .run();
```

### Note on joined projection fields

Projection fields belonging to a table other than the base table are joined with a left join.
This implies that default or null values can be present in the distinct result.
