# Datastore Configuration

The datastore consists of a collection of *stores* that can be linked together through *references*. This page
introduces the concepts required when working with the datastore and walks through setting up a basic datastore
from a datastore description.

## Stores

A store is a collection of records that share the same attributes, called the *fields* (or *columns*) of the store.

In relational database terms, a store is comparable to a table, and its records to rows.

### Store Description

All stores are built according to a store description, represented by the interface `IStoreDescription`.

The store name cannot be empty or contain the datastore schema separator, which is set by the ActiveViam property
`activeviam.datastore.schema.separator` and is equal to `/` by default.
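
The naming rule can be checked before a description is built. The following is a minimal sketch in plain Java, not the library's own validation code: `StoreNameCheck` and `isValidName` are hypothetical names, and reading the separator as a JVM system property is an assumption.

```java
final class StoreNameCheck {
  // Assumption: the separator is exposed as a JVM system property.
  // "/" is the documented default value.
  static final String SEPARATOR =
      System.getProperty("activeviam.datastore.schema.separator", "/");

  /** Returns true if the name is non-empty and does not contain the separator. */
  static boolean isValidName(String name, String separator) {
    return name != null && !name.isEmpty() && !name.contains(separator);
  }
}
```

With the default separator, `isValidName("Product", "/")` is accepted while `isValidName("Retail/Product", "/")` is rejected.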

The recommended way to build a store description is to use the builder interface returned by `StartBuilding.store()`.
The builder provides a fluent API to specify the field structure of a store.

Alternatively, `StoreDescription.simpleBuilder()` offers a non-staged builder, and the `StoreDescription.create(...)`
static factory takes the full list of parameters directly.

The following snippet defines the description of a *Product* store including information about products sold by a
retailer:

```java
final IStoreDescription productStoreDescription =
    StartBuilding.store()
        .withStoreName("Product")
        .withField("id", ILiteralType.LONG)
        .asKeyField()
        .withField("category", ILiteralType.STRING, "uncategorized")
        .withNullableField("name", ILiteralType.STRING)
        .withVectorField("priceHistory", ILiteralType.DOUBLE)
        .build();
```

### Field Specification

Store fields are specified by:

* a name, which cannot be empty or contain the datastore schema separator (see store name)
* a data type, from the values contained in the `ILiteralType` interface

Both scalar types and vector types are supported as field types.

```java
.withField("id", ILiteralType.LONG)
.withVectorField("priceHistory", ILiteralType.DOUBLE)
```

#### Default Values

A default value may be specified during field declaration. Otherwise, a default value is selected based on the data
type: `0` for numerical types and the string `"N/A"` for most object types.

A field may also be declared *nullable*, meaning that no default value is used for the field, accepting `null` as a
value instead.

```java
.withField("category", ILiteralType.STRING, "uncategorized")
.withNullableField("name", ILiteralType.STRING)
```

#### Key Fields

Any number of fields in a store can be flagged as *key fields* through:

```java
.withField("id", ILiteralType.LONG)
.asKeyField()
```

The set of key fields in a store should uniquely identify each record in the store. A key field can be made nullable.

A keyless store can be built by calling `.withoutKey()` after all fields have been specified. Because such stores have
no key, they support only a restricted set of operations.

#### Dictionarization

A field can be marked as *dictionarized* by calling `.dictionarized()` right after declaring it. Distinct values are
then stored once in a dictionary, and each record only holds the integer position of its value in that dictionary.
Dictionarization typically reduces memory usage for low-cardinality fields (such as a currency or a category) and should
be avoided on high-cardinality fields (such as a unique identifier) where the dictionary itself can become larger than
the raw column.

Key fields and indexed fields are always dictionarized, so `.dictionarized()` only needs to be used on non-key,
non-indexed fields.
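
To make the memory trade-off concrete, here is a toy dictionary-encoded column in plain Java. It is only an illustration of the principle described above, not the datastore's implementation; `DictionarizedColumn` and its methods are hypothetical, and a real column would store positions in a primitive `int` array rather than a boxed list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy column: each distinct value is stored once, rows only hold its position. */
final class DictionarizedColumn {
  private final Map<String, Integer> positions = new HashMap<>();
  private final List<String> dictionary = new ArrayList<>();
  private final List<Integer> rows = new ArrayList<>();

  /** Appends a row, adding the value to the dictionary if it is new. */
  void add(String value) {
    Integer position = positions.get(value);
    if (position == null) {
      position = dictionary.size();
      positions.put(value, position);
      dictionary.add(value);
    }
    rows.add(position);
  }

  /** Decodes the value stored at the given row. */
  String get(int row) {
    return dictionary.get(rows.get(row));
  }

  int dictionarySize() {
    return dictionary.size();
  }

  int rowCount() {
    return rows.size();
  }
}
```

With a low-cardinality field the dictionary stays small however many rows are added. With a unique identifier, the dictionary grows as fast as the column itself and only adds overhead.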

### Write Behaviors

Beyond the structure of a store, the description builder exposes a few options that affect how the store handles records
at write time.

#### Duplicate key handling

When two records with the same key are added during the same transaction, the store's *duplicate key handler* decides
what to do. It can be configured via `.withDuplicateKeyHandler(...)` in the builder, using a custom
`IDuplicateKeyHandler` or one of the handlers exposed by
`com.activeviam.database.datastore.api.description.impl.DuplicateKeyHandlers`:

* `ALWAYS_UPDATE`: keep the last received record (the default behavior).
* `LOG_WITHIN_TRANSACTION`: discard subsequent duplicates within the same transaction and log each occurrence.
* `THROW_WITHIN_TRANSACTION`: throw a `DuplicateKeyException` on subsequent duplicates within the same transaction.
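
The difference between the three handlers can be simulated in a few lines of plain Java. This sketch only mimics the semantics listed above and does not use the datastore API: `DuplicateKeyDemo`, its `Policy` enum and `applyTransaction` are hypothetical, and `IllegalStateException` stands in for `DuplicateKeyException`.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DuplicateKeyDemo {
  /** Mirrors the names of the three built-in handlers. */
  enum Policy { ALWAYS_UPDATE, LOG_WITHIN_TRANSACTION, THROW_WITHIN_TRANSACTION }

  /** Applies the (key, value) pairs of one transaction and returns the surviving records. */
  static Map<Long, String> applyTransaction(
      Policy policy, List<Map.Entry<Long, String>> records, List<String> log) {
    final Map<Long, String> pending = new LinkedHashMap<>();
    for (Map.Entry<Long, String> record : records) {
      final boolean duplicate = pending.containsKey(record.getKey());
      if (!duplicate || policy == Policy.ALWAYS_UPDATE) {
        // New key, or the last received record wins.
        pending.put(record.getKey(), record.getValue());
      } else if (policy == Policy.LOG_WITHIN_TRANSACTION) {
        // The first record is kept and the duplicate is reported.
        log.add("Duplicate key ignored: " + record.getKey());
      } else {
        // Stand-in for DuplicateKeyException.
        throw new IllegalStateException("Duplicate key: " + record.getKey());
      }
    }
    return pending;
  }
}
```

Feeding the same key twice keeps the second value under `ALWAYS_UPDATE`, keeps the first value and records one log entry under `LOG_WITHIN_TRANSACTION`, and fails under `THROW_WITHIN_TRANSACTION`.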

#### Remove-unknown-key behavior

When a remove operation targets a key that does not exist in the store, the store's *remove unknown key listener* is
invoked. It can be configured via `.onRemovingUnknownKey()` on the builder, which exposes four options:

* `.throwException()`: throw an `UnknownKeyException` (the default behavior).
* `.logException()`: log the event and continue.
* `.keepSilent()`: silently ignore the event.
* `.callCustomListener(listener)`: invoke a custom `IRemoveUnknownKeyListener`.
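
The four options all amount to choosing what happens in a callback when the key is missing. The plain-Java sketch below illustrates that idea under stated assumptions: `RemoveUnknownKeyDemo` is hypothetical, a `Consumer<Long>` stands in for `IRemoveUnknownKeyListener`, and `IllegalArgumentException` stands in for `UnknownKeyException`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

final class RemoveUnknownKeyDemo {
  private final Map<Long, String> records = new HashMap<>();
  private final Consumer<Long> onUnknownKey; // stand-in for the listener

  RemoveUnknownKeyDemo(Consumer<Long> onUnknownKey) {
    this.onUnknownKey = onUnknownKey;
  }

  void put(long key, String value) {
    records.put(key, value);
  }

  /** Removes the record if present, otherwise notifies the listener. */
  boolean remove(long key) {
    if (!records.containsKey(key)) {
      onUnknownKey.accept(key);
      return false;
    }
    records.remove(key);
    return true;
  }

  static Consumer<Long> throwException() {
    return key -> {
      throw new IllegalArgumentException("Unknown key: " + key);
    };
  }

  static Consumer<Long> logException() {
    return key -> System.err.println("Unknown key: " + key);
  }

  static Consumer<Long> keepSilent() {
    return key -> {};
  }
}
```

A custom listener is simply any other callback, for instance one that collects the unknown keys for later inspection.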

#### Update only if different

By default, adding a record whose key already exists overwrites the previous record, even when the new record is
identical. Calling `.updateOnlyIfDifferent()` on the builder tells the store to skip the update when the incoming
record matches the stored one field-by-field, which can save significant work on idempotent feeds.
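
The effect on an idempotent feed can be shown with a small plain-Java simulation. It is a sketch of the behavior described above rather than the datastore's code: `UpdateOnlyIfDifferentDemo` and `upsert` are hypothetical names, and a record is modeled as an array of field values.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class UpdateOnlyIfDifferentDemo {
  private final Map<Long, Object[]> records = new HashMap<>();
  private int writes = 0;

  /** Writes the record unless an identical one is already stored under the key. */
  boolean upsert(long key, Object... fields) {
    final Object[] existing = records.get(key);
    // deepEquals compares field by field, including array-valued (vector) fields.
    if (existing != null && Arrays.deepEquals(existing, fields)) {
      return false; // identical record: skip the update
    }
    records.put(key, fields.clone());
    writes++;
    return true;
  }

  /** Number of writes actually performed. */
  int writes() {
    return writes;
  }
}
```

Replaying the same record twice results in a single write; only a record with at least one changed field triggers a second one.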

## References

The different stores within a datastore can be linked together through *references*.

A reference is a mapping from one store (the *owner store*) to another store (the *target store*), associating one or
more fields of the former with fields of the latter. Each reference defines a many-to-one relationship between two stores.
References are used at query time to relate information from different stores.

A datastore generally has a multitude of references, forming a directed graph of stores.
The store graph cannot contain any cycle: the schema of stores can also be referred to as a star schema, as it has a
tree-like structure.
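
The no-cycle rule can be checked on a schema design before building anything. The following plain-Java sketch runs a depth-first search over a graph of store names; it is an illustration only, and `StoreGraph`, `addReference` and `hasCycle` are hypothetical names that are not part of the datastore API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class StoreGraph {
  // Owner store name -> names of the stores it references.
  private final Map<String, List<String>> targets = new HashMap<>();

  void addReference(String owner, String target) {
    targets.computeIfAbsent(owner, k -> new ArrayList<>()).add(target);
    targets.computeIfAbsent(target, k -> new ArrayList<>());
  }

  /** Returns true if following references can lead back to a store already on the path. */
  boolean hasCycle() {
    final Set<String> done = new HashSet<>();
    final Set<String> inProgress = new HashSet<>();
    for (String store : targets.keySet()) {
      if (visit(store, done, inProgress)) {
        return true;
      }
    }
    return false;
  }

  private boolean visit(String store, Set<String> done, Set<String> inProgress) {
    if (done.contains(store)) {
      return false; // already fully explored
    }
    if (!inProgress.add(store)) {
      return true; // store is on the current path: cycle found
    }
    for (String target : targets.get(store)) {
      if (visit(target, done, inProgress)) {
        return true;
      }
    }
    inProgress.remove(store);
    done.add(store);
    return false;
  }
}
```

A chain such as Sale to Product to Category passes the check, while adding a reference from Category back to Sale makes `hasCycle()` return `true`.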

A store can reference another one multiple times, as long as the referenced fields are different for the multiple
references.

### Constraints

When a reference is created between two stores, a **uniqueness constraint** is added to the group of fields used by the
reference in the target store. As a result, a record of the owner store cannot point to multiple records of the target
store. This is a consequence of the many-to-one relationship represented by the reference.

For example, given two stores A and B:

* **Store A**

  | TradeId | Value |
  | ------- | ----- |
  | Trade1  | 10    |
  | Trade2  | 20    |

* **Store B**

  | TradeId | LegType | Book  |
  | ------- | ------- | ----- |
  | Trade1  | N/A     | Book1 |
  | Trade2  | N/A     | Book2 |
  | Trade3  | Pay     | Book3 |
  | Trade3  | Recv    | Book3 |

If a reference is defined from Store A's *TradeId* field to Store B's *TradeId* field, a
`KeyConstraintViolationException` is raised when Store B is loaded, because two records in Store B share the same
TradeId (`Trade3`).
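
Such violations can be detected in the source data before loading. The sketch below, in plain Java, lists the values that appear more than once in the referenced field of the target store. It does not reproduce the datastore's own check; `ReferenceConstraintCheck` and `duplicatedTargetKeys` are hypothetical names.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ReferenceConstraintCheck {
  /** Returns the values that occur more than once in the referenced target field. */
  static Set<String> duplicatedTargetKeys(List<String> targetFieldValues) {
    final Set<String> seen = new HashSet<>();
    final Set<String> duplicates = new HashSet<>();
    for (String value : targetFieldValues) {
      if (!seen.add(value)) {
        duplicates.add(value); // second occurrence: uniqueness is broken
      }
    }
    return duplicates;
  }
}
```

Applied to the *TradeId* column of Store B, the check returns `Trade3`, the value that breaks the many-to-one relationship.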

### Reference Description

Similarly to stores, datastore references are built according to a reference description, represented by the interface
`IReferenceDescription`.

The recommended way to build a reference description is through the fluent builder interface available through
`StartBuilding.reference()`.

A fully defined reference description is created like so:

```java
final IReferenceDescription referenceDescription =
    StartBuilding.reference()
        // Owner store: the "many" side of the relationship
        .fromStore("store A")
        // Target store: the "one" side of the relationship
        .toStore("store B")
        .withName("AToB")
        // Each mapping pairs a field of the owner store with a field of the target store
        .withMapping("field from A", "field from B")
        .withMapping("another field from A", "another field from B")
        .build();
```

### Example

In this example, the required datastore consists of two stores:

* **Sale**: records correspond to a product sale.
* **Product**: records describe the products being sold.

```java
final IStoreDescription saleStoreDescription =
    StartBuilding.store()
        .withStoreName("Sale")
        .withField("id", ILiteralType.LONG)
        .asKeyField()
        .withField("productId", ILiteralType.LONG)
        .asKeyField()
        .withField("date", ILiteralType.DATE)
        .asKeyField()
        .withField("quantity", ILiteralType.INT)
        .withField("unitPrice", ILiteralType.DOUBLE)
        .build();
final IStoreDescription priceHistoryStoreDescription =
    StartBuilding.store()
        .withStoreName("Product")
        .withField("id", ILiteralType.LONG)
        .asKeyField()
        .withField("name", ILiteralType.STRING)
        .withField("category", ILiteralType.STRING, "uncategorized")
        .build();
```

The following reference associates the product information from the *Product* store to the sales of the *Sale* store,
using the product ID as mapping.

```java
final IReferenceDescription referenceDescription =
    StartBuilding.reference()
        .fromStore("Sale")
        .toStore("Product")
        .withName("SaleToProduct")
        .withMapping("productId", "id")
        .build();
```
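
To illustrate what a many-to-one reference provides at query time, the plain-Java sketch below follows `productId` from each sale to its single matching product and aggregates quantities per category. It only mimics the lookup and does not use the datastore query API; `SaleToProductLookup`, its records and `quantityByCategory` are hypothetical, and sales without a matching product are simply skipped in this sketch.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class SaleToProductLookup {
  record Sale(long id, long productId, int quantity) {}

  record Product(long id, String name, String category) {}

  /** Sums sold quantities per category by following Sale.productId to Product.id. */
  static Map<String, Integer> quantityByCategory(
      List<Sale> sales, Map<Long, Product> productsById) {
    final Map<String, Integer> totals = new TreeMap<>();
    for (Sale sale : sales) {
      // Many-to-one: a sale matches at most one product.
      final Product product = productsById.get(sale.productId());
      if (product != null) {
        totals.merge(product.category(), sale.quantity(), Integer::sum);
      }
    }
    return totals;
  }
}
```

Because the product ID is unique in the target store, each sale contributes to exactly one category and quantities are never double counted.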

## Datastore Schema

A datastore is built according to a datastore schema, which gathers the store descriptions and the reference
descriptions that compose the datastore.

The recommended way to build a datastore schema is through the fluent builder interface available through
`StartBuilding.datastoreSchema()`:

```java
final IDatastoreSchemaDescription datastoreSchemaDescription =
    StartBuilding.datastoreSchema()
        .withStore(saleStoreDescription)
        .withStore(priceHistoryStoreDescription)
        .withReference(referenceDescription)
        .build();
```

The `.withStore()` and `.withReference()` methods can be chained as many times as needed.

## Creating a Datastore

Once the datastore schema has been created, the datastore can be built using `StartBuilding.application()`:

```java
final var application =
    StartBuilding.application()
        .withDatastore(datastoreSchemaDescription)
        .withoutBranchRestrictions()
        .build();
```

The `IDatastore` interface extends `AutoCloseable`, so a datastore can be torn down by calling its `close()` method.

## Querying a Datastore

Data can be retrieved directly from the datastore using [Database Queries](../../database/database_queries).
