Datastore Configuration
The datastore consists of a collection of stores that can be linked together through references. This page introduces the concepts required when working with the datastore and walks through setting up a basic datastore from a datastore description.
Stores
A store is a collection of records that share the same attributes, called the fields (or columns) of the store.
In relational database terms, a store is comparable to a table, and its records to rows.
Store Description
All stores are built according to a store description, represented by the interface IStoreDescription.
The store name cannot be empty or contain the datastore schema separator, which is set by the ActiveViam property
activeviam.datastore.schema.separator and defaults to /.
The recommended way to build a store description is to use the builder interface returned by StartBuilding.store().
The builder provides a fluent API to specify the field structure of a store.
Alternatively, StoreDescription.simpleBuilder() offers a non-staged builder, and the StoreDescription.create(...)
static factory takes the full list of parameters directly.
The following snippet defines the description of a Product store including information about products sold by a retailer:
final IStoreDescription productStoreDescription =
    StartBuilding.store()
        .withStoreName("Product")
        .withField("id", ILiteralType.LONG)
        .asKeyField()
        .withField("category", ILiteralType.STRING, "uncategorized")
        .withNullableField("name", ILiteralType.STRING)
        .withVectorField("priceHistory", ILiteralType.DOUBLE)
        .build();
Field Specification
Store fields are specified by:
- a name, which cannot be empty or contain the datastore schema separator (see store name)
- a data type, taken from the values defined in the ILiteralType interface
Both scalar types and vector types are supported as field types.
.withField("id", ILiteralType.LONG)
.withVectorField("priceHistory", ILiteralType.DOUBLE)
Default Values
A default value may be specified during field declaration. Otherwise, a default value is selected based on the data
type: 0 for numerical types and the string "N/A" for most object types.
A field may also be declared nullable, meaning that no default value is used for the field, accepting null as a
value instead.
.withField("category", ILiteralType.STRING, "uncategorized")
.withNullableField("name", ILiteralType.STRING)
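The interaction between default values and nullable fields can be sketched in plain Java. This is only an illustration of the rule described above, assuming that a missing (null) value is replaced by the default unless the field is nullable; DefaultValueSketch and resolve are hypothetical names and are not part of the datastore API.

```java
/** Illustrative only: models how a missing value could be resolved for a field. */
final class DefaultValueSketch {

    /**
     * Returns the value to store for a field.
     * Assumption: a null input is replaced by the default unless the field is nullable.
     */
    static Object resolve(Object incoming, Object defaultValue, boolean nullable) {
        if (incoming != null) {
            return incoming; // an explicit value is always kept
        }
        return nullable ? null : defaultValue;
    }

    public static void main(String[] args) {
        // "category" declared with the default "uncategorized"
        System.out.println(resolve(null, "uncategorized", false));
        // "name" declared nullable: null is kept as-is
        System.out.println(resolve(null, "N/A", true));
    }
}
```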
Key Fields
Any number of fields in a store can be flagged as key fields through:
.withField("id", ILiteralType.LONG)
.asKeyField()
The set of key fields in a store should uniquely identify each record in the store. A key field can be made nullable.
A keyless store can be built using .withoutKey() after all fields have been specified, although such stores
support fewer operations because they lack a key.
Dictionarization
A field can be marked as dictionarized by calling .dictionarized() right after declaring it. Distinct values are
then stored once in a dictionary, and each record only holds the integer position of its value in that dictionary.
Dictionarization typically reduces memory usage for low-cardinality fields (such as a currency or a category) and should
be avoided on high-cardinality fields (such as a unique identifier) where the dictionary itself can become larger than
the raw column.
Key fields and indexed fields are always dictionarized, so .dictionarized() only needs to be used on non-key,
non-indexed fields.
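To make the idea concrete, here is a minimal, self-contained sketch of dictionary encoding. It is not the datastore's actual implementation: DictionarySketch and its methods are hypothetical names used only to show how distinct values are stored once while the column holds integer positions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative dictionary encoding; not the datastore's internal data structure. */
final class DictionarySketch {
    private final Map<String, Integer> positions = new HashMap<>();
    private final List<String> values = new ArrayList<>();

    /** Returns the position of the value, adding it to the dictionary if unseen. */
    int encode(String value) {
        Integer existing = positions.get(value);
        if (existing != null) {
            return existing;
        }
        int position = values.size();
        values.add(value);
        positions.put(value, position);
        return position;
    }

    /** Returns the value stored at the given dictionary position. */
    String decode(int position) {
        return values.get(position);
    }

    /** Number of distinct values seen so far. */
    int size() {
        return values.size();
    }

    public static void main(String[] args) {
        DictionarySketch currencies = new DictionarySketch();
        String[] raw = {"EUR", "USD", "EUR", "EUR"};
        int[] column = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            column[i] = currencies.encode(raw[i]);
        }
        // prints [0, 1, 0, 0] with 2 distinct values
        System.out.println(Arrays.toString(column) + " with " + currencies.size() + " distinct values");
    }
}
```

With four records and two distinct currencies, the column stores four small integers and the dictionary stores two strings. With a unique identifier, the dictionary would hold as many entries as there are records, which is why dictionarization does not pay off there.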
Write Behaviors
Beyond the structure of a store, the description builder exposes a few options that affect how the store handles records at write time.
Duplicate key handling
When two records with the same key are added during the same transaction, the store's duplicate key handler decides
what to do. It can be configured via .withDuplicateKeyHandler(...) in the builder, using a custom
IDuplicateKeyHandler or one of the handlers exposed by
com.activeviam.database.datastore.api.description.impl.DuplicateKeyHandlers:
- ALWAYS_UPDATE: keep the last received record (the default behavior).
- LOG_WITHIN_TRANSACTION: discard subsequent duplicates within the same transaction and log each occurrence.
- THROW_WITHIN_TRANSACTION: throw a DuplicateKeyException on subsequent duplicates within the same transaction.
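The three behaviors can be compared with a small simulation of a single transaction. This sketch does not use the datastore API: the Policy enum only mirrors the handler names listed above, apply is a hypothetical helper, and IllegalStateException stands in for DuplicateKeyException.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative simulation of duplicate key handling within one transaction. */
final class DuplicateKeySketch {

    /** Hypothetical enum mirroring the handler names; not the library's type. */
    enum Policy { ALWAYS_UPDATE, LOG_WITHIN_TRANSACTION, THROW_WITHIN_TRANSACTION }

    /** Applies the (key, value) records of one transaction under the given policy. */
    static Map<Long, String> apply(List<Map.Entry<Long, String>> records, Policy policy) {
        Map<Long, String> result = new LinkedHashMap<>();
        for (Map.Entry<Long, String> record : records) {
            boolean duplicate = result.containsKey(record.getKey());
            if (!duplicate || policy == Policy.ALWAYS_UPDATE) {
                // First occurrence, or last-record-wins policy
                result.put(record.getKey(), record.getValue());
            } else if (policy == Policy.LOG_WITHIN_TRANSACTION) {
                // Keep the first record and report the discarded duplicate
                System.err.println("Duplicate key ignored: " + record.getKey());
            } else {
                // Stand-in for the DuplicateKeyException mentioned above
                throw new IllegalStateException("Duplicate key: " + record.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<Long, String>> transaction =
            List.of(Map.entry(1L, "first"), Map.entry(1L, "second"));
        System.out.println(apply(transaction, Policy.ALWAYS_UPDATE));          // prints {1=second}
        System.out.println(apply(transaction, Policy.LOG_WITHIN_TRANSACTION)); // prints {1=first}
    }
}
```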
Remove-unknown-key behavior
When a remove operation targets a key that does not exist in the store, the store's remove unknown key listener is
invoked. It can be configured via .onRemovingUnknownKey() on the builder, which exposes four options:
- .throwException(): throw an UnknownKeyException (the default behavior).
- .logException(): log the event and continue.
- .keepSilent(): silently ignore the event.
- .callCustomListener(listener): invoke a custom IRemoveUnknownKeyListener.
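As a rough model of these options, the sketch below delegates to a pluggable callback whenever a removal targets an absent key. Everything in it is hypothetical: RemoveUnknownKeySketch is a toy store, the Consumer stands in for IRemoveUnknownKeyListener, and IllegalArgumentException stands in for UnknownKeyException.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Illustrative toy store with a pluggable reaction to removing an unknown key. */
final class RemoveUnknownKeySketch {
    /** Stand-ins for the throw, log, and silent options. */
    static final Consumer<Long> THROW = key -> {
        throw new IllegalArgumentException("Unknown key: " + key);
    };
    static final Consumer<Long> LOG = key -> System.err.println("Unknown key: " + key);
    static final Consumer<Long> SILENT = key -> { };

    private final Map<Long, String> records = new HashMap<>();
    private final Consumer<Long> onUnknownKey;

    RemoveUnknownKeySketch(Consumer<Long> onUnknownKey) {
        this.onUnknownKey = onUnknownKey;
    }

    void add(long key, String value) {
        records.put(key, value);
    }

    /** Removes the record, delegating to the listener when the key is absent. */
    boolean remove(long key) {
        if (records.remove(key) != null) {
            return true;
        }
        onUnknownKey.accept(key);
        return false;
    }

    public static void main(String[] args) {
        RemoveUnknownKeySketch store = new RemoveUnknownKeySketch(LOG);
        store.add(1L, "Trade1");
        System.out.println(store.remove(1L)); // prints true
        System.out.println(store.remove(2L)); // logs the unknown key, then prints false
    }
}
```

A custom listener corresponds to passing any other callback to the constructor, for example one that counts unknown keys.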
Update only if different
By default, adding a record whose key already exists overwrites the previous record, even when the new record is
identical. Calling .updateOnlyIfDifferent() on the builder tells the store to skip the update when the incoming
record matches the stored one field-by-field, which can save significant work on idempotent feeds.
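The effect on an idempotent feed can be illustrated with a toy store that counts actual writes. This is a sketch under stated assumptions rather than the datastore's logic: UpdateOnlyIfDifferentSketch is a hypothetical class, and records are modeled as plain Object arrays compared field by field.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Illustrative toy store that skips writes of identical records. */
final class UpdateOnlyIfDifferentSketch {
    private final Map<Long, Object[]> records = new HashMap<>();
    private int writes = 0;

    /** Stores the record unless an identical one already exists for this key. */
    boolean put(long key, Object... fields) {
        Object[] stored = records.get(key);
        if (stored != null && Arrays.equals(stored, fields)) {
            return false; // same values in every field: skip the write
        }
        records.put(key, fields.clone());
        writes++;
        return true;
    }

    /** Number of writes actually performed. */
    int writes() {
        return writes;
    }

    public static void main(String[] args) {
        UpdateOnlyIfDifferentSketch store = new UpdateOnlyIfDifferentSketch();
        store.put(1L, "Book1", 10.0);
        store.put(1L, "Book1", 10.0); // identical: skipped
        store.put(1L, "Book1", 12.5); // different: written
        System.out.println(store.writes() + " writes"); // prints 2 writes
    }
}
```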
References
The different stores within a datastore can be linked together through references.
A reference is a mapping from one store (the owner store) to another store (the target store), associating one or more fields of the former to fields of the latter. Each reference defines a many-to-one relationship between two stores. They are useful at query time to relate the information from different stores together.
A datastore generally has a multitude of references, forming a directed graph of stores. The store graph cannot contain any cycle: the schema of stores can also be referred to as a star schema, as it has a tree-like structure.
A store can reference another one multiple times, as long as the referenced fields are different for the multiple references.
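The no-cycle rule on the store graph can be checked with a depth-first search. The sketch below is independent of the datastore API and assumes the references are given as a map from each owner store to its target stores; StoreGraphSketch and hasCycle are hypothetical names.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative cycle check on a graph of store references. */
final class StoreGraphSketch {

    /** Returns true if following references from some store leads back to it. */
    static boolean hasCycle(Map<String, List<String>> references) {
        Set<String> done = new HashSet<>();
        Set<String> inProgress = new HashSet<>();
        for (String store : references.keySet()) {
            if (visit(store, references, inProgress, done)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(
            String store,
            Map<String, List<String>> references,
            Set<String> inProgress,
            Set<String> done) {
        if (done.contains(store)) {
            return false; // already fully explored, no cycle through it
        }
        if (!inProgress.add(store)) {
            return true; // store is already on the current path: cycle found
        }
        for (String target : references.getOrDefault(store, List.of())) {
            if (visit(target, references, inProgress, done)) {
                return true;
            }
        }
        inProgress.remove(store);
        done.add(store);
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasCycle(Map.of("Sale", List.of("Product"))));           // prints false
        System.out.println(hasCycle(Map.of("A", List.of("B"), "B", List.of("A")))); // prints true
    }
}
```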
Constraints
When a reference is created between two stores, a uniqueness constraint is added to the group of fields used by the reference in the target store. As a result, a record of the owner store cannot point to multiple records of the target store. This follows from the many-to-one relationship represented by the reference.
For example, given two stores A and B:
- Store A

    TradeId  Value
    Trade1   10
    Trade2   20

- Store B

    TradeId  LegType  Book
    Trade1   N/A      Book1
    Trade2   N/A      Book2
    Trade3   Pay      Book3
    Trade3   Recv     Book3
If a reference is defined from Store A's TradeId field to Store B's TradeId field, a
KeyConstraintViolationException is raised when Store B is loaded, because two records in Store B share the same
TradeId (Trade3).
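The uniqueness check implied by a reference can be sketched as a scan over the values of the referenced field in the target store. This is an illustration only, not how the datastore enforces the constraint; ReferenceConstraintSketch and firstDuplicate are hypothetical names, and the data is Store B's TradeId column from the example above.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative uniqueness check on the field targeted by a reference. */
final class ReferenceConstraintSketch {

    /** Returns the first value that appears twice in the referenced field, or null if all are unique. */
    static String firstDuplicate(List<String> targetFieldValues) {
        Set<String> seen = new HashSet<>();
        for (String value : targetFieldValues) {
            if (!seen.add(value)) {
                return value; // a second record carries the same value
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> storeBTradeIds = List.of("Trade1", "Trade2", "Trade3", "Trade3");
        System.out.println(firstDuplicate(storeBTradeIds)); // prints Trade3
    }
}
```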
Reference Description
Similarly to stores, datastore references are built according to a reference description, represented by the interface
IReferenceDescription.
The recommended way to build a reference description is through the fluent builder interface available through
StartBuilding.reference().
A fully defined reference description is created like so:
final IReferenceDescription referenceDescription =
    StartBuilding.reference()
        .fromStore("store A")
        .toStore("store B")
        .withName("AToB")
        .withMapping("field from A", "field from B")
        .withMapping("another field from A", "another field from B")
        .build();
Example
In this example, the required datastore consists of two stores:
- Sale: records correspond to a product sale.
- Product: records describe the products being sold (name and category).
final IStoreDescription saleStoreDescription =
    StartBuilding.store()
        .withStoreName("Sale")
        .withField("id", ILiteralType.LONG)
        .asKeyField()
        .withField("productId", ILiteralType.LONG)
        .asKeyField()
        .withField("date", ILiteralType.DATE)
        .asKeyField()
        .withField("quantity", ILiteralType.INT)
        .withField("unitPrice", ILiteralType.DOUBLE)
        .build();

final IStoreDescription priceHistoryStoreDescription =
    StartBuilding.store()
        .withStoreName("Product")
        .withField("id", ILiteralType.LONG)
        .asKeyField()
        .withField("name", ILiteralType.STRING)
        .withField("category", ILiteralType.STRING, "uncategorized")
        .build();
The following reference links each sale in the Sale store to the corresponding product in the Product store, mapping the Sale store's productId field to the Product store's id field.
final IReferenceDescription referenceDescription =
    StartBuilding.reference()
        .fromStore("Sale")
        .toStore("Product")
        .withName("SaleToProduct")
        .withMapping("productId", "id")
        .build();
Datastore Schema
A datastore is built according to a datastore schema, which gathers the store descriptions and the reference descriptions that compose the datastore.
The recommended way to build a datastore schema is through the fluent builder interface available through
StartBuilding.datastoreSchema():
final IDatastoreSchemaDescription datastoreSchemaDescription =
    StartBuilding.datastoreSchema()
        .withStore(saleStoreDescription)
        .withStore(priceHistoryStoreDescription)
        .withReference(referenceDescription)
        .build();
The .withStore() and .withReference() methods can be chained as many times as needed.
Creating a Datastore
Once the datastore schema has been created, the datastore can be built using StartBuilding.application():
final var application =
    StartBuilding.application()
        .withDatastore(datastoreSchemaDescription)
        .withoutBranchRestrictions()
        .build();
The IDatastore interface extends AutoCloseable, so a datastore can be torn down by calling its close() method.
Querying a Datastore
Data can be retrieved directly from the datastore using Database Queries.