Data Load Controller
This is the documentation for ActiveViam’s Data Load Controller library, with a section for users and a section for developers.
What Is The Data Load Controller?
The Data Load Controller (DLC) is a component within ActivePivot that sits on top of whatever target sources we may want to use (e.g. CSV source, JDBC source, custom sources), and allows loading and unloading of data for particular ‘source topics’ in a consolidated way.
From the DLC’s perspective, a caller simply says “I want to fetch data for topics A and B” (optionally specifying a scope for the fetch, e.g. a particular COB date) without caring about underlying implementation details, such as which source is actually responsible for fetching topic A or topic B. Moreover, the DLC is not only a local component but is also exposed as a web service by ActivePivot. This is fundamental, because it keeps “data orchestration” (loading/unloading) outside of ActivePivot: ActivePivot is a computational framework, not a data orchestration framework.
So, an external data orchestrator component, such as another application or a collection of Control-M jobs, is developed and maintained by the customer outside ActivePivot, and ActivePivot simply gives that orchestrator a way to tell it what to load or unload.
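The idea of requesting topics without knowing which source serves them can be illustrated with a minimal Java sketch. This is not the DLC API: `TopicSource`, `TopicRouter`, the `cobDate` scope key and the returned strings are all hypothetical names invented for illustration, and the real library may be structured quite differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: none of these types belong to the DLC library.
interface TopicSource {
    // Loads one topic within the given scope and returns a short description of what was done.
    String load(String topic, Map<String, String> scope);
}

class TopicRouter {
    // Maps each topic name to the source responsible for it.
    private final Map<String, TopicSource> sourcesByTopic = new LinkedHashMap<>();

    void register(String topic, TopicSource source) {
        sourcesByTopic.put(topic, source);
    }

    // The caller names topics and a scope; the router resolves the right source for each topic.
    List<String> load(List<String> topics, Map<String, String> scope) {
        List<String> results = new ArrayList<>();
        for (String topic : topics) {
            TopicSource source = sourcesByTopic.get(topic);
            if (source == null) {
                throw new IllegalArgumentException("Unknown topic: " + topic);
            }
            results.add(source.load(topic, scope));
        }
        return results;
    }
}

class TopicRouterDemo {
    public static void main(String[] args) {
        TopicRouter router = new TopicRouter();
        // Topic A is (hypothetically) backed by a CSV source, topic B by a JDBC source.
        router.register("A", (topic, scope) -> "csv:" + topic + "@" + scope.get("cobDate"));
        router.register("B", (topic, scope) -> "jdbc:" + topic + "@" + scope.get("cobDate"));

        // The caller only states the topics and the scope (here an assumed "cobDate" key).
        List<String> results = router.load(List.of("A", "B"), Map.of("cobDate", "2024-01-31"));
        System.out.println(results); // prints [csv:A@2024-01-31, jdbc:B@2024-01-31]
    }
}
```

In this sketch the caller never references a CSV or JDBC class directly, which mirrors how an external orchestrator would only name topics and a scope when asking ActivePivot to load or unload data.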
Supported Sources
The DLC supports the following data types and their corresponding data sources:
- CSV
  - Local filesystem
  - AWS S3
  - Azure Blob
  - Google Storage
- JDBC
  - Any relational database that allows connectivity via a JDBC driver
- Parquet
  - Local Parquet source
- Messaging
  - Kafka (Avro Binary and JSON messages)
  - RabbitMQ (JSON messages)
The DLC is provided as a library that can be added as a Maven dependency. It is configured via Spring Configuration, as described in Configuring the DLC REST Service.
If you are looking for how to interact with the DLC to load/unload data, start/stop messaging consumers, or check the status of a request, please refer to the User Guide.