Snowflake Database

This section explains how to connect to a Snowflake Database.

Overall Sequence

To connect to a remote Database using DirectQuery you need to complete these steps:

  1. Configure your remote Database in FRTB's expected Database format
  2. Set FRTB configuration properties
  3. Deploy FRTB in Horizontal Distribution
  4. Database Schema

    Your remote Database must be configured in the same format as FRTB's in-memory datastores. This means the same Tables and Field Types will need to be replicated. You can do this by defining the Tables as outlined in the Database documentation or by exposing Views on your Database in the same expected Database format.

    note

    All of the Database Tables must be present either as actual Tables or Views on your connected Database.

    Required DirectQuery Properties

    The following properties must be configured to get started with DirectQuery.

    Maven Profile

    The frtb-directquery module is not included in the classpath by default. You will need to enable the direct-query Maven Profile to add the frtb-directquery module into the classpath. This will need to be completed when building the JAR file.

    Disable IMA

    For this preview implementation, only SA is supported when running with DirectQuery. To disable IMA, add the following property either to the frtb.properties file or as a command-line argument with the prefix -D as shown:
    frtb.properties:
    ima.disable=true
    
    Command-line argument:
    -Dima.disable=true
    

    Snowflake Configuration

    The application.yaml file contains specific properties for enabling and connecting to a remote Database.
    # Snowflake Database Connection Parameters
    directquery:
      enabled: false
      database:
        type: snowflake
        snowflake:
          connectionString: YOUR_JDBC_SNOWFLAKE_CONNECTION_STRING
          username: YOUR_USERNAME
          password: YOUR_PASSWORD
          warehouse: YOUR_WAREHOUSE
          database: YOUR_DATABASE
          schema: YOUR_SCHEMA
    

    The above properties populate the SnowflakeSpringProperties java class.

    connectionString

    Full property name: directquery.database.snowflake.connectionString

    This is the JDBC Driver connection string which points to the Snowflake Database. for details on how to find your JDBC Connection String, see the JDBC Driver Snowflake documentation.

    warehouse

    Full property name: directquery.database.snowflake.warehouse

    The warehouse is where our queries and aggregation will take place. You can view available warehouses through the SHOW WAREHOUSES command.

    database

    Full property name: directquery.database.snowflake.database

    The Snowflake Database we will be connecting to.

    schema

    Full property name: directquery.database.snowflake.schema

    The Schema within the specified Database to use. This Schema should match FRTB’s expected Database Structure either by having the same Table structure or by through the use of views.

    Deployment Options

    We provide two main options to run with DirectQuery:

    Option How to run FRTB
    Operate with some data loaded in-memory and the rest available through DirectQuery In Horizontal Distribution with in-memory Database
    Run purely on DirectQuery remote data In a Single JVM on DirectQuery only

    Horizontal Distribution

    In Horizontal Distribution you have access to the in-memory tools such as What-If, data updates and Sign-Off for the data loaded in-memory as well as access to a large number of historical dates.

    When running in Horizontal Distribution you need to run two JVMs:

    • One JVM which consists of a query node and in-memory data node
    • Another JVM which consists of a DirectQuery data node

    note

    When running in horizontal distribution, the StandardisedApproachCube will only contain data from the in-memory data node. To view the DirectQuery data, you will have to use the CombinedCube. This is because the JVM with DirectQuery data will get attached to the query node under the CombinedCube and thus the DirectQuery data is only visible in the CombinedCube.

    initial-load.business-dates

    For both Combined in-memory query node and DirectQuery data node, set the initial-load.business-dates property located inside frtb.properties.

    Given that the data nodes are distributed by AsOfDate, no two data nodes can contain the same Partition - meaning that one AsOfDate cannot be present in another node. So to prevent any issues the initial-load.business-dates property is used to set which dates to load into the in-memory data node, while also specifying to the DirectQuery data node which AsOfDates to exclude.

    note

    This property must be provided with the same values to both data nodes.

    JVM With Query Node And In-Memory Data Node

    This JVM consists of two distributed nodes; a query node and an in-memory data node.

    Start the JVM by specifying the following parameters to the FRTB application, either in a .properties file or through command-line arguments (add -D before each property).

    # Disable IMA
    ima.disable=true
    
    # Set FRTB to run this JVM with a query node and in-memory data node
    spring.profiles.active=forceNetty
    activeviam.distribution.gossip.router.enable=true
    activeviam.distribution.gossip.router.port=16484
    

    JVM With DirectQuery Data Node

    Once the JVM with the query and in-memory data node is running, you can start a second JVM with the DirectQuery data node with the following configuration properties either in a .properties file or through command-line arguments (add -D before each property).

    # Set FRTB to run as a data node and to connect to the second JVM
    spring.profiles.active=dist-data-node
    
    # Use a different port from the second JVM
    server.port=8081
    
    # Use a different Content Server for this JVM
    content-service.db.url=jdbc:h2:mem:content_service;DB_CLOSE_DELAY=-1
    content-service.db.hibernate.hbm2ddl.auto=create
    
    # Disable IMA in order to use DirectQuery
    ima.disable=true
    
    # Enable and configure the Snowflake DirectQuery Database
    directquery.enabled=true
    directquery.database.type=database_type
    directquery.database.snowflake.username=database_username
    directquery.database.snowflake.password=database_password
    directquery.database.snowflake.connectionString=connection_string
    directquery.database.snowflake.warehouse=snowflake_warehouse
    directquery.database.snowflake.database=snowflake_database
    directquery.database.snowflake.schema=snowflake_database_schema
    

    Single JVM

    You can run a single JVM consisting of only DirectQuery.

    By running in a single JVM with DirectQuery only, you can now see the DirectQuery data in the StandardisedApproachCube.

    Properties

    To run the Single JVM node we will only need to configure the Snowflake Properties and Disable IMA and run the application as normal. There are no distributed nodes to configure when running in a single JVM.

    Here is an example of the properties to use:

    # Disable IMA in order to use DirectQuery
    ima.disable=true
    
    # Enable and configure the Snowflake DirectQuery Database
    directquery.enabled=true
    directquery.database.type=database_type
    directquery.database.snowflake.username=database_username
    directquery.database.snowflake.password=database_password
    directquery.database.snowflake.connectionString=connection_string
    directquery.database.snowflake.warehouse=snowflake_warehouse
    directquery.database.snowflake.database=snowflake_database
    directquery.database.snowflake.schema=snowflake_database_schema
    

    Single JVM Limitations

    By running with a single JVM we will be running purely on DirectQuery. This will come at the compromise of query performance of Trade level queries while also not having access to in-memory only tools such as WhatIf and SignOff.

    note

    Running both an in-memory data node and DirectQuery data node under a single JVM is not currently supported.