This documentation page assumes you are already familiar with the general structure of Cloud Sources in Atoti as well as with Azure Blob Storage.
The Azure Cloud Source relies on Azure Blob Storage SDK 12 for Java. Make sure you are familiar with this SDK when using the Azure Cloud Source.
In order to use the Azure Cloud Source, add the following lines to your pom.xml:
Cloud Source to Azure Blob Storage concepts
Entities
The Azure implementation of ICloudEntity is AzureEntity. It is essentially a
wrapper around a blob client from the Azure Blob Storage SDK (e.g.
BlockBlobClient, AppendBlobClient, …). The Cloud Source bucket
nomenclature refers to the Azure Blob Container (and Blob Storage) containing
the referred blob.
Entity paths
ICloudEntityPath implementors for Azure Blob Storage all implement
IAzureEntityPath. They refer to a single blob in an Azure Blob Storage
account. There are five implementations for IAzureEntityPath that each
reference a blob client implementation from the Azure Blob Storage SDK:
| Cloud Source | Azure Blob Storage SDK |
|---|---|
| AzureBlobPath | BlobClient |
| AzureBlockBlobPath | BlockBlobClient |
| AzureAppendBlobPath | AppendBlobClient |
| AzurePageBlobPath | PageBlobClient |
| AzureEncryptedBlobPath | EncryptedBlobClient |
BlobClient is a blob-type-agnostic blob client that can be used to read a
blob’s content without needing to know the blob type beforehand. Blobs created
using AzureBlobPath are created as block blobs by default (following the
behavior of BlobClient).
AzureEncryptedBlobPath is a specialized implementation that supports
client-side encryption. See the corresponding
section for more details.
Entity path limitations
The implementation of entity paths for Azure Blob Storage currently has the following limitations:
- Uploading content of unknown length is not supported for AzurePageBlobPath. This is due to a limitation of page blobs, which require the uploaded data's size to be a multiple of the internal page size on the storage (512 bytes).
- AzureEncryptedBlobPath only supports the uploading of client-side-encrypted block blobs. This is a limitation of the Azure Blob Storage SDK (EncryptedBlobClient has this same limitation). Downloading client-side-encrypted blobs of other types is supported.
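As a rough illustration of the page blob constraint, the sketch below checks whether a payload length is a multiple of 512 bytes and computes the next aligned length. The class and method names (`PageBlobAlignment`, `isAligned`, `roundUpToPage`) are hypothetical and are not part of the Atoti or Azure APIs; the code only uses the JDK.

```java
/** Hypothetical helper illustrating the 512-byte page alignment rule of page blobs. */
final class PageBlobAlignment {

    /** Page size of page blobs, as stated above. */
    static final long PAGE_SIZE = 512L;

    private PageBlobAlignment() {}

    /** Returns true if a payload of this length fits an exact number of pages. */
    static boolean isAligned(long length) {
        return length >= 0 && length % PAGE_SIZE == 0;
    }

    /** Returns the smallest multiple of the page size that is >= length. */
    static long roundUpToPage(long length) {
        if (length < 0) {
            throw new IllegalArgumentException("length must be non-negative: " + length);
        }
        return ((length + PAGE_SIZE - 1) / PAGE_SIZE) * PAGE_SIZE;
    }

    public static void main(String[] args) {
        long length = 1000L;
        System.out.println("aligned: " + isAligned(length));
        System.out.println("next aligned length: " + roundUpToPage(length));
    }
}
```

Because the length must be known to perform such a check, content of unknown length cannot be validated up front, which is consistent with the limitation described above. How any padding bytes would be handled is application-specific and is not covered here.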
Directories
The Azure implementation of ICloudDirectory is represented by the
IAzureCloudDirectory interface. This interface provides additional methods to
explicitly request a blob client of a certain type:
IAzureCloudDirectory:
- AzureCloudDirectory: base implementation
- AzureEncryptedCloudDirectory: can be provided with a key encryption key and/or a key encryption key resolver to respectively write or read blobs using client-side encryption (for more details, see the linked section)
directory1/subdirectory2 would contain the first three of the following blobs:
BlobServiceClient and a container name,
or by directly supplying the appropriate BlobContainerClient.
Configuration example: how to configure it with a CSV source
First, let's define a generic CSV source configuration. This abstract configuration contains all the parts that are common to all the sources and can be used to switch from local to cloud sources easily.
Client-side encryption
The Azure Cloud Source provides specialized implementations of ICloudEntityPath and ICloudDirectory to support client-side
encryption.
Internally, the Azure Cloud Source uses the Azure Blob Storage Cryptography
module.
AzureEncryptedBlobPath is a wrapper around EncryptedBlobClient that supports
uploading and downloading blobs with client-side encryption.
Understanding client-side encryption
When using client-side encryption, data is encrypted and decrypted on the client side. The data transiting over the network is therefore always encrypted (on top of the HTTPS protocol, if used) with an encryption key that is only known by the client.
When uploading data to a blob using client-side encryption, data is first encrypted using a one-time, symmetric encryption key (the content encryption key, or CEK). The CEK is itself encrypted by the client using a key encryption key, or KEK, whose algorithm can be chosen and can be either symmetric or asymmetric. The wrapped encryption key is sent and stored along with the encrypted data in the blob metadata.
The key wrapping operation is performed by an object implementing the AsyncKeyEncryptionKey interface in the Azure
Blob Storage SDK. The client needs to associate a String id to the specified
key that will be stored along with the metadata. This enables the client to
distinguish between multiple keys when encrypting different blobs with different
keys.
As an example, two blobs uploaded to a storage account with client-side
encryption, each using a different KEK, would result in the following information
being stored in the cloud:
AsyncKeyEncryptionKeyResolver interface.
The Azure Key Vault Key client module provides basic AsyncKeyEncryptionKey and AsyncKeyEncryptionKeyResolver implementations that can be created from keys stored on Azure Key Vault, or directly from a java.security.KeyPair object. See the classes KeyEncryptionKeyClientBuilder and LocalKeyEncryptionKeyClientBuilder in the aforementioned module. The module is not included as part of the Azure Cloud Source dependencies.
The symmetric encryption algorithm used by the Azure Blob Storage SDK to encrypt or decrypt content is AES with Cipher Block Chaining (CBC). For more details, see: Microsoft documentation.
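To make the CEK/KEK mechanism more concrete, here is a minimal, JDK-only sketch of envelope encryption. It is not the Azure Blob Storage SDK implementation: the class `EnvelopeEncryptionSketch`, its `EncryptedBlob` holder, the use of RSA-OAEP for the KEK, and the `Map` standing in for a key resolver are all assumptions made for illustration. Only the use of AES in CBC mode for the content follows the description above.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import java.util.Map;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

/** Hypothetical JDK-only illustration of envelope encryption (not the Azure SDK code). */
final class EnvelopeEncryptionSketch {

    // Assumption: an asymmetric KEK using RSA-OAEP; the real KEK algorithm is configurable.
    private static final String KEK_TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    /** Encrypted content plus the metadata stored with it: IV, wrapped CEK and KEK id. */
    static final class EncryptedBlob {
        final byte[] cipherText;
        final byte[] iv;
        final byte[] wrappedCek;
        final String keyId;

        EncryptedBlob(byte[] cipherText, byte[] iv, byte[] wrappedCek, String keyId) {
            this.cipherText = cipherText;
            this.iv = iv;
            this.wrappedCek = wrappedCek;
            this.keyId = keyId;
        }
    }

    /** Generates a 2048-bit RSA key pair to play the role of the KEK. */
    static KeyPair newKek() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Encrypts the content with a one-time CEK, then wraps the CEK with the KEK. */
    static EncryptedBlob encrypt(byte[] content, String keyId, PublicKey kek) {
        try {
            KeyGenerator keyGenerator = KeyGenerator.getInstance("AES");
            keyGenerator.init(256);
            SecretKey cek = keyGenerator.generateKey();

            byte[] iv = new byte[16];
            new SecureRandom().nextBytes(iv);
            Cipher aes = Cipher.getInstance("AES/CBC/PKCS5Padding");
            aes.init(Cipher.ENCRYPT_MODE, cek, new IvParameterSpec(iv));
            byte[] cipherText = aes.doFinal(content);

            Cipher rsa = Cipher.getInstance(KEK_TRANSFORMATION);
            rsa.init(Cipher.WRAP_MODE, kek);
            return new EncryptedBlob(cipherText, iv, rsa.wrap(cek), keyId);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Looks up the KEK by the stored key id, unwraps the CEK, then decrypts the content. */
    static byte[] decrypt(EncryptedBlob blob, Map<String, PrivateKey> resolver) {
        PrivateKey kek = resolver.get(blob.keyId);
        if (kek == null) {
            throw new IllegalArgumentException("Unknown key id: " + blob.keyId);
        }
        try {
            Cipher rsa = Cipher.getInstance(KEK_TRANSFORMATION);
            rsa.init(Cipher.UNWRAP_MODE, kek);
            SecretKey cek = (SecretKey) rsa.unwrap(blob.wrappedCek, "AES", Cipher.SECRET_KEY);

            Cipher aes = Cipher.getInstance("AES/CBC/PKCS5Padding");
            aes.init(Cipher.DECRYPT_MODE, cek, new IvParameterSpec(blob.iv));
            return aes.doFinal(blob.cipherText);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In this sketch, the key id stored next to the wrapped CEK is what lets the reader select the right KEK among several, which mirrors the role the text assigns to the String id and to the key resolver.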
Using client-side encryption
Client-side encryption in the Azure Cloud Source can be performed by using the dedicated specializations AzureEncryptedCloudDirectory and
AzureEncryptedBlobPath.
Their constructors accept, as additional arguments compared to their regular
counterparts, the aforementioned
AsyncKeyEncryptionKey and AsyncKeyEncryptionKeyResolver, respectively, for
uploading or downloading encrypted content.
If the constructed object is used to either only perform uploading operations
or only perform downloading operations, the argument corresponding to the
unused operation can be set to null.
AzureEncryptedCloudDirectory
AzureEncryptedCloudDirectory behaves similarly to AzureCloudDirectory and is
able to access non-encrypted blobs in the same way. When attempting to download
a blob that was encrypted using client-side encryption, it will use the
supplied AsyncKeyEncryptionKeyResolver to decrypt the downloaded content.
When used to create a path to a non-existing blob, it will provide an
AzureEncryptedBlobPath, which means that the uploaded data will be encrypted
using the supplied AsyncKeyEncryptionKey.
AzureEncryptedBlobPath
AzureEncryptedBlobPath acts as a reference to an EncryptedBlobClient.
Much like the AzureEncryptedCloudDirectory, it is able to use the supplied
AsyncKeyEncryptionKeyResolver or AsyncKeyEncryptionKey to respectively
download or upload blobs with client-side encryption.
The Azure Blob Storage SDK only permits uploading data with client-side encryption for block blobs. As such, AzureEncryptedBlobPath has the same restriction and is only able to upload block blobs. Downloading data from client-side-encrypted page and append blobs (created through other means) is still possible through AzureEncryptedBlobPath.