Google Cloud Source

This documentation page assumes you are already familiar with the general structure of Cloud Sources in Atoti as well as with Google Cloud Storage.

The Google Cloud Source relies on the Google Cloud Storage SDK for Java. Make sure you are familiar with this SDK before using the Google Cloud Source.

To use the Google Cloud Source, add the following dependency to your pom.xml:

<dependency>
    <groupId>com.activeviam.source</groupId>
    <artifactId>cloud-source-google</artifactId>
    <version>${atoti-server.version}</version>
</dependency>

Cloud Source to Google Cloud Storage concepts

Entities

The Google implementation of ICloudEntity is GoogleEntity. It is essentially a wrapper around an object from the Google Cloud Storage SDK.

Locating entities

Entity paths

IGoogleEntityPath implements ICloudEntityPath. It is a reference to an object and its metadata.

Directories

GoogleCloudDirectory is the Google Cloud Storage implementation of ICloudDirectory.

A directory is tied to a bucket. It contains all objects whose names start with a certain prefix. For example, a directory on a given bucket with the prefix directory1/subdirectory2 contains the first three of the following objects and none of the last three:

inside:
directory1/subdirectory2/object1.txt
directory1/subdirectory2/object2.txt
directory1/subdirectory2/subdirectory3/object3.txt

not inside:
object4.txt
other_directory/object5.txt
directory1/object6.txt

A directory with an empty prefix corresponds to the root of the bucket.
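The prefix rule above can be illustrated with a few lines of plain Java. This is only a sketch of the rule as stated in the text (a name belongs to a directory when it starts with the directory's prefix). The class and method names are hypothetical, and it does not call GoogleCloudDirectory or the Google Cloud Storage SDK. Whether the real implementation normalizes a trailing "/" on the prefix is not stated here, so the sketch does not assume it.

```java
import java.util.List;

// Hypothetical illustration class; not part of Atoti or the Google SDK.
final class PrefixDirectorySketch {

    /** Returns true when objectName starts with the directory prefix. */
    static boolean isInside(String prefix, String objectName) {
        // An empty prefix matches every name, i.e. the root of the bucket.
        return objectName.startsWith(prefix);
    }

    public static void main(String[] args) {
        String prefix = "directory1/subdirectory2";
        List<String> objectNames = List.of(
                "directory1/subdirectory2/object1.txt",
                "directory1/subdirectory2/object2.txt",
                "directory1/subdirectory2/subdirectory3/object3.txt",
                "object4.txt",
                "other_directory/object5.txt",
                "directory1/object6.txt");

        // Print each object name with its membership in the directory.
        for (String name : objectNames) {
            System.out.println(name + " inside: " + isInside(prefix, name));
        }
    }
}
```

Note that objects in nested "subdirectories" such as subdirectory3 are included, because only the start of the name is compared.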

A Google Cloud Storage directory object can be constructed by specifying the Storage client, a bucket name and a prefix.

The Storage client holds the configuration of the connection to Google Cloud Storage. It can be created as follows using the Google Cloud Storage SDK:

import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;

final Storage storage =
    StorageOptions.newBuilder().setProjectId("myProject").build().getService();

CSVDataProviderFactory

To configure the CSV source to read Google Cloud Storage objects, use the GoogleCsvDataProviderFactory class, which configures how the files are downloaded.

Configuration example: how to configure it with a CSV source

First, let's define a generic CSV source configuration. This abstract configuration contains everything that is common to all sources, which makes it easy to switch between local and cloud sources.

public abstract class GenericCsvSourceConfig<I> {

    /** Creates a topic that reads a single file. */
    protected abstract ICsvTopic<I> createTopic(
        String topic, String fileName, ICsvParserConfiguration parserConfig);

    /** Creates a topic that reads all the files of a directory. */
    protected abstract ICsvTopic<I> createDirectoryTopic(
        String topic, String directory, ICsvParserConfiguration parserConfig);

    /** Returns CSV source configured with two topics. */
    public ICsvSource<I> csvSource() {
        final ICsvSource<I> csvSource = CsvSourceFactory.create();
        // getParserConfiguration() is expected to be provided by this class (not shown here).
        final ICsvParserConfiguration parserConfig = getParserConfiguration();
        final ICsvTopic<I> productsTopic =
            createTopic("PRODUCTS_TOPIC", "products.csv", parserConfig);
        csvSource.addTopic(productsTopic);
        final ICsvTopic<I> desksTopic =
            createDirectoryTopic("DESKS_TOPIC", "desks", parserConfig);
        csvSource.addTopic(desksTopic);
        return csvSource;
    }
}

Then let's define the Google-specific configuration:

public class GoogleSourceConfiguration
    extends GenericCsvSourceConfig<ICloudEntityPath<GoogleEntity>> {

    /** Single-file topic: resolves the file name against the root directory. */
    @Override
    protected ICsvTopic<ICloudEntityPath<GoogleEntity>> createTopic(
        final String topic, final String fileName, final ICsvParserConfiguration parserConfig) {
        return new CloudEntityCsvTopic<>(
            topic, parserConfig, dataProviderFactory(), rootDirectory().getEntity(fileName));
    }

    /** Directory topic: resolves the sub-directory against the root directory. */
    @Override
    protected ICsvTopic<ICloudEntityPath<GoogleEntity>> createDirectoryTopic(
        final String topic, final String directory, final ICsvParserConfiguration parserConfig) {
        return new CloudDirectoryCsvTopic<>(
            topic,
            parserConfig,
            dataProviderFactory(),
            rootDirectory().getSubDirectory(directory),
            null);
    }

    /** Creates the Storage client, as shown earlier; "myProject" is a placeholder. */
    protected Storage createClient() {
        return StorageOptions.newBuilder().setProjectId("myProject").build().getService();
    }

    /** Directory of the bucket "myBucket" containing the objects prefixed with "root". */
    public ICloudDirectory<GoogleEntity> rootDirectory() {
        return new GoogleCloudDirectory(createClient(), "myBucket", "root");
    }

    public ICloudCsvDataProviderFactory<GoogleEntity> dataProviderFactory() {
        return GoogleCsvDataProviderFactory.create(new CloudFetchingConfig(10));
    }
}
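The split between the generic configuration and the Google-specific one is a template-method arrangement: the base class decides which topics exist, and each subclass decides where their data lives. The sketch below reproduces that shape with plain Java only, to show why switching from a local to a cloud source amounts to swapping one subclass. Every type in it (Topic, GenericConfigSketch, LocalConfigSketch, CloudConfigSketch) is a hypothetical stand-in, not an Atoti class, and joining the "root" prefix with "/" is an assumption made for the illustration.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo entry point; none of these types belong to Atoti.
final class ConfigSwitchSketch {
    public static void main(String[] args) {
        List<GenericConfigSketch<?>> configs =
                List.of(new LocalConfigSketch(), new CloudConfigSketch());
        for (GenericConfigSketch<?> config : configs) {
            for (Topic<?> topic : config.topics()) {
                System.out.println(topic.name() + " reads from " + topic.location());
            }
        }
    }
}

/** Stand-in for a CSV topic: a name plus the location of its data. */
record Topic<I>(String name, I location) {}

/** Stand-in for the generic configuration: declares the topics once. */
abstract class GenericConfigSketch<I> {
    protected abstract Topic<I> createTopic(String topic, String fileName);

    protected abstract Topic<I> createDirectoryTopic(String topic, String directory);

    List<Topic<I>> topics() {
        List<Topic<I>> topics = new ArrayList<>();
        topics.add(createTopic("PRODUCTS_TOPIC", "products.csv"));
        topics.add(createDirectoryTopic("DESKS_TOPIC", "desks"));
        return topics;
    }
}

/** Local variant: locations are file-system paths under a "data" folder. */
final class LocalConfigSketch extends GenericConfigSketch<Path> {
    private final Path root = Path.of("data");

    @Override
    protected Topic<Path> createTopic(String topic, String fileName) {
        return new Topic<>(topic, root.resolve(fileName));
    }

    @Override
    protected Topic<Path> createDirectoryTopic(String topic, String directory) {
        return new Topic<>(topic, root.resolve(directory));
    }
}

/** Cloud variant: locations are object-name prefixes (assumed "/" separator). */
final class CloudConfigSketch extends GenericConfigSketch<String> {
    private static final String ROOT_PREFIX = "root";

    @Override
    protected Topic<String> createTopic(String topic, String fileName) {
        return new Topic<>(topic, ROOT_PREFIX + "/" + fileName);
    }

    @Override
    protected Topic<String> createDirectoryTopic(String topic, String directory) {
        return new Topic<>(topic, ROOT_PREFIX + "/" + directory);
    }
}
```

The type parameter plays the same role as I in GenericCsvSourceConfig<I>: it is the only thing that differs between the local and the cloud variants from the caller's point of view.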