# Google Cloud Source

> This documentation page assumes you are already familiar with the general
> structure of [Cloud Sources in Atoti](cloud_source) as well as
> with [Google Cloud Storage](https://cloud.google.com/storage).

The Google Cloud Source relies on the [Google Cloud Storage SDK](https://github.com/googleapis/google-cloud-java) for Java.
Make sure you are familiar with this SDK before using the Google Cloud Source.

In order to use the Google Cloud Source, add the following lines to your `pom.xml`:

```xml
<dependency>
    <groupId>com.activeviam.source</groupId>
    <artifactId>cloud-source-google</artifactId>
    <version>${atoti-server.version}</version>
</dependency>
```

## Cloud Source to Google Cloud Storage concepts

### Entities

The Google implementation of `ICloudEntity` is `GoogleEntity`.
It is essentially a wrapper around an *object* from the Google Cloud Storage SDK.

### Locating entities

#### Entity paths

`IGoogleEntityPath` implements `ICloudEntityPath`.
It is a reference to an object and its metadata.

#### Directories

The Google Cloud Storage implementation of `ICloudDirectory` is `GoogleCloudDirectory`.

A directory is tied to a bucket.
It contains all objects whose names start with a certain prefix.
For example, a directory in a given bucket with the prefix
`directory1/subdirectory2` would contain the first three of the following objects:

```yaml
inside:
  directory1/subdirectory2/object1.txt
  directory1/subdirectory2/object2.txt
  directory1/subdirectory2/subdirectory3/object3.txt

not inside:
  object4.txt
  other_directory/object5.txt
  directory1/object6.txt
```

A directory with an empty prefix corresponds to the root of the bucket.
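
The prefix rule described above can be sketched in plain Java, without the Google SDK. This is a minimal illustration rather than the library's implementation: `PrefixDirectorySketch`, `isInside`, and `contents` are hypothetical names, and the check is assumed to be a plain `String.startsWith` on object names.

```java
import java.util.List;
import java.util.stream.Collectors;

public class PrefixDirectorySketch {

  /** Hypothetical helper: applies the "name starts with the prefix" rule. */
  static boolean isInside(final String prefix, final String objectName) {
    // An empty prefix matches every name, which corresponds to the root of the bucket.
    return objectName.startsWith(prefix);
  }

  /** Keeps only the object names that belong to the directory with the given prefix. */
  static List<String> contents(final String prefix, final List<String> objectNames) {
    return objectNames.stream()
        .filter(name -> isInside(prefix, name))
        .collect(Collectors.toList());
  }

  public static void main(final String[] args) {
    // Object names taken from the example listing above.
    final List<String> bucket =
        List.of(
            "directory1/subdirectory2/object1.txt",
            "directory1/subdirectory2/object2.txt",
            "directory1/subdirectory2/subdirectory3/object3.txt",
            "object4.txt",
            "other_directory/object5.txt",
            "directory1/object6.txt");

    // Directory with a non-empty prefix.
    System.out.println(contents("directory1/subdirectory2", bucket));
    // Directory with an empty prefix: the whole bucket.
    System.out.println(contents("", bucket).size());
  }
}
```

Under this rule, the first call keeps the three objects listed as inside, and the empty prefix keeps all six.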

A Google Cloud Storage directory object can be constructed by specifying the `Storage` client, a bucket name and a prefix.

The `Storage` client holds the configuration of the connection to Google Cloud Storage.
It can be created as follows using the Google Cloud Storage SDK:

```java
final Storage storage =
    StorageOptions.newBuilder().setProjectId("myProject").build().getService();
```

## CSVDataProviderFactory

To make the CSV source read Google Cloud Storage objects,
use the `GoogleCsvDataProviderFactory` class, which configures how the files are downloaded.

## Configuration example: how to configure it with a CSV source

First, let's define a generic CSV source configuration.
This abstract configuration contains everything that is common to all sources, which makes it easy to switch between local and cloud sources.

```java
public abstract class GenericCsvSourceConfig<I> {
  protected abstract ICsvTopic<I> createTopic(
      String topic, String fileName, CsvParserConfiguration parserConfig);
  protected abstract ICsvTopic<I> createDirectoryTopic(
      String topic, String directory, CsvParserConfiguration parserConfig);
  /** Returns CSV source configured with two topics. */
  public ICsvSource<I> csvSource() {
    final ICsvSource<I> csvSource = ICsvSource.<I>builder().build();
    // getParserConfiguration() is not shown here; it is assumed to return the shared parser settings.
    final CsvParserConfiguration parserConfig = getParserConfiguration();
    final ICsvTopic<I> productsTopic = createTopic("PRODUCTS_TOPIC", "products.csv", parserConfig);
    csvSource.addTopic(productsTopic);
    final ICsvTopic<I> desksTopic = createDirectoryTopic("DESKS_TOPIC", "desks", parserConfig);
    csvSource.addTopic(desksTopic);
    return csvSource;
  }
}
```

Then let's define the Google-specific configuration:

```java
public class GoogleSourceConfiguration
    extends GenericCsvSourceConfig<ICloudEntityPath<GoogleEntity>> {
  @Override
  protected ICsvTopic<ICloudEntityPath<GoogleEntity>> createTopic(
      final String topic, final String fileName, final CsvParserConfiguration parserConfig) {
    return new CloudEntityCsvTopic<>(
        topic, parserConfig, dataProviderFactory(), rootDirectory().getEntity(fileName));
  }
  @Override
  protected ICsvTopic<ICloudEntityPath<GoogleEntity>> createDirectoryTopic(
      final String topic, final String directory, final CsvParserConfiguration parserConfig) {
    return new CloudDirectoryCsvTopic<>(
        topic,
        parserConfig,
        dataProviderFactory(),
        rootDirectory().getSubDirectory(directory),
        null);
  }
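  public ICloudDirectory<GoogleEntity> rootDirectory() {
    // createClient() is not shown here; it must return a Storage client,
    // built for example as in the Directories section.
    return new GoogleCloudDirectory(createClient(), "myBucket", "root");
  }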
  public ICloudCsvDataProviderFactory<GoogleEntity> dataProviderFactory() {
    return GoogleCsvDataProviderFactory.create(
        CloudFetchingConfig.builder().downloadThreadCount(10).build());
  }
}
```
