
Troubleshooting "My partitions are empty, why are they not dropped?"

Introduction

Deleting data is often one of the tasks an application performs before loading fresher data.

When deleting data, ActivePivot can either empty or drop partitions. Emptying a partition means deleting its records while keeping the indexes and reference dictionaries for future use of the partition. Dropping a partition means deleting the records, the indexes and the reference dictionaries. Partitions are dropped only if the deletion condition applies to fields that are value-partitioned.

Here is some information to help design the best deletion procedure.

printStoresSizes method

The printStoresSizes function prints the list of existing partitions and their sizes. Only existing partitions appear in the list. A partition listed with a size of zero contains no records but has not been dropped, so it can still retain memory (for example, for reference dictionaries).

The following example shows that the partitions of the Base store are all empty but have not been dropped.

+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Store lengths at epoch 58 |
+-------------------+--------------------+--------------------+---------------+--------------------------------------------------+--------------------------------------------------+
| Store name | Size | Max row id | Partitions | Partition sizes | Partition max row ids |
+-------------------+--------------------+--------------------+---------------+--------------------------------------------------+--------------------------------------------------+
| Base | 0 | 29000000 | 16 | 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 | list of partitions max size |
+-------------------+--------------------+--------------------+---------------+--------------------------------------------------+--------------------------------------------------+

The following example shows that the partitions of the Base store are all empty except the last one.

+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Store lengths at epoch 58 |
+-------------------+--------------------+--------------------+---------------+--------------------------------------------------+--------------------------------------------------+
| Store name | Size | Max row id | Partitions | Partition sizes | Partition max row ids |
+-------------------+--------------------+--------------------+---------------+--------------------------------------------------+--------------------------------------------------+
| Base | 0 | 29000000 | 16 | 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 100 | list of partitions max size |
+-------------------+--------------------+--------------------+---------------+--------------------------------------------------+--------------------------------------------------+

The example below shows that the partitions of the City store were never created or have been dropped.

| Store lengths at epoch 3 |
+--------------------+--------------------+--------------------+---------------+--------------------------------------------------+--------------------------------------------------+
| Store name | Size | Max row id | Partitions | Partition sizes | Partition max row ids |
+--------------------+--------------------+--------------------+---------------+--------------------------------------------------+--------------------------------------------------+
| City | 0 | 0 | 0 | | |
+--------------------+--------------------+--------------------+---------------+--------------------------------------------------+--------------------------------------------------+
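When many stores are printed, it can help to check the "Partition sizes" column programmatically. The following is a minimal sketch that assumes the column has been copied out of the log as a comma-separated string; the class and method names are hypothetical and are not part of ActivePivot.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of ActivePivot: interprets a "Partition sizes" cell. */
final class PartitionSizesReport {

    /** Parses a cell such as "0, 0, 100"; an empty cell means no partition exists. */
    static List<Long> parseSizes(String cell) {
        List<Long> sizes = new ArrayList<>();
        String trimmed = cell.trim();
        if (trimmed.isEmpty()) {
            return sizes;
        }
        for (String token : trimmed.split(",")) {
            sizes.add(Long.parseLong(token.trim()));
        }
        return sizes;
    }

    /** Counts the partitions that still exist but hold no records. */
    static long countEmptyPartitions(String cell) {
        return parseSizes(cell).stream().filter(size -> size == 0L).count();
    }

    public static void main(String[] args) {
        String cell = "0, 0, 100";
        System.out.println(countEmptyPartitions(cell) + " of "
                + parseSizes(cell).size() + " partitions are empty but not dropped");
    }
}
```

An empty cell, as in the City example, yields an empty list: there is no partition left to report.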

Partition deletion

Partitions can be dropped automatically by a removeWhere operation. Partitions are dropped if and only if the removeWhere condition applies to fields that are value-partitioned.

The main reason a partition is empty but not dropped is that its store is referenced by another store. In that case, the indexes and reference dictionaries are kept as well.

For example, let's take a datastore with two stores, A and B, with a reference from A to B. A is value partitioned by F1 and hash partitioned by F2. B is value partitioned by F1. In the initial state, there are four partitions in the store A and two partitions in the store B as below.

[Figure: Partitions initial state]

A removeWhere on the store A on F1 = 0 will delete the partitions 0 and 1 of store A.

[Figure: Partitions removeWhere on store A]

A removeWhere on the store B on F1 = 0 will empty the partition 0 of store B but not delete it because of the reference from the partitions 0 and 1 of the store A.

[Figure: Partitions removeWhere on store A]

A removeWhere on the stores A and B on F1 = 0 will delete the partitions 0 and 1 of the store A and the partition 0 of store B.

[Figure: Partitions removeWhere on stores A and B]
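The A and B example can be replayed with a small toy model. This is only a sketch of the rule described here, not the ActivePivot API: the ToyStore class, its methods, the record counts, and the assumption that partitions 2 and 3 of A and partition 1 of B hold F1 = 1 are all hypothetical. The model drops a partition unless a store that references it still has a partition for the same F1 value, in which case the partition is only emptied.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: this class is hypothetical and is not the ActivePivot API. */
final class ToyStore {
    final String name;
    /** Partition id -> {F1 value of the partition, number of records}. */
    final Map<Integer, long[]> partitions = new LinkedHashMap<>();
    /** Stores that own a reference pointing to this store. */
    final List<ToyStore> referencedBy = new ArrayList<>();

    ToyStore(String name) {
        this.name = name;
    }

    void addPartition(int id, long f1, long records) {
        partitions.put(id, new long[] {f1, records});
    }

    boolean hasPartitionFor(long f1) {
        return partitions.values().stream().anyMatch(p -> p[0] == f1);
    }

    /** Simulates a removeWhere on F1 = f1, F1 being the value-partitioned field. */
    void removeWhereF1(long f1) {
        boolean stillReferenced =
                referencedBy.stream().anyMatch(owner -> owner.hasPartitionFor(f1));
        if (stillReferenced) {
            // Emptied: the records go away, the partition itself stays.
            partitions.values().stream().filter(p -> p[0] == f1).forEach(p -> p[1] = 0);
        } else {
            // Dropped: the partition disappears from the store.
            partitions.values().removeIf(p -> p[0] == f1);
        }
    }

    /** Builds the initial state of the example and returns {A, B}. */
    static ToyStore[] initialState() {
        ToyStore a = new ToyStore("A");
        ToyStore b = new ToyStore("B");
        b.referencedBy.add(a); // reference from A to B
        a.addPartition(0, 0, 10); // record counts are arbitrary
        a.addPartition(1, 0, 10);
        a.addPartition(2, 1, 10);
        a.addPartition(3, 1, 10);
        b.addPartition(0, 0, 5);
        b.addPartition(1, 1, 5);
        return new ToyStore[] {a, b};
    }

    public static void main(String[] args) {
        ToyStore[] onlyB = initialState();
        onlyB[1].removeWhereF1(0);
        // Partition 0 of B is emptied but kept: prints [0, 1]
        System.out.println("B: " + onlyB[1].partitions.keySet());

        ToyStore[] both = initialState();
        both[0].removeWhereF1(0); // owner store A first
        both[1].removeWhereF1(0);
        // Prints A: [2, 3], B: [1]
        System.out.println("A: " + both[0].partitions.keySet()
                + ", B: " + both[1].partitions.keySet());
    }
}
```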

Deleting a partition clears the cache of the indexes and dictionaries, but does not delete them.

The dictionaries are kept to be able to recreate a dropped partition.

If a store is emptied (but its partitions are not deleted) and then refilled with the same data, the indexes and partitions will be reused.

If a store is emptied (but its partitions are not deleted) and then refilled with different data, new indexes and partitions will be created alongside the existing ones.

Optimizing data deletions

When deleting data (or loading new data), the references between stores are updated whenever the owner store is updated. So when deleting data from several stores, it is best to start with the owner store of the reference: this avoids updating the references needlessly and makes it easier for partitions to be dropped. If the deletion is done periodically, it should target entire partitions rather than a few rows spread across several partitions. If the deletion transaction is large, it can also be split into one transaction per store or even per partition.
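With more than two stores, the "owner store first" rule amounts to ordering the stores so that each owner comes before the stores it references. The sketch below shows one way to compute such an order with a standard topological sort. It assumes the references are known as a map from owner store to referenced stores and that they contain no cycle; the DeletionOrder class and its method are hypothetical and are not part of ActivePivot.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of ActivePivot: orders stores for deletion. */
final class DeletionOrder {

    /**
     * Returns the stores ordered so that every owner store precedes the stores
     * it references. referencesByOwner maps an owner store to its target stores.
     */
    static List<String> ownersFirst(List<String> stores,
                                    Map<String, List<String>> referencesByOwner) {
        // Number of references pointing to each store.
        Map<String, Integer> incoming = new LinkedHashMap<>();
        for (String store : stores) {
            incoming.put(store, 0);
        }
        for (List<String> targets : referencesByOwner.values()) {
            for (String target : targets) {
                incoming.merge(target, 1, Integer::sum);
            }
        }
        // Stores that nothing references can be processed first.
        Deque<String> ready = new ArrayDeque<>();
        for (String store : stores) {
            if (incoming.get(store) == 0) {
                ready.add(store);
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String store = ready.poll();
            order.add(store);
            // Once an owner is handled, its targets may become ready.
            for (String target : referencesByOwner.getOrDefault(store, List.of())) {
                if (incoming.merge(target, -1, Integer::sum) == 0) {
                    ready.add(target);
                }
            }
        }
        return order;
    }

    public static void main(String[] args) {
        // A references B, as in the example of this article.
        System.out.println(ownersFirst(List.of("B", "A"), Map.of("A", List.of("B"))));
    }
}
```

Each store in the resulting list can then be handled in its own transaction if a single deletion transaction would be too large.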