Class EnhancedGenericDatumReader
- java.lang.Object
-
- org.apache.avro.generic.GenericDatumReader<org.apache.avro.generic.GenericRecord>
-
- com.activeviam.io.data.source.EnhancedGenericDatumReader
-
- All Implemented Interfaces:
org.apache.avro.io.DatumReader<org.apache.avro.generic.GenericRecord>
- Direct Known Subclasses:
ArrayAsTroveListGenericDatumReader, MutableArrayElementGenericDatumReader
public class EnhancedGenericDatumReader extends org.apache.avro.generic.GenericDatumReader<org.apache.avro.generic.GenericRecord>
Extends the Avro core implementation of the generic datum reader. It leverages EnhancedGenericData by default and can optionally collect and log detailed statistics (count of objects read) about reader activity.
This implementation also offers customization hooks for subclasses that wish to implement more efficient (typically memory-wise) reading of scalar or array values. See doReadWithoutConversion(Object, Schema, ResolvingDecoder) and doReadArray(Object, Schema, ResolvingDecoder). Subclasses may also make use of the extra reader state exposed through currentRecordSchema and currentField.
In keeping with the Avro core philosophy, one may also inject a custom or extended implementation of GenericData into this reader to benefit the reading process without directly subclassing this implementation. Refer to the alternate constructors that accept a GenericData instance. By default (when nothing is explicitly injected), this reader already leverages an enhanced implementation of generic data. To use the core Avro implementation instead, inject it through one of those constructors.
Thread Safety - Each reader instance must be used by only one thread at a time to read something. However, a reader instance may be used by different threads over time (sequentially). Moreover, this class implements a thread-safe collection of aggregated statistics across multiple reader instances used concurrently by multiple threads over time.
-
-
Field Summary
Fields
Modifier and Type, Field, Description
protected org.apache.avro.Schema.Field currentField
protected org.apache.avro.Schema currentRecordSchema
protected static com.activeviam.io.data.source.EnhancedGenericDatumReader.Statistics stats
protected boolean withDetailedMonitoring
-
Constructor Summary
Constructors
Constructor, Description
EnhancedGenericDatumReader(org.apache.avro.Schema schema)
Create an enhanced generic datum reader with detailed monitoring disabled.
EnhancedGenericDatumReader(org.apache.avro.Schema schema, boolean withDetailedMonitoring)
Create an enhanced generic datum reader with optional enabling of detailed monitoring.
EnhancedGenericDatumReader(org.apache.avro.Schema schema, org.apache.avro.generic.GenericData data)
Create an enhanced generic datum reader with explicit specification of the GenericData instance to leverage, and with detailed monitoring disabled.
EnhancedGenericDatumReader(org.apache.avro.Schema schema, org.apache.avro.generic.GenericData data, boolean withDetailedMonitoring)
Create an enhanced generic datum reader with explicit specification of the GenericData instance to leverage, and with optional enabling of detailed monitoring.
-
Method Summary
All Methods, Static Methods, Instance Methods, Concrete Methods
Modifier and Type, Method, Description
protected Object doReadArray(Object old, org.apache.avro.Schema expected, org.apache.avro.io.ResolvingDecoder in)
protected Object doReadWithoutConversion(Object old, org.apache.avro.Schema expected, org.apache.avro.io.ResolvingDecoder in)
static com.quartetfs.fwk.impl.Pair<Boolean,org.apache.avro.Schema.Type> extractSimpleArrayElementInfo(org.apache.avro.Schema arraySchema)
Extract relevant information about the type of elements supported by a given simple Avro array.
static com.quartetfs.fwk.impl.Pair<Boolean,org.apache.avro.Schema.Type> extractSimpleUnionInfo(org.apache.avro.Schema unionSchema)
Assume a simple Union Avro type (that is, no more than one non-null type) and extract relevant information.
void logStatistics()
Log the current state of aggregated statistics.
org.apache.avro.generic.GenericRecord read(org.apache.avro.generic.GenericRecord reuse, org.apache.avro.io.Decoder in)
protected Object readArray(Object old, org.apache.avro.Schema expected, org.apache.avro.io.ResolvingDecoder in)
protected void readField(Object r, org.apache.avro.Schema.Field f, Object oldDatum, org.apache.avro.io.ResolvingDecoder in, Object state)
protected Object readRecord(Object old, org.apache.avro.Schema expected, org.apache.avro.io.ResolvingDecoder in)
protected Object readWithoutConversion(Object old, org.apache.avro.Schema expected, org.apache.avro.io.ResolvingDecoder in)
-
Methods inherited from class org.apache.avro.generic.GenericDatumReader
addToArray, addToMap, convert, createBytes, createEnum, createFixed, createFixed, createString, findStringClass, getData, getExpected, getResolver, getSchema, newArray, newInstanceFromString, newMap, newRecord, peekArray, read, readBytes, readBytes, readEnum, readFixed, readInt, readMap, readMapKey, readString, readString, readWithConversion, setExpected, setSchema, skip
-
-
-
-
Field Detail
-
stats
protected static com.activeviam.io.data.source.EnhancedGenericDatumReader.Statistics stats
-
withDetailedMonitoring
protected final boolean withDetailedMonitoring
-
currentRecordSchema
protected org.apache.avro.Schema currentRecordSchema
-
currentField
protected org.apache.avro.Schema.Field currentField
-
-
Constructor Detail
-
EnhancedGenericDatumReader
public EnhancedGenericDatumReader(org.apache.avro.Schema schema)
Create an enhanced generic datum reader with detailed monitoring disabled.
- Parameters:
schema - The expected schema of Avro records to read.
-
EnhancedGenericDatumReader
public EnhancedGenericDatumReader(org.apache.avro.Schema schema, boolean withDetailedMonitoring)
Create an enhanced generic datum reader with optional enabling of detailed monitoring.
- Parameters:
schema - The expected schema of Avro records to read.
withDetailedMonitoring - Whether to enable detailed monitoring (not recommended for production use).
-
EnhancedGenericDatumReader
public EnhancedGenericDatumReader(org.apache.avro.Schema schema, org.apache.avro.generic.GenericData data)
Create an enhanced generic datum reader with explicit specification of the GenericData instance to leverage, and with detailed monitoring disabled.
- Parameters:
schema - The expected schema of Avro records to read.
data - The generic data instance to leverage.
-
EnhancedGenericDatumReader
public EnhancedGenericDatumReader(org.apache.avro.Schema schema, org.apache.avro.generic.GenericData data, boolean withDetailedMonitoring)
Create an enhanced generic datum reader with explicit specification of the GenericData instance to leverage, and with optional enabling of detailed monitoring.
- Parameters:
schema - The expected schema of Avro records to read.
data - The generic data instance to leverage.
withDetailedMonitoring - Whether to enable detailed monitoring (not recommended for production use).
-
-
Method Detail
-
extractSimpleArrayElementInfo
public static com.quartetfs.fwk.impl.Pair<Boolean,org.apache.avro.Schema.Type> extractSimpleArrayElementInfo(org.apache.avro.Schema arraySchema)
Extract relevant information about the type of elements supported by a given simple Avro array.
A simple Avro array is assumed to represent a collection of one and only one non-null element type, with optional support for null elements.
That is, we may have an array of int elements that allows no null element, an array of int elements that also accepts null elements, or an array containing only null elements, but we cannot have an array that contains both int and double elements.
- Parameters:
arraySchema - An Avro schema defining an array type.
- Returns:
Pair with left = boolean (whether elements are nullable) and right = the non-null element type, or null if there is none.
- Throws:
RuntimeException - If not a simple array.
- See Also:
extractSimpleUnionInfo(Schema)
-
extractSimpleUnionInfo
public static com.quartetfs.fwk.impl.Pair<Boolean,org.apache.avro.Schema.Type> extractSimpleUnionInfo(org.apache.avro.Schema unionSchema)
Assume a simple Union Avro type (that is, no more than one non-null type) and extract relevant information (that is, whether it is nullable, and the non-null type if any).
- Parameters:
unionSchema - An Avro schema defining a Union type.
- Returns:
Pair with left = boolean (nullable or not) and right = the non-null type, or null if there is none.
- Throws:
RuntimeException - If not a simple union.
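The contract described for simple unions can be illustrated with a minimal, library-free sketch. Everything in it is an assumption made for illustration: SketchType stands in for org.apache.avro.Schema.Type, a plain list of branch types stands in for the union schema, and Map.Entry stands in for com.quartetfs.fwk.impl.Pair. It is not the actual implementation of extractSimpleUnionInfo.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for org.apache.avro.Schema.Type, reduced to a few members. */
enum SketchType { NULL, INT, LONG, DOUBLE, STRING }

final class SimpleUnionSketch {

    private SimpleUnionSketch() {}

    /**
     * Mirrors the documented contract on a list of branch types:
     * key = nullable or not, value = the single non-null type,
     * or null when the union holds no non-null branch.
     */
    static Map.Entry<Boolean, SketchType> extractSimpleUnionInfo(List<SketchType> branches) {
        boolean nullable = false;
        SketchType nonNull = null;
        for (SketchType branch : branches) {
            if (branch == SketchType.NULL) {
                nullable = true;
            } else if (nonNull == null) {
                nonNull = branch;
            } else {
                // A second non-null branch means the union is not "simple".
                throw new RuntimeException("Not a simple union: " + branches);
            }
        }
        // SimpleImmutableEntry accepts a null value, unlike Map.entry.
        return new SimpleImmutableEntry<>(nullable, nonNull);
    }

    public static void main(String[] args) {
        System.out.println(extractSimpleUnionInfo(Arrays.asList(SketchType.NULL, SketchType.INT)));
        System.out.println(extractSimpleUnionInfo(Arrays.asList(SketchType.DOUBLE)));
    }
}
```

The same rule plausibly applies to the element type of a simple array, since extractSimpleArrayElementInfo refers to this method.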
-
logStatistics
public void logStatistics()
Log the current state of aggregated statistics.
NOTICE - This logs something only if detailed monitoring has been enabled (see the constructor options); otherwise it is a no-op.
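One way to picture statistics that are aggregated across reader instances, and logged only when monitoring is enabled, is the sketch below. It is an assumption-laden illustration: ReadStatsSketch, onRecordRead, recordsRead, and the Consumer-based log sink are all hypothetical names, and the internals of the real Statistics class are not documented here.

```java
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Consumer;

/** Hypothetical stand-in for a reader with statistics shared by all instances. */
final class ReadStatsSketch {

    // Shared across instances; LongAdder tolerates concurrent increments from many threads.
    private static final LongAdder RECORDS_READ = new LongAdder();

    private final boolean withDetailedMonitoring;
    private final Consumer<String> logSink; // stands in for a real logger

    ReadStatsSketch(boolean withDetailedMonitoring, Consumer<String> logSink) {
        this.withDetailedMonitoring = withDetailedMonitoring;
        this.logSink = logSink;
    }

    /** Called once per record read; counts only when monitoring is enabled. */
    void onRecordRead() {
        if (withDetailedMonitoring) {
            RECORDS_READ.increment();
        }
    }

    /** Current aggregated count across all instances. */
    static long recordsRead() {
        return RECORDS_READ.sum();
    }

    /** Emits one line when monitoring is enabled; otherwise does nothing. */
    void logStatistics() {
        if (withDetailedMonitoring) {
            logSink.accept("records read so far: " + RECORDS_READ.sum());
        }
    }
}
```

For example, `new ReadStatsSketch(true, System.out::println)` would print a line on each call to logStatistics(), while an instance created with `false` stays silent.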
-
read
public final org.apache.avro.generic.GenericRecord read(org.apache.avro.generic.GenericRecord reuse, org.apache.avro.io.Decoder in) throws IOException
- Specified by:
read in interface org.apache.avro.io.DatumReader<org.apache.avro.generic.GenericRecord>
- Overrides:
read in class org.apache.avro.generic.GenericDatumReader<org.apache.avro.generic.GenericRecord>
- Throws:
IOException
-
readRecord
protected final Object readRecord(Object old, org.apache.avro.Schema expected, org.apache.avro.io.ResolvingDecoder in) throws IOException
- Overrides:
readRecord in class org.apache.avro.generic.GenericDatumReader<org.apache.avro.generic.GenericRecord>
- Throws:
IOException
-
readField
protected final void readField(Object r, org.apache.avro.Schema.Field f, Object oldDatum, org.apache.avro.io.ResolvingDecoder in, Object state) throws IOException
- Overrides:
readField in class org.apache.avro.generic.GenericDatumReader<org.apache.avro.generic.GenericRecord>
- Throws:
IOException
-
readWithoutConversion
protected final Object readWithoutConversion(Object old, org.apache.avro.Schema expected, org.apache.avro.io.ResolvingDecoder in) throws IOException
- Overrides:
readWithoutConversion in class org.apache.avro.generic.GenericDatumReader<org.apache.avro.generic.GenericRecord>
- Throws:
IOException
-
readArray
protected final Object readArray(Object old, org.apache.avro.Schema expected, org.apache.avro.io.ResolvingDecoder in) throws IOException
- Overrides:
readArray in class org.apache.avro.generic.GenericDatumReader<org.apache.avro.generic.GenericRecord>
- Throws:
IOException
-
doReadWithoutConversion
protected Object doReadWithoutConversion(Object old, org.apache.avro.Schema expected, org.apache.avro.io.ResolvingDecoder in) throws IOException
- Throws:
IOException
-
doReadArray
protected Object doReadArray(Object old, org.apache.avro.Schema expected, org.apache.avro.io.ResolvingDecoder in) throws IOException
- Throws:
IOException
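The pairing of a final readArray with an overridable doReadArray suggests a template-method arrangement, in which the final method keeps the bookkeeping and the hook is the customization point. The library-free sketch below illustrates that pattern only. It assumes that the final method delegates to the hook, which the documentation implies but does not state, and all class names and the int[] input are hypothetical stand-ins for the Avro types.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the reader: a final entry point delegating to a hook. */
class HookedReaderSketch {

    private long arraysRead;

    /** Final, like readArray above: subclasses cannot bypass the bookkeeping. */
    final Object readArray(int[] encoded) {
        arraysRead++;
        return doReadArray(encoded);
    }

    /** Default hook: boxes every element into a List. */
    protected Object doReadArray(int[] encoded) {
        List<Integer> boxed = new ArrayList<>(encoded.length);
        for (int value : encoded) {
            boxed.add(value);
        }
        return boxed;
    }

    long arraysRead() {
        return arraysRead;
    }
}

/** Hypothetical subclass that keeps a primitive array to avoid boxing overhead. */
class PrimitiveArrayReaderSketch extends HookedReaderSketch {

    @Override
    protected Object doReadArray(int[] encoded) {
        // Copy so the caller's buffer can be reused safely.
        return encoded.clone();
    }
}
```

A subclass in this style changes only the in-memory representation of the array, and the counting in the final method still runs for every read.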
-
-