Package com.clickhouse.data
Class ClickHouseDataProcessor
java.lang.Object
com.clickhouse.data.ClickHouseDataProcessor
Defines a data processor that handles serialization and deserialization for one or
more ClickHouseFormat values. Unlike ClickHouseDeserializer and ClickHouseSerializer,
which each target a specific column or data type, a data processor combines both and
can also handle format-level concerns such as the separators between columns and rows.
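The division of labor described above can be illustrated with a minimal, self-contained sketch. It does not use the ClickHouse classes: `ColumnWriterSketch` and `SeparatorProcessorSketch` are hypothetical stand-ins, and the tab/newline separators are an assumption chosen to resemble a tab-separated text format. The point is only that a per-column writer knows how to encode one value, while the processor decides what goes between columns and rows.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a per-column serializer; not the ClickHouseSerializer interface.
interface ColumnWriterSketch {
    void write(Object value, OutputStream out) throws IOException;
}

// Toy processor: one writer per column, plus the separators a single writer cannot know about.
class SeparatorProcessorSketch {
    private final List<ColumnWriterSketch> writers;
    private final OutputStream out;
    private int position; // index of the next column to write

    SeparatorProcessorSketch(List<ColumnWriterSketch> writers, OutputStream out) {
        this.writers = writers;
        this.out = out;
    }

    void write(Object value) throws IOException {
        writers.get(position).write(value, out);
        position++;
        if (position == writers.size()) {
            out.write('\n'); // row separator after the last column
            position = 0;    // start over at the first column
        } else {
            out.write('\t'); // column separator between values
        }
    }

    public static void main(String[] args) throws IOException {
        ColumnWriterSketch text =
                (v, o) -> o.write(String.valueOf(v).getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        SeparatorProcessorSketch p =
                new SeparatorProcessorSketch(Arrays.asList(text, text), buf);
        for (Object v : new Object[] {1, "a", 2, "b"}) {
            p.write(v);
        }
        // Two rows of two tab-separated values each
        System.out.print(new String(buf.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

A real subclass would presumably supply these pieces through getSerializer and getDeserializer; the sketch only shows why the separator logic belongs to the processor rather than to any single column serializer.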
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
protected static final class
protected static final class
Field Summary
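Fields
Modifier and Type | Field | Description
protected final ClickHouseDataConfig | config |
static final List<ClickHouseColumn> | DEFAULT_COLUMNS |
protected static final String | ERROR_FAILED_TO_READ |
protected static final String | ERROR_FAILED_TO_WRITE |
protected static final String | ERROR_REACHED_END_OF_STREAM |
protected static final String | ERROR_UNKNOWN_DATA_TYPE |
protected final Map<String,Serializable> | extraProps |
protected final ClickHouseInputStream | input |
protected final ClickHouseOutputStream | output |
protected int | readPosition |
protected ClickHouseDataProcessor.DefaultSerDe | serde |
protected int | writePosition | Column index shared by write(ClickHouseValue).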
Constructor Summary
Constructors
Modifier | Constructor | Description
protected | ClickHouseDataProcessor(ClickHouseDataConfig config, ClickHouseInputStream input, ClickHouseOutputStream output, List<ClickHouseColumn> columns, Map<String, Serializable> settings) | Default constructor.
Method Summary
Modifier and Type | Method | Description
protected ClickHouseDeserializer[] | buildDeserializeSteps(ClickHouseColumn column) | Builds list of steps to deserialize value for the given column.
protected ClickHouseSerializer[] | buildSerializeSteps(ClickHouseColumn column) | Builds list of steps to serialize value for the given column.
final List<ClickHouseColumn> | getColumns() | Gets list of columns to process.
abstract ClickHouseDeserializer | getDeserializer(ClickHouseDataConfig config, ClickHouseColumn column) |
final ClickHouseDeserializer[] | getDeserializers(ClickHouseDataConfig config, List<ClickHouseColumn> columns) |
<T extends Serializable> T | getExtraProperty(String key, Class<T> valueClass) | Gets a typed extra property.
protected final ClickHouseDataProcessor.DefaultSerDe | getInitializedSerDe() |
final ClickHouseInputStream | getInputStream() | Gets input stream.
final ClickHouseOutputStream | getOutputStream() | Gets output stream.
abstract ClickHouseSerializer | getSerializer(ClickHouseDataConfig config, ClickHouseColumn column) |
final ClickHouseSerializer[] | getSerializers(ClickHouseDataConfig config, List<ClickHouseColumn> columns) |
boolean | hasExtraProperties() | Checks whether the processor contains extra property.
protected boolean | hasMoreToRead() | Checks whether there's more to read from input stream.
protected Iterator<ClickHouseRecord> | initRecords() | Initializes iterator of ClickHouseRecord for reading values record by record.
protected Iterator<ClickHouseValue> | initValues() | Initializes iterator of ClickHouseValue for reading values one by one.
ClickHouseValue | read(ClickHouseValue value) | Reads deserialized value of next column (at readPosition) directly from input stream.
protected void | readAndFill(ClickHouseRecord r) | Reads columns (starting from readPosition) from input stream and fills deserialized data into the given record.
protected void | readAndFill(ClickHouseValue value) | Reads next column (at readPosition) from input stream and fills deserialized data into the given value object.
protected abstract List<ClickHouseColumn> | readColumns() | Reads columns from input stream.
final Iterable<ClickHouseRecord> | records() | Returns an iterable collection of records which can be walked through in a foreach loop.
final <T> Iterable<T> | records(Class<T> objClass) | Returns an iterable collection of mapped objects which can be walked through in a foreach loop.
<T> Iterable<T> | records(Class<T> objClass, T template) | Returns an iterable collection of mapped objects which can be walked through in a foreach loop.
final Iterable<ClickHouseValue> | values() | Returns an iterable collection of values which can be walked through in a foreach loop.
void | write(ClickHouseValue value) | Writes serialized value of next column (at writePosition) to output stream.
-
Field Details
-
DEFAULT_COLUMNS
-
ERROR_FAILED_TO_READ
-
ERROR_FAILED_TO_WRITE
-
ERROR_REACHED_END_OF_STREAM
-
ERROR_UNKNOWN_DATA_TYPE
-
config
-
input
-
output
-
extraProps
-
serde
-
readPosition
protected int readPosition
-
writePosition
protected int writePosition
Column index shared by write(ClickHouseValue).
-
-
Constructor Details
-
ClickHouseDataProcessor
protected ClickHouseDataProcessor(ClickHouseDataConfig config, ClickHouseInputStream input, ClickHouseOutputStream output, List<ClickHouseColumn> columns, Map<String, Serializable> settings) throws IOException
Default constructor.
- Parameters:
config - non-null configuration containing information such as the format
input - input stream for deserialization, can be null when output is available
output - output stream for serialization, can be null when input is available
columns - nullable columns
settings - nullable settings
- Throws:
IOException - when failed to read columns from input stream
-
-
Method Details
-
hasMoreToRead
Checks whether there's more to read from input stream.
- Returns:
true if there's more; false otherwise
- Throws:
UncheckedIOException - when failed to read data from input stream
-
buildDeserializeSteps
Builds list of steps to deserialize value for the given column.
- Parameters:
column - non-null column
- Returns:
non-null list of steps for deserialization
-
buildSerializeSteps
Builds list of steps to serialize value for the given column.
- Parameters:
column - non-null column
- Returns:
non-null list of steps for serialization
-
getInitializedSerDe
protected final ClickHouseDataProcessor.DefaultSerDe getInitializedSerDe() throws UncheckedIOException
- Throws:
UncheckedIOException
-
initRecords
Initializes iterator of ClickHouseRecord for reading values record by record. Usually this should be called only once, during instantiation.
- Returns:
non-null iterator of ClickHouseRecord
-
initValues
Initializes iterator of ClickHouseValue for reading values one by one. Usually this should be called only once, during instantiation.
- Returns:
non-null iterator of ClickHouseValue
-
readAndFill
Reads columns (starting from readPosition) from input stream and fills deserialized data into the given record. This method is only used when iterating through records().
- Parameters:
r - non-null record to fill
- Throws:
IOException - when failed to read columns from input stream
-
readAndFill
Reads the next column (at readPosition) from input stream and fills deserialized data into the given value object. This method is mainly used when iterating through values(). In the default implementation, it's also used in readAndFill(ClickHouseRecord) for simplicity.
- Parameters:
value - non-null value object to fill
- Throws:
IOException - when failed to read column from input stream
-
readColumns
Reads columns from input stream. Usually this will be called only once, during instantiation.
- Returns:
non-null list of columns
- Throws:
IOException - when failed to read columns from input stream
-
hasExtraProperties
public boolean hasExtraProperties()
Checks whether the processor contains any extra properties.
- Returns:
true if the processor has extra properties; false otherwise
-
getExtraProperty
Gets a typed extra property.
- Type Parameters:
T - type of the property value
- Parameters:
key - key of the property
valueClass - non-null Java class of the property value
- Returns:
typed extra property, could be null
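A typed lookup over a Map<String, Serializable> such as extraProps can be sketched as below. This is a minimal illustration under stated assumptions, not the library's implementation: `ExtraPropsSketch` and its `put` helper are hypothetical, and the use of `Class.cast` (which returns null for a missing key and throws ClassCastException for a value of another type) is an assumption about reasonable behavior that this page does not confirm.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder mirroring the extraProps field and its two accessors.
class ExtraPropsSketch {
    private final Map<String, Serializable> extraProps = new HashMap<>();

    // Helper for the sketch only; the page does not describe how properties are populated.
    void put(String key, Serializable value) {
        extraProps.put(key, value);
    }

    boolean hasExtraProperties() {
        return !extraProps.isEmpty();
    }

    // Returns the property cast to valueClass, or null when the key is absent.
    // Assumption: a value of a different type raises ClassCastException via Class.cast.
    <T extends Serializable> T getExtraProperty(String key, Class<T> valueClass) {
        return valueClass.cast(extraProps.get(key));
    }

    public static void main(String[] args) {
        ExtraPropsSketch props = new ExtraPropsSketch();
        props.put("rows", 42L);
        Long rows = props.getExtraProperty("rows", Long.class);
        String missing = props.getExtraProperty("missing", String.class);
        System.out.println(rows + " " + missing);
    }
}
```

Because the return value "could be null", callers would typically check for null before unboxing or dereferencing the result.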
-
getDeserializer
public abstract ClickHouseDeserializer getDeserializer(ClickHouseDataConfig config, ClickHouseColumn column)
-
getDeserializers
public final ClickHouseDeserializer[] getDeserializers(ClickHouseDataConfig config, List<ClickHouseColumn> columns)
-
getSerializer
public abstract ClickHouseSerializer getSerializer(ClickHouseDataConfig config, ClickHouseColumn column)
-
getSerializers
public final ClickHouseSerializer[] getSerializers(ClickHouseDataConfig config, List<ClickHouseColumn> columns)
-
getColumns
Gets list of columns to process.
- Returns:
list of columns to process
-
getInputStream
Gets input stream.
- Returns:
input stream, could be null
-
getOutputStream
Gets output stream.
- Returns:
output stream, could be null
-
records
Returns an iterable collection of records which can be walked through in a foreach loop. Note that: 1) UncheckedIOException might be thrown when iterating through the collection; and 2) it is not supposed to be called more than once, because the input stream will be closed at the end of reading.
- Returns:
non-null iterable records
- Throws:
UncheckedIOException - when failed to access the input stream
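The two caveats above (unchecked I/O failures during iteration, and single use because the stream is closed at the end) follow from wrapping a stream in an Iterable. The sketch below reproduces that pattern with plain JDK types; `SingleUseRecordsSketch` is hypothetical, lines of text stand in for ClickHouseRecord, and the failure on a second pass is a property of this sketch rather than documented behavior of the library.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

class SingleUseRecordsSketch {
    // Wraps a reader as an Iterable of lines; each line stands in for one record.
    static Iterable<String> records(BufferedReader in) {
        return () -> new Iterator<String>() {
            private String next;  // record fetched ahead of time, if any
            private boolean done; // true once the end of the stream was reached

            @Override
            public boolean hasNext() {
                if (next != null) {
                    return true;
                }
                if (done) {
                    return false;
                }
                try {
                    next = in.readLine();
                    if (next == null) {
                        done = true;
                        in.close(); // the stream is closed at the end of reading
                    }
                } catch (IOException e) {
                    // Iterator methods cannot throw checked exceptions, hence the wrapper.
                    throw new UncheckedIOException(e);
                }
                return next != null;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                String record = next;
                next = null;
                return record;
            }
        };
    }

    public static void main(String[] args) {
        Iterable<String> records = records(new BufferedReader(new StringReader("a\nb\n")));
        for (String r : records) {
            System.out.println(r);
        }
        try {
            records.iterator().hasNext(); // second pass hits the closed reader
        } catch (UncheckedIOException e) {
            System.out.println("second pass failed: " + e.getCause().getMessage());
        }
    }
}
```

In practice this means consuming the result in a single foreach loop and collecting anything that needs to be revisited into a list.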
-
records
Returns an iterable collection of mapped objects which can be walked through in a foreach loop. Same as records(objClass, null).
- Type Parameters:
T - type of the mapped object
- Parameters:
objClass - non-null class of the mapped object
- Returns:
non-null iterable collection
- Throws:
UncheckedIOException - when failed to read data (e.g. deserialization)
-
records
Returns an iterable collection of mapped objects which can be walked through in a foreach loop. When objClass is null or ClickHouseRecord, this is the same as calling records().
- Type Parameters:
T - type of the mapped object
- Parameters:
objClass - non-null class of the mapped object
template - optional template object to reuse
- Returns:
non-null iterable collection
- Throws:
UncheckedIOException - when failed to read data (e.g. deserialization)
-
values
Returns an iterable collection of values which can be walked through in a foreach loop. In general, this is slower than records(), because the latter reads data in bulk. However, it is particularly useful when reading large values with limited memory, e.g. a binary field of a few GB. As with records(), the input stream will be closed at the end of reading.
- Returns:
non-null iterable values
- Throws:
UncheckedIOException - when failed to access the input stream
-
read
Reads deserialized value of next column (at readPosition) directly from input stream. Unlike records(), which reads multiple values at a time, this method will only read one for each call.
- Parameters:
value - value to update, could be null
- Returns:
updated value, or a new ClickHouseValue when it is null
- Throws:
IOException - when failed to read data from input stream
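The "update the given value, or create one when it is null" contract can be sketched with JDK types only. Everything here is illustrative: `ReadOneSketch` and its `Holder` class are hypothetical stand-ins for the processor and ClickHouseValue, values are fixed-width ints for simplicity, and wrapping the position back to 0 after the last column is an assumption that this page does not state.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

class ReadOneSketch {
    // Hypothetical mutable holder standing in for ClickHouseValue.
    static final class Holder {
        int value;
    }

    private final DataInputStream input;
    private final int columnCount;
    int readPosition; // index of the column the next read() belongs to

    ReadOneSketch(InputStream input, int columnCount) {
        this.input = new DataInputStream(input);
        this.columnCount = columnCount;
    }

    // Reads exactly one value per call; updates the given holder or allocates one when it is null.
    Holder read(Holder value) throws IOException {
        Holder target = value != null ? value : new Holder();
        target.value = input.readInt();
        // Assumption: the position wraps to 0 after the last column.
        readPosition = (readPosition + 1) % columnCount;
        return target;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = ByteBuffer.allocate(8).putInt(7).putInt(9).array();
        ReadOneSketch reader = new ReadOneSketch(new ByteArrayInputStream(data), 2);
        Holder first = reader.read(null);   // allocates a new holder
        System.out.println(first.value);
        Holder second = reader.read(first); // reuses the same holder
        System.out.println(second.value + " " + (second == first));
    }
}
```

Passing the previous holder back in avoids one allocation per value, which matters mostly when reading many values in a tight loop.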
-
write
Writes serialized value of next column (at writePosition) to output stream.
- Parameters:
value - non-null value to be serialized
- Throws:
IOException - when failed to write data to output stream
-