OmniSciDB  ca0c39ec8f
Encoder Class Reference (abstract)

#include <Encoder.h>


Public Member Functions

 Encoder (Data_Namespace::AbstractBuffer *buffer)
 
virtual ~Encoder ()
 
virtual size_t getNumElemsForBytesEncodedDataAtIndices (const int8_t *index_data, const std::vector< size_t > &selected_idx, const size_t byte_limit)=0
 
virtual std::shared_ptr< ChunkMetadata > appendEncodedDataAtIndices (const int8_t *index_data, int8_t *data, const std::vector< size_t > &selected_idx)=0

virtual std::shared_ptr< ChunkMetadata > appendEncodedData (const int8_t *index_data, int8_t *data, const size_t start_idx, const size_t num_elements)=0

virtual std::shared_ptr< ChunkMetadata > appendData (int8_t *&src_data, const size_t num_elems_to_append, const SQLTypeInfo &ti, const bool replicating=false, const int64_t offset=-1)=0

virtual void getMetadata (const std::shared_ptr< ChunkMetadata > &chunkMetadata)

virtual std::shared_ptr< ChunkMetadata > getMetadata (const SQLTypeInfo &ti)=0
 
virtual void updateStats (const int64_t val, const bool is_null)=0
 
virtual void updateStats (const double val, const bool is_null)=0
 
virtual void updateStats (const int8_t *const src_data, const size_t num_elements)=0
 
virtual void updateStatsEncoded (const int8_t *const dst_data, const size_t num_elements)
 
virtual void updateStats (const std::vector< std::string > *const src_data, const size_t start_idx, const size_t num_elements)=0
 
virtual void updateStats (const std::vector< ArrayDatum > *const src_data, const size_t start_idx, const size_t num_elements)=0
 
virtual void reduceStats (const Encoder &)=0
 
virtual void copyMetadata (const Encoder *copyFromEncoder)=0
 
virtual void writeMetadata (FILE *f)=0
 
virtual void readMetadata (FILE *f)=0
 
virtual bool resetChunkStats (const ChunkStats &)
 Reset chunk level stats (min, max, nulls) using new values from the argument.
 
virtual void resetChunkStats ()=0
 
size_t getNumElems () const
 
void setNumElems (const size_t num_elems)
 

Static Public Member Functions

static Encoder * Create (Data_Namespace::AbstractBuffer *buffer, const SQLTypeInfo sqlType)
 

Protected Attributes

size_t num_elems_
 
Data_Namespace::AbstractBuffer * buffer_
 
DecimalOverflowValidator decimal_overflow_validator_
 
DateDaysOverflowValidator date_days_overflow_validator_
 

Detailed Description

Definition at line 146 of file Encoder.h.

Constructor & Destructor Documentation

Encoder::Encoder ( Data_Namespace::AbstractBuffer *  buffer)

Definition at line 225 of file Encoder.cpp.

226  : num_elems_(0)
227  , buffer_(buffer)
228  , decimal_overflow_validator_(buffer ? buffer->getSqlType() : SQLTypeInfo())
229  , date_days_overflow_validator_(buffer ? buffer->getSqlType() : SQLTypeInfo()){};
virtual Encoder::~Encoder ( )
inline virtual

Definition at line 151 of file Encoder.h.

151 {}

Member Function Documentation

virtual std::shared_ptr<ChunkMetadata> Encoder::appendData ( int8_t *&  src_data,
const size_t  num_elems_to_append,
const SQLTypeInfo &  ti,
const bool  replicating = false,
const int64_t  offset = -1 
)
pure virtual

Append data to the chunk buffer backing this encoder.

Parameters
src_data- Source data for the append
num_elems_to_append- Number of elements to append
ti- SQL Type Info for the column TODO(adb): used?
replicating- Pass one value and fill the chunk with it
offset- Write data starting at a given offset. The default of -1 indicates an append; an offset of 0 rewrites the chunk up to num_elems_to_append.

Implemented in FixedLengthArrayNoneEncoder, ArrayNoneEncoder, NoneEncoder< T >, FixedLengthEncoder< T, V >, DateDaysEncoder< T, V >, and StringNoneEncoder.

Referenced by foreign_storage::GeospatialEncoder::appendBaseAndRenderGroupDataAndUpdateMetadata(), and Chunk_NS::Chunk::appendData().
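
The parameter semantics above can be illustrated with a small toy model. This is a sketch only, not the OmniSciDB implementation: `ToyChunk` and `toy_append_data` are hypothetical names, the chunk is modeled as a vector of `int32_t`, and `offset` is assumed to count elements. The source does not state what happens to elements past the rewritten range; this sketch leaves them untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a chunk buffer of fixed-width int32_t values.
struct ToyChunk {
  std::vector<int32_t> values;
};

// Toy model of the documented appendData parameters:
//  - offset == -1 appends at the end of the chunk
//  - offset >= 0 overwrites starting at that element position (assumption)
//  - replicating == true repeats src[0] num_elems_to_append times
// Returns the number of elements in the chunk after the write.
inline size_t toy_append_data(ToyChunk& chunk,
                              const int32_t* src,
                              const size_t num_elems_to_append,
                              const bool replicating = false,
                              const int64_t offset = -1) {
  const size_t start =
      offset < 0 ? chunk.values.size() : static_cast<size_t>(offset);
  // Grow the chunk only if the write extends past its current end.
  if (chunk.values.size() < start + num_elems_to_append) {
    chunk.values.resize(start + num_elems_to_append);
  }
  for (size_t i = 0; i < num_elems_to_append; ++i) {
    chunk.values[start + i] = replicating ? src[0] : src[i];
  }
  return chunk.values.size();
}
```

Calling `toy_append_data(chunk, src, n)` grows the chunk by `n` elements, while passing `offset = 0` overwrites the first `n` elements in place.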


virtual std::shared_ptr<ChunkMetadata> Encoder::appendEncodedData ( const int8_t *  index_data,
int8_t *  data,
const size_t  start_idx,
const size_t  num_elements 
)
pure virtual

Append encoded data to the chunk buffer backing this encoder.

Parameters
index_data- (optional) the index data of data to append
data- the data to append
start_idx- the position to start encoding from in the data array
num_elements- the number of elements to encode from the data array
Returns
updated chunk metadata for the chunk buffer backing this encoder

NOTE: index_data must be non-null for varlen encoder types.

Implemented in ArrayNoneEncoder, FixedLengthArrayNoneEncoder, NoneEncoder< T >, FixedLengthEncoder< T, V >, StringNoneEncoder, and DateDaysEncoder< T, V >.

Referenced by Chunk_NS::Chunk::appendEncodedData().
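
To show why index_data matters for variable-length types, here is a minimal sketch of selecting the byte range for elements [start_idx, start_idx + num_elements). The layout is an assumption made for illustration: the index is taken to be an array of byte offsets with one more entry than there are elements. `toy_slice_varlen` is a hypothetical helper, not part of the Encoder API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Assumed layout for this sketch: element i occupies bytes
// [offsets[i], offsets[i + 1]) of the payload.
inline std::string toy_slice_varlen(const std::vector<int32_t>& offsets,
                                    const std::string& payload,
                                    const size_t start_idx,
                                    const size_t num_elements) {
  // The last offset read is offsets[start_idx + num_elements].
  assert(start_idx + num_elements < offsets.size());
  const size_t begin = static_cast<size_t>(offsets[start_idx]);
  const size_t end = static_cast<size_t>(offsets[start_idx + num_elements]);
  return payload.substr(begin, end - begin);
}
```

Without the offsets there is no way to tell where one variable-length element ends and the next begins, which is consistent with the note that index_data must be non-null for varlen encoders.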


virtual std::shared_ptr<ChunkMetadata> Encoder::appendEncodedDataAtIndices ( const int8_t *  index_data,
int8_t *  data,
const std::vector< size_t > &  selected_idx 
)
pure virtual

Append selected encoded data to the chunk buffer backing this encoder.

Parameters
index_data- (optional) the index data of data to append
data- the data to append
selected_idx- which indices in the encoded data to append
Returns
updated chunk metadata for the chunk buffer backing this encoder

NOTE: index_data must be non-null for varlen encoder types.

Implemented in ArrayNoneEncoder, FixedLengthArrayNoneEncoder, StringNoneEncoder, NoneEncoder< T >, FixedLengthEncoder< T, V >, and DateDaysEncoder< T, V >.

Referenced by Chunk_NS::Chunk::appendEncodedDataAtIndices().
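
For fixed-width types, appending at selected indices amounts to a gather over the encoded buffer. The sketch below assumes a fixed element size and a plain byte vector as the destination; `toy_append_at_indices` is a hypothetical helper and does not update any chunk metadata.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copies the elements named by selected_idx, in that order, from a
// fixed-width encoded buffer to the end of dst.
// Returns the number of elements appended.
inline size_t toy_append_at_indices(std::vector<int8_t>& dst,
                                    const int8_t* data,
                                    const size_t elem_size,
                                    const std::vector<size_t>& selected_idx) {
  for (const size_t idx : selected_idx) {
    const size_t old_size = dst.size();
    dst.resize(old_size + elem_size);
    std::memcpy(dst.data() + old_size, data + idx * elem_size, elem_size);
  }
  return selected_idx.size();
}
```

The order of `selected_idx` determines the order of the appended elements, so `{3, 1}` appends element 3 followed by element 1.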


virtual void Encoder::copyMetadata ( const Encoder *  copyFromEncoder)
pure virtual
Encoder * Encoder::Create ( Data_Namespace::AbstractBuffer *  buffer,
const SQLTypeInfo  sqlType 
)
static

Definition at line 26 of file Encoder.cpp.

References CHECK, SQLTypeInfo::get_comp_param(), SQLTypeInfo::get_compression(), SQLTypeInfo::get_size(), SQLTypeInfo::get_subtype(), SQLTypeInfo::get_type(), IS_STRING, SQLTypeInfo::is_string(), kARRAY, kBIGINT, kBOOLEAN, kCHAR, kDATE, kDECIMAL, kDOUBLE, kENCODING_DATE_IN_DAYS, kENCODING_DICT, kENCODING_FIXED, kENCODING_GEOINT, kENCODING_NONE, kFLOAT, kINT, kLINESTRING, kMULTILINESTRING, kMULTIPOINT, kMULTIPOLYGON, kNUMERIC, kPOINT, kPOLYGON, kSMALLINT, kTEXT, kTIME, kTIMESTAMP, kTINYINT, and kVARCHAR.

Referenced by Data_Namespace::AbstractBuffer::initEncoder(), and synthesize_metadata().

27  {
28  switch (sqlType.get_compression()) {
29  case kENCODING_NONE: {
30  switch (sqlType.get_type()) {
31  case kBOOLEAN: {
32  return new NoneEncoder<int8_t>(buffer);
33  break;
34  }
35  case kTINYINT: {
36  return new NoneEncoder<int8_t>(buffer);
37  break;
38  }
39  case kSMALLINT: {
40  return new NoneEncoder<int16_t>(buffer);
41  break;
42  }
43  case kINT: {
44  return new NoneEncoder<int32_t>(buffer);
45  break;
46  }
47  case kBIGINT:
48  case kNUMERIC:
49  case kDECIMAL: {
50  return new NoneEncoder<int64_t>(buffer);
51  break;
52  }
53  case kFLOAT: {
54  return new NoneEncoder<float>(buffer);
55  break;
56  }
57  case kDOUBLE: {
58  return new NoneEncoder<double>(buffer);
59  break;
60  }
61  case kTEXT:
62  case kVARCHAR:
63  case kCHAR:
64  return new StringNoneEncoder(buffer);
65  case kARRAY: {
66  if (sqlType.get_size() > 0) {
67  return new FixedLengthArrayNoneEncoder(buffer, sqlType.get_size());
68  }
69  return new ArrayNoneEncoder(buffer);
70  }
71  case kTIME:
72  case kTIMESTAMP:
73  case kDATE:
74  return new NoneEncoder<int64_t>(buffer);
75  case kPOINT:
76  case kMULTIPOINT:
77  case kLINESTRING:
78  case kMULTILINESTRING:
79  case kPOLYGON:
80  case kMULTIPOLYGON:
81  return new StringNoneEncoder(buffer);
82  default: {
83  return 0;
84  }
85  }
86  break;
87  }
88  case kENCODING_DATE_IN_DAYS: {
89  switch (sqlType.get_type()) {
90  case kDATE:
91  switch (sqlType.get_comp_param()) {
92  case 0:
93  case 32:
94  return new DateDaysEncoder<int64_t, int32_t>(buffer);
95  break;
96  case 16:
97  return new DateDaysEncoder<int64_t, int16_t>(buffer);
98  break;
99  default:
100  return 0;
101  break;
102  }
103  break;
104  default:
105  return 0;
106  break;
107  }
108  }
109  case kENCODING_FIXED: {
110  switch (sqlType.get_type()) {
111  case kSMALLINT: {
112  switch (sqlType.get_comp_param()) {
113  case 8:
114  return new FixedLengthEncoder<int16_t, int8_t>(buffer);
115  break;
116  case 16:
117  return new NoneEncoder<int16_t>(buffer);
118  break;
119  default:
120  return 0;
121  break;
122  }
123  break;
124  }
125  case kINT: {
126  switch (sqlType.get_comp_param()) {
127  case 8:
128  return new FixedLengthEncoder<int32_t, int8_t>(buffer);
129  break;
130  case 16:
131  return new FixedLengthEncoder<int32_t, int16_t>(buffer);
132  break;
133  case 32:
134  return new NoneEncoder<int32_t>(buffer);
135  break;
136  default:
137  return 0;
138  break;
139  }
140  break;
141  }
142  case kBIGINT:
143  case kNUMERIC:
144  case kDECIMAL: {
145  switch (sqlType.get_comp_param()) {
146  case 8:
147  return new FixedLengthEncoder<int64_t, int8_t>(buffer);
148  break;
149  case 16:
150  return new FixedLengthEncoder<int64_t, int16_t>(buffer);
151  break;
152  case 32:
153  return new FixedLengthEncoder<int64_t, int32_t>(buffer);
154  break;
155  case 64:
156  return new NoneEncoder<int64_t>(buffer);
157  break;
158  default:
159  return 0;
160  break;
161  }
162  break;
163  }
164  case kTIME:
165  case kTIMESTAMP:
166  case kDATE:
167  return new FixedLengthEncoder<int64_t, int32_t>(buffer);
168  break;
169  default: {
170  return 0;
171  break;
172  }
173  } // switch (sqlType)
174  break;
175  } // Case: kENCODING_FIXED
176  case kENCODING_DICT: {
177  if (sqlType.get_type() == kARRAY) {
178  CHECK(IS_STRING(sqlType.get_subtype()));
179  if (sqlType.get_size() > 0) {
180  return new FixedLengthArrayNoneEncoder(buffer, sqlType.get_size());
181  }
182  return new ArrayNoneEncoder(buffer);
183  } else {
184  CHECK(sqlType.is_string());
185  switch (sqlType.get_size()) {
186  case 1:
187  return new NoneEncoder<uint8_t>(buffer);
188  break;
189  case 2:
190  return new NoneEncoder<uint16_t>(buffer);
191  break;
192  case 4:
193  return new NoneEncoder<int32_t>(buffer);
194  break;
195  default:
196  CHECK(false);
197  break;
198  }
199  }
200  break;
201  }
202  case kENCODING_GEOINT: {
203  switch (sqlType.get_type()) {
204  case kPOINT:
205  case kMULTIPOINT:
206  case kLINESTRING:
207  case kMULTILINESTRING:
208  case kPOLYGON:
209  case kMULTIPOLYGON:
210  return new StringNoneEncoder(buffer);
211  default: {
212  return 0;
213  }
214  }
215  break;
216  }
217  default: {
218  return 0;
219  break;
220  }
221  } // switch (encodingType)
222  return 0;
223 }
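
The kENCODING_FIXED branch of the listing can be summarized as a mapping from (type, comp_param) to the stored element width. The sketch below restates that mapping in standalone form; `ToyType` and `toy_fixed_encoding_width` are hypothetical names, and a return value of 0 stands in for the null pointer that Create returns for unsupported combinations.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical grouping of the SQL types handled under kENCODING_FIXED:
// kBigInt covers kBIGINT, kNUMERIC and kDECIMAL; kTime covers kTIME,
// kTIMESTAMP and kDATE; kFloat represents any type hitting the default case.
enum class ToyType { kSmallInt, kInt, kBigInt, kTime, kFloat };

// Width in bytes of one stored element, following the
// FixedLengthEncoder<T, V> / NoneEncoder<T> choices in the listing.
// Returns 0 where Encoder::Create would return a null pointer.
inline size_t toy_fixed_encoding_width(const ToyType type, const int comp_param) {
  const bool is_8_or_16 = comp_param == 8 || comp_param == 16;
  switch (type) {
    case ToyType::kSmallInt:
      return is_8_or_16 ? static_cast<size_t>(comp_param / 8) : 0;
    case ToyType::kInt:
      return (is_8_or_16 || comp_param == 32)
                 ? static_cast<size_t>(comp_param / 8)
                 : 0;
    case ToyType::kBigInt:
      return (is_8_or_16 || comp_param == 32 || comp_param == 64)
                 ? static_cast<size_t>(comp_param / 8)
                 : 0;
    case ToyType::kTime:
      return 4;  // FixedLengthEncoder<int64_t, int32_t> for any comp_param
    case ToyType::kFloat:
      return 0;
  }
  return 0;
}
```

Because the listing returns 0 on several paths, callers of Create presumably need to check the returned pointer before using it.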


void Encoder::getMetadata ( const std::shared_ptr< ChunkMetadata > &  chunkMetadata)
virtual

Reimplemented in ArrayNoneEncoder, FixedLengthArrayNoneEncoder, NoneEncoder< T >, FixedLengthEncoder< T, V >, DateDaysEncoder< T, V >, and StringNoneEncoder.

Definition at line 231 of file Encoder.cpp.

References buffer_, Data_Namespace::AbstractBuffer::getSqlType(), num_elems_, and Data_Namespace::AbstractBuffer::size().

Referenced by foreign_storage::ForeignStorageCache::cacheMetadataVec(), foreign_storage::get_placeholder_metadata(), foreign_storage::Csv::get_placeholder_metadata(), StringNoneEncoder::getMetadata(), DateDaysEncoder< T, V >::getMetadata(), FixedLengthEncoder< T, V >::getMetadata(), NoneEncoder< T >::getMetadata(), FixedLengthArrayNoneEncoder::getMetadata(), and ArrayNoneEncoder::getMetadata().

231  {
232  chunkMetadata->sqlType = buffer_->getSqlType();
233  chunkMetadata->numBytes = buffer_->size();
234  chunkMetadata->numElements = num_elems_;
235 }


virtual std::shared_ptr<ChunkMetadata> Encoder::getMetadata ( const SQLTypeInfo &  ti)
pure virtual
size_t Encoder::getNumElems ( ) const
inline

Definition at line 284 of file Encoder.h.

References num_elems_.

Referenced by Fragmenter_Namespace::compute_row_indices_of_shards(), StringNoneEncoder::copyMetadata(), DateDaysEncoder< T, V >::copyMetadata(), FixedLengthEncoder< T, V >::copyMetadata(), NoneEncoder< T >::copyMetadata(), FixedLengthArrayNoneEncoder::copyMetadata(), and ArrayNoneEncoder::copyMetadata().

284 { return num_elems_; }


virtual size_t Encoder::getNumElemsForBytesEncodedDataAtIndices ( const int8_t *  index_data,
const std::vector< size_t > &  selected_idx,
const size_t  byte_limit 
)
pure virtual

Compute the maximum number of variable-length encoded elements that fit within a given byte limit.

Parameters
index_data- (optional) index data for the encoded type
selected_idx- which indices in the encoded data to consider
byte_limit- byte limit that must be respected
Returns
the number of elements

NOTE: whether the optional parameters above are required or ignored depends on the encoder type backing the implementation.

Implemented in ArrayNoneEncoder, StringNoneEncoder, FixedLengthArrayNoneEncoder, NoneEncoder< T >, FixedLengthEncoder< T, V >, and DateDaysEncoder< T, V >.

Referenced by Chunk_NS::Chunk::getNumElemsForBytesEncodedDataAtIndices().
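
One plausible reading of this contract for variable-length data is sketched below. It assumes the index is an array of byte offsets (element i spans offsets[i] to offsets[i + 1]) and that elements are counted in the order given by selected_idx until the next one would exceed the limit. Both are assumptions for illustration; `toy_num_elems_for_bytes` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Counts how many leading entries of selected_idx fit within byte_limit,
// given per-element byte offsets (assumed layout, see lead-in).
inline size_t toy_num_elems_for_bytes(const std::vector<int32_t>& offsets,
                                      const std::vector<size_t>& selected_idx,
                                      const size_t byte_limit) {
  size_t used_bytes = 0;
  size_t count = 0;
  for (const size_t idx : selected_idx) {
    assert(idx + 1 < offsets.size());
    const size_t len = static_cast<size_t>(offsets[idx + 1] - offsets[idx]);
    if (used_bytes + len > byte_limit) {
      break;  // the next element would exceed the limit
    }
    used_bytes += len;
    ++count;
  }
  return count;
}
```

For a fixed-width type the same question reduces to a division of the byte limit by the element size, which is why the index data is optional there.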


virtual void Encoder::readMetadata ( FILE *  f)
pure virtual
virtual void Encoder::reduceStats ( const Encoder & )
pure virtual
virtual bool Encoder::resetChunkStats ( const ChunkStats & )
inline virtual

Reset chunk level stats (min, max, nulls) using new values from the argument.

Returns
True if an update occurred and the chunk needs to be flushed, false otherwise. Returns false by default if metadata update is unsupported. Chunk stats are reset only if the incoming stats differ from the current stats.

Reimplemented in ArrayNoneEncoder, FixedLengthArrayNoneEncoder, FixedLengthEncoder< T, V >, NoneEncoder< T >, DateDaysEncoder< T, V >, and StringNoneEncoder.

Definition at line 274 of file Encoder.h.

References UNREACHABLE.

Referenced by foreign_storage::anonymous_namespace{ParquetDataWrapper.cpp}::reduce_metadata(), and foreign_storage::anonymous_namespace{ForeignStorageCache.cpp}::set_metadata_for_buffer().

274  {
275  UNREACHABLE() << "Attempting to reset stats for unsupported type.";
276  return false;
277  }
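
The base implementation above only reports an unsupported type; the documented return contract applies to the overriding encoders. A minimal sketch of that contract follows, using a hypothetical `ToyChunkStats` struct with integer min/max rather than the real ChunkStats type.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical simplified chunk stats for an integer column.
struct ToyChunkStats {
  int64_t min;
  int64_t max;
  bool has_nulls;
};

// Replaces current with incoming only when they differ.
// Returns true if an update occurred (the chunk would need a flush).
inline bool toy_reset_chunk_stats(ToyChunkStats& current,
                                  const ToyChunkStats& incoming) {
  const bool unchanged = current.min == incoming.min &&
                         current.max == incoming.max &&
                         current.has_nulls == incoming.has_nulls;
  if (unchanged) {
    return false;
  }
  current = incoming;
  return true;
}
```

Returning false for identical stats lets a caller skip an unnecessary flush.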


virtual void Encoder::resetChunkStats ( )
pure virtual

Resets chunk metadata stats to their default values.

Implemented in ArrayNoneEncoder, FixedLengthArrayNoneEncoder, FixedLengthEncoder< T, V >, NoneEncoder< T >, DateDaysEncoder< T, V >, and StringNoneEncoder.

void Encoder::setNumElems ( const size_t  num_elems)
inline

Definition at line 285 of file Encoder.h.

References num_elems_.

Referenced by DBHandler::insert_chunks(), and foreign_storage::anonymous_namespace{ForeignStorageCache.cpp}::set_metadata_for_buffer().

285 { num_elems_ = num_elems; }


virtual void Encoder::updateStats ( const int64_t  val,
const bool  is_null 
)
pure virtual

Implemented in ArrayNoneEncoder, FixedLengthArrayNoneEncoder, NoneEncoder< T >, FixedLengthEncoder< T, V >, DateDaysEncoder< T, V >, and StringNoneEncoder.

Referenced by foreign_storage::update_stats().


virtual void Encoder::updateStats ( const double  val,
const bool  is_null 
)
pure virtual
virtual void Encoder::updateStats ( const int8_t *const  src_data,
const size_t  num_elements 
)
pure virtual

Update statistics for data without appending.

Parameters
src_data- the data with which to update statistics
num_elements- the number of elements to scan in the data

Implemented in ArrayNoneEncoder, FixedLengthArrayNoneEncoder, NoneEncoder< T >, FixedLengthEncoder< T, V >, DateDaysEncoder< T, V >, and StringNoneEncoder.
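
Updating statistics without appending can be pictured as a single scan that tracks min, max and the presence of nulls. The sketch below assumes `int32_t` elements and uses the smallest `int32_t` value as the null sentinel; the sentinel choice, `ToyStats` and `toy_update_stats` are assumptions for illustration and are not taken from this page.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical running statistics for an int32_t column.
struct ToyStats {
  int32_t min = std::numeric_limits<int32_t>::max();
  int32_t max = std::numeric_limits<int32_t>::lowest();
  bool has_nulls = false;
};

// Assumed null sentinel for this sketch.
constexpr int32_t kToyNull = std::numeric_limits<int32_t>::min();

// Scans num_elements values and folds them into stats.
// Nothing is written to any buffer.
inline void toy_update_stats(ToyStats& stats,
                             const int32_t* src_data,
                             const size_t num_elements) {
  for (size_t i = 0; i < num_elements; ++i) {
    if (src_data[i] == kToyNull) {
      stats.has_nulls = true;
      continue;  // nulls do not affect min/max
    }
    stats.min = std::min(stats.min, src_data[i]);
    stats.max = std::max(stats.max, src_data[i]);
  }
}
```

Repeated calls accumulate into the same `ToyStats`, so the result reflects every batch scanned so far.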

virtual void Encoder::updateStats ( const std::vector< std::string > *const  src_data,
const size_t  start_idx,
const size_t  num_elements 
)
pure virtual

Update statistics for string data without appending.

Parameters
src_data- the string data with which to update statistics
start_idx- the offset into src_data to start the update
num_elements- the number of elements to scan in the string data

Implemented in ArrayNoneEncoder, FixedLengthArrayNoneEncoder, FixedLengthEncoder< T, V >, NoneEncoder< T >, DateDaysEncoder< T, V >, and StringNoneEncoder.

virtual void Encoder::updateStats ( const std::vector< ArrayDatum > *const  src_data,
const size_t  start_idx,
const size_t  num_elements 
)
pure virtual

Update statistics for array data without appending.

Parameters
src_data- the array data with which to update statistics
start_idx- the offset into src_data to start the update
num_elements- the number of elements to scan in the array data

Implemented in ArrayNoneEncoder, FixedLengthArrayNoneEncoder, FixedLengthEncoder< T, V >, NoneEncoder< T >, DateDaysEncoder< T, V >, and StringNoneEncoder.

virtual void Encoder::updateStatsEncoded ( const int8_t *const  dst_data,
const size_t  num_elements 
)
inline virtual

Update statistics for encoded data without appending.

Parameters
dst_data- the data with which to update statistics
num_elements- the number of elements to scan in the data

Reimplemented in FixedLengthEncoder< T, V >, and NoneEncoder< T >.

Definition at line 236 of file Encoder.h.

References UNREACHABLE.

237  {
238  UNREACHABLE();
239  }
virtual void Encoder::writeMetadata ( FILE *  f)
pure virtual

Member Data Documentation

DateDaysOverflowValidator Encoder::date_days_overflow_validator_
protected

Definition at line 293 of file Encoder.h.

Referenced by DateDaysEncoder< T, V >::encodeDataAndUpdateStats().


The documentation for this class was generated from the following files: