OmniSciDB  72c90bc290
StringNoneEncoder Class Reference

#include <StringNoneEncoder.h>


Public Member Functions

 StringNoneEncoder (AbstractBuffer *buffer)
 
size_t getNumElemsForBytesInsertData (const std::vector< std::string > *srcData, const int start_idx, const size_t numAppendElems, const size_t byteLimit, const bool replicating=false)
 
size_t getNumElemsForBytesEncodedDataAtIndices (const int8_t *index_data, const std::vector< size_t > &selected_idx, const size_t byte_limit) override
 
std::shared_ptr< ChunkMetadata > appendData (int8_t *&src_data, const size_t num_elems_to_append, const SQLTypeInfo &ti, const bool replicating=false, const int64_t offset=-1) override
 
std::shared_ptr< ChunkMetadata > appendEncodedDataAtIndices (const int8_t *index_data, int8_t *data, const std::vector< size_t > &selected_idx) override
 
std::shared_ptr< ChunkMetadata > appendEncodedData (const int8_t *index_data, int8_t *data, const size_t start_idx, const size_t num_elements) override
 
template<typename StringType >
std::shared_ptr< ChunkMetadata > appendData (const StringType *srcData, const int start_idx, const size_t numAppendElems, const bool replicating=false)
 
template<typename StringType >
std::shared_ptr< ChunkMetadata > appendData (const std::vector< StringType > *srcData, const int start_idx, const size_t numAppendElems, const bool replicating=false)
 
void getMetadata (const std::shared_ptr< ChunkMetadata > &chunkMetadata) override
 
std::shared_ptr< ChunkMetadata > getMetadata (const SQLTypeInfo &ti) override
 
void updateStats (const int64_t, const bool) override
 
void updateStats (const double, const bool) override
 
void updateStats (const int8_t *const src_data, const size_t num_elements) override
 
void updateStats (const std::vector< std::string > *const src_data, const size_t start_idx, const size_t num_elements) override
 
void updateStats (const std::vector< ArrayDatum > *const src_data, const size_t start_idx, const size_t num_elements) override
 
void reduceStats (const Encoder &) override
 
void writeMetadata (FILE *f) override
 
void readMetadata (FILE *f) override
 
void copyMetadata (const Encoder *copyFromEncoder) override
 
AbstractBuffer * getIndexBuf () const
 
void setIndexBuffer (AbstractBuffer *buf)
 
bool resetChunkStats (const ChunkStats &stats) override
 Reset chunk-level stats (min, max, nulls) using new values from the argument.
 
void resetChunkStats () override
 
- Public Member Functions inherited from Encoder
 Encoder (Data_Namespace::AbstractBuffer *buffer)
 
virtual ~Encoder ()
 
virtual void updateStatsEncoded (const int8_t *const dst_data, const size_t num_elements)
 
size_t getNumElems () const
 
void setNumElems (const size_t num_elems)
 

Static Public Member Functions

static std::string_view getStringAtIndex (const int8_t *index_data, const int8_t *data, size_t index)
 
- Static Public Member Functions inherited from Encoder
static Encoder * Create (Data_Namespace::AbstractBuffer *buffer, const SQLTypeInfo sqlType)
 

Private Member Functions

template<typename StringType >
void update_elem_stats (const StringType &elem)
 

Static Private Member Functions

static std::pair< StringOffsetT, StringOffsetT > getStringOffsets (const int8_t *index_data, size_t index)
 
static size_t getStringSizeAtIndex (const int8_t *index_data, size_t index)
 

Private Attributes

AbstractBuffer * index_buf
 
StringOffsetT last_offset
 
bool has_nulls
 

Additional Inherited Members

- Protected Attributes inherited from Encoder
size_t num_elems_
 
Data_Namespace::AbstractBuffer * buffer_
 
DecimalOverflowValidator decimal_overflow_validator_
 
DateDaysOverflowValidator date_days_overflow_validator_
 

Detailed Description

Definition at line 36 of file StringNoneEncoder.h.

Constructor & Destructor Documentation

StringNoneEncoder::StringNoneEncoder ( AbstractBuffer *  buffer)
inline

Definition at line 38 of file StringNoneEncoder.h.

: Encoder(buffer), index_buf(nullptr), last_offset(-1), has_nulls(false) {}

Member Function Documentation

std::shared_ptr<ChunkMetadata> StringNoneEncoder::appendData ( int8_t *&  src_data,
const size_t  num_elems_to_append,
const SQLTypeInfo &  ti,
const bool  replicating = false,
const int64_t  offset = -1 
)
inline override virtual

Append data to the chunk buffer backing this encoder.

Parameters
src_data- Source data for the append
num_elems_to_append- Number of elements to append
ti- SQL Type Info for the column TODO(adb): used?
replicating- Pass one value and fill the chunk with it
offset- Write data starting at a given offset. The default of -1 indicates an append; an offset of 0 rewrites the chunk up to num_elems_to_append.

Implements Encoder.

Definition at line 51 of file StringNoneEncoder.h.

References UNREACHABLE.

Referenced by foreign_storage::GeospatialEncoder::appendBaseDataAndUpdateMetadata(), appendData(), Chunk_NS::Chunk::appendData(), appendEncodedData(), appendEncodedDataAtIndices(), and data_conversion::StringViewToStringNoneEncoder::encodeAndAppendData().

{
  UNREACHABLE();  // should never be called for strings
  return nullptr;
}

template<typename StringType >
std::shared_ptr< ChunkMetadata > StringNoneEncoder::appendData ( const StringType *  srcData,
const int  start_idx,
const size_t  numAppendElems,
const bool  replicating = false 
)

Definition at line 102 of file StringNoneEncoder.cpp.

References Data_Namespace::AbstractBuffer::append(), Encoder::buffer_, CHECK, CHECK_GE, Data_Namespace::CPU_LEVEL, getMetadata(), index_buf, Data_Namespace::AbstractBuffer::isDirty(), last_offset, MAX_INPUT_BUF_SIZE, Encoder::num_elems_, Data_Namespace::AbstractBuffer::read(), Data_Namespace::AbstractBuffer::reserve(), Data_Namespace::AbstractBuffer::setDirty(), Data_Namespace::AbstractBuffer::size(), and update_elem_stats().

{
  CHECK(index_buf);  // index_buf must be set before this.
  size_t append_index_size = numAppendElems * sizeof(StringOffsetT);
  if (num_elems_ == 0) {
    append_index_size += sizeof(StringOffsetT);  // plus one for the initial offset of 0.
  }
  index_buf->reserve(index_buf->size() + append_index_size);
  StringOffsetT offset = 0;
  if (num_elems_ == 0) {
    index_buf->append((int8_t*)&offset,
                      sizeof(StringOffsetT));  // write the initial 0 offset
    last_offset = 0;
  } else {
    // always need to read a valid last offset from buffer/disk
    // b/c now due to vacuum "last offset" may go backward and if
    // index chunk was not reloaded last_offset would go way off!
    index_buf->read((int8_t*)&last_offset,
                    sizeof(StringOffsetT),
                    index_buf->size() - sizeof(StringOffsetT),
                    Data_Namespace::CPU_LEVEL);
    CHECK_GE(last_offset, 0);
  }
  size_t append_data_size = 0;
  for (size_t n = start_idx; n < start_idx + numAppendElems; n++) {
    size_t len = (srcData)[replicating ? 0 : n].length();
    append_data_size += len;
  }
  buffer_->reserve(buffer_->size() + append_data_size);

  size_t inbuf_size =
      std::min(std::max(append_index_size, append_data_size), (size_t)MAX_INPUT_BUF_SIZE);
  auto inbuf = std::make_unique<int8_t[]>(inbuf_size);
  for (size_t num_appended = 0; num_appended < numAppendElems;) {
    StringOffsetT* p = reinterpret_cast<StringOffsetT*>(inbuf.get());
    size_t i;
    for (i = 0; num_appended < numAppendElems && i < inbuf_size / sizeof(StringOffsetT);
         i++, num_appended++) {
      p[i] = last_offset + (srcData)[replicating ? 0 : num_appended + start_idx].length();
      last_offset = p[i];
    }
    index_buf->append(inbuf.get(), i * sizeof(StringOffsetT));
  }

  for (size_t num_appended = 0; num_appended < numAppendElems;) {
    size_t size = 0;
    for (int i = start_idx + num_appended;
         num_appended < numAppendElems && size < inbuf_size;
         i++, num_appended++) {
      size_t len = (srcData)[replicating ? 0 : i].length();
      if (len > inbuf_size) {
        // for large strings, append on its own
        if (size > 0) {
          buffer_->append(inbuf.get(), size);
        }
        size = 0;
        buffer_->append((int8_t*)(srcData)[replicating ? 0 : i].data(), len);
        num_appended++;
        break;
      } else if (size + len > inbuf_size) {
        break;
      }
      char* dest = reinterpret_cast<char*>(inbuf.get()) + size;
      if (len > 0) {
        (srcData)[replicating ? 0 : i].copy(dest, len);
        size += len;
      }
      update_elem_stats((srcData)[replicating ? 0 : i]);
    }
    if (size > 0) {
      buffer_->append(inbuf.get(), size);
    }
  }
  // make sure buffer_ is flushed even if no new data is appended to it
  // (e.g. empty strings) because the metadata needs to be flushed.
  if (!buffer_->isDirty()) {
    buffer_->setDirty();
  }

  num_elems_ += numAppendElems;
  auto chunk_metadata = std::make_shared<ChunkMetadata>();
  getMetadata(chunk_metadata);
  return chunk_metadata;
}
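The index layout that this overload maintains can be illustrated in isolation. The following is a minimal standalone sketch, not the OmniSciDB implementation: `OffsetT`, `VarlenColumn`, and `append_strings` are hypothetical names, and plain `std::vector` objects stand in for the `AbstractBuffer` instances. It shows the leading 0 offset written on the first append and the cumulative end offsets written for every string after it, so that N strings are described by N + 1 offsets.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for StringOffsetT (an int32_t in the listing above).
using OffsetT = int32_t;

// Hypothetical in-memory model of the two buffers: the offsets index and the payload.
struct VarlenColumn {
  std::vector<OffsetT> offsets;  // N strings are described by N + 1 offsets
  std::vector<char> payload;     // concatenated string bytes, no terminators
  size_t num_elems = 0;
};

// Appends strings, writing the initial 0 offset only when the column is empty.
inline void append_strings(VarlenColumn& col, const std::vector<std::string>& src) {
  if (col.num_elems == 0) {
    col.offsets.push_back(0);
  }
  OffsetT last_offset = col.offsets.back();
  assert(last_offset >= 0);
  for (const auto& s : src) {
    last_offset += static_cast<OffsetT>(s.size());
    col.offsets.push_back(last_offset);  // end offset of this string
    col.payload.insert(col.payload.end(), s.begin(), s.end());
  }
  col.num_elems += src.size();
}
```

In this model an empty string contributes a repeated offset and no payload bytes, which is why the listing above marks the data buffer dirty even when nothing was appended to it.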

template<typename StringType >
std::shared_ptr< ChunkMetadata > StringNoneEncoder::appendData ( const std::vector< StringType > *  srcData,
const int  start_idx,
const size_t  numAppendElems,
const bool  replicating = false 
)

Definition at line 93 of file StringNoneEncoder.cpp.

References appendData().

{
  return appendData(srcData->data(), start_idx, numAppendElems, replicating);
}

std::shared_ptr< ChunkMetadata > StringNoneEncoder::appendEncodedData ( const int8_t *  index_data,
int8_t *  data,
const size_t  start_idx,
const size_t  num_elements 
)
override virtual

Append encoded data to the chunk buffer backing this encoder.

Parameters
index_data- (optional) the index data of data to append
data- the data to append
start_idx- the position to start encoding from in the data array
num_elements- the number of elements to encode from the data array
Returns
updated chunk metadata for the chunk buffer backing this encoder

NOTE: index_data must be non-null for varlen encoder types.

Implements Encoder.

Definition at line 78 of file StringNoneEncoder.cpp.

References appendData(), and getStringAtIndex().

{
  std::vector<std::string_view> data_subset;
  data_subset.reserve(num_elements);
  for (size_t count = 0; count < num_elements; ++count) {
    auto current_index = start_idx + count;
    data_subset.emplace_back(getStringAtIndex(index_data, data, current_index));
  }
  return appendData(&data_subset, 0, num_elements, false);
}

std::shared_ptr< ChunkMetadata > StringNoneEncoder::appendEncodedDataAtIndices ( const int8_t *  index_data,
int8_t *  data,
const std::vector< size_t > &  selected_idx 
)
override virtual

Append selected encoded data to the chunk buffer backing this encoder.

Parameters
index_data- (optional) the index data of data to append
data- the data to append
selected_idx- which indices in the encoded data to append
Returns
updated chunk metadata for the chunk buffer backing this encoder

NOTE: index_data must be non-null for varlen encoder types.

Implements Encoder.

Definition at line 66 of file StringNoneEncoder.cpp.

References appendData(), and getStringAtIndex().

{
  std::vector<std::string_view> data_subset;
  data_subset.reserve(selected_idx.size());
  for (const auto& offset_index : selected_idx) {
    data_subset.emplace_back(getStringAtIndex(index_data, data, offset_index));
  }
  return appendData(&data_subset, 0, selected_idx.size(), false);
}

void StringNoneEncoder::copyMetadata ( const Encoder *  copyFromEncoder)
inline override virtual

Implements Encoder.

Definition at line 119 of file StringNoneEncoder.h.

References Encoder::getNumElems(), has_nulls, and Encoder::num_elems_.

{
  num_elems_ = copyFromEncoder->getNumElems();
  has_nulls = static_cast<const StringNoneEncoder*>(copyFromEncoder)->has_nulls;
}

AbstractBuffer* StringNoneEncoder::getIndexBuf ( ) const
inline

Definition at line 124 of file StringNoneEncoder.h.

References index_buf.

{ return index_buf; }

void StringNoneEncoder::getMetadata ( const std::shared_ptr< ChunkMetadata > &  chunkMetadata)
override virtual

Reimplemented from Encoder.

Definition at line 262 of file StringNoneEncoder.cpp.

References Encoder::getMetadata(), and has_nulls.

Referenced by appendData().

{
  Encoder::getMetadata(chunkMetadata);  // call on parent class
  chunkMetadata->chunkStats.min.stringval = nullptr;
  chunkMetadata->chunkStats.max.stringval = nullptr;
  chunkMetadata->chunkStats.has_nulls = has_nulls;
}

std::shared_ptr< ChunkMetadata > StringNoneEncoder::getMetadata ( const SQLTypeInfo &  ti)
override virtual

Implements Encoder.

Definition at line 270 of file StringNoneEncoder.cpp.

References has_nulls, ChunkStats::min, and Datum::stringval.

{
  auto chunk_stats = ChunkStats{};
  chunk_stats.min.stringval = nullptr;
  chunk_stats.max.stringval = nullptr;
  chunk_stats.has_nulls = has_nulls;
  return std::make_shared<ChunkMetadata>(ti, 0, 0, chunk_stats);
}

size_t StringNoneEncoder::getNumElemsForBytesEncodedDataAtIndices ( const int8_t *  index_data,
const std::vector< size_t > &  selected_idx,
const size_t  byte_limit 
)
override virtual

Compute the maximum number of variable-length encoded elements that fit within a byte limit.

Parameters
index_data- (optional) index data for the encoded type
selected_idx- which indices in the encoded data to consider
byte_limit- byte limit that must be respected
Returns
the number of elements

NOTE: optional parameters above may be ignored by the implementation, but may or may not be required depending on the encoder type backing the implementation.

Implements Encoder.

Definition at line 49 of file StringNoneEncoder.cpp.

References getStringSizeAtIndex().

{
  size_t num_elements = 0;
  size_t data_size = 0;
  for (const auto& offset_index : selected_idx) {
    auto element_size = getStringSizeAtIndex(index_data, offset_index);
    if (data_size + element_size > byte_limit) {
      break;
    }
    data_size += element_size;
    num_elements++;
  }
  return num_elements;
}
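As a rough illustration of the byte-limit computation, the sketch below repeats the same loop over a plain offsets vector. It is a standalone model under stated assumptions: `OffsetT` and `count_elems_within_limit` are hypothetical names, and the offsets are passed as a `std::vector` rather than as a raw `int8_t` index buffer. The point it makes is that counting stops at the first selected element that does not fit, so the result is always a prefix of `selected_idx`, never a best-fit subset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for StringOffsetT.
using OffsetT = int32_t;

// Counts how many leading entries of selected_idx fit within byte_limit.
// Element i occupies the byte range [offsets[i], offsets[i + 1]).
inline size_t count_elems_within_limit(const std::vector<OffsetT>& offsets,
                                       const std::vector<size_t>& selected_idx,
                                       size_t byte_limit) {
  size_t num_elements = 0;
  size_t data_size = 0;
  for (const size_t i : selected_idx) {
    assert(i + 1 < offsets.size());
    const size_t element_size = static_cast<size_t>(offsets[i + 1] - offsets[i]);
    if (data_size + element_size > byte_limit) {
      break;  // stop at the first element that would exceed the limit
    }
    data_size += element_size;
    ++num_elements;
  }
  return num_elements;
}
```

With offsets `{0, 2, 2, 5, 9}`, the selected elements 0, 2, and 3 have sizes 2, 3, and 4 bytes, so a limit of 5 bytes admits the first two only.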

size_t StringNoneEncoder::getNumElemsForBytesInsertData ( const std::vector< std::string > *  srcData,
const int  start_idx,
const size_t  numAppendElems,
const size_t  byteLimit,
const bool  replicating = false 
)

Definition at line 31 of file StringNoneEncoder.cpp.


Referenced by Chunk_NS::Chunk::getNumElemsForBytesInsertData().

{
  size_t dataSize = 0;
  size_t n = start_idx;
  for (; n < start_idx + numAppendElems; n++) {
    size_t len = (*srcData)[replicating ? 0 : n].length();
    if (dataSize + len > byteLimit) {
      break;
    }
    dataSize += len;
  }
  return n - start_idx;
}
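The effect of the replicating flag on this count is easier to see with concrete inputs. The sketch below is a standalone approximation of the loop above, with the hypothetical name `num_elems_for_bytes` and unsigned index parameters chosen for the example. When replicating is true, every element is measured as the first string in the source, which matches the "pass one value and fill the chunk with it" description of that flag.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns how many elements, starting at start_idx, fit within byte_limit.
// With replicating set, src[0] is measured for every element.
inline size_t num_elems_for_bytes(const std::vector<std::string>& src,
                                  size_t start_idx,
                                  size_t num_append_elems,
                                  size_t byte_limit,
                                  bool replicating = false) {
  assert(!src.empty());
  size_t data_size = 0;
  size_t n = start_idx;
  for (; n < start_idx + num_append_elems; ++n) {
    const size_t len = src[replicating ? 0 : n].size();
    if (data_size + len > byte_limit) {
      break;
    }
    data_size += len;
  }
  return n - start_idx;
}
```

For the source `{"aaaa", "bb", "c"}` and a limit of 6 bytes, two elements fit (4 + 2). With replicating set and a limit of 8 bytes, two copies of the 4-byte first string fit.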

std::string_view StringNoneEncoder::getStringAtIndex ( const int8_t *  index_data,
const int8_t *  data,
size_t  index 
)
static

Definition at line 225 of file StringNoneEncoder.cpp.

References getStringOffsets().

Referenced by appendEncodedData(), appendEncodedDataAtIndices(), and data_conversion::StringViewSource::getSourceData().

{
  auto [offset, last_offset] = getStringOffsets(index_data, index);
  size_t string_byte_size = offset - last_offset;
  auto current_data = reinterpret_cast<const char*>(data + last_offset);
  return std::string_view{current_data, string_byte_size};
}
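The lookup can be tried against hand-built buffers. The sketch below is a standalone approximation, not the class's code: `OffsetT` and `string_at` are hypothetical names, and the begin and end offsets are named explicitly (in the listing above, `last_offset` is the start of the string and `offset` is its end). It assumes the index buffer holds N + 1 offsets for N strings, starting with 0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical stand-in for StringOffsetT.
using OffsetT = int32_t;

// Returns a view of element `index`; its bytes span [offsets[index], offsets[index + 1]).
// The view does not own the bytes, so `data` must outlive the returned value.
inline std::string_view string_at(const int8_t* index_data,
                                  const int8_t* data,
                                  size_t index) {
  const auto* offsets = reinterpret_cast<const OffsetT*>(index_data);
  const OffsetT begin = offsets[index];
  const OffsetT end = offsets[index + 1];
  assert(begin >= 0 && begin <= end);
  return std::string_view{reinterpret_cast<const char*>(data + begin),
                          static_cast<size_t>(end - begin)};
}
```

Because the result is a `std::string_view`, no bytes are copied; callers that need the string after the source buffers are released have to copy it themselves.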

std::pair< StringOffsetT, StringOffsetT > StringNoneEncoder::getStringOffsets ( const int8_t *  index_data,
size_t  index 
)
static private

Definition at line 207 of file StringNoneEncoder.cpp.

References CHECK.

Referenced by getStringAtIndex(), and getStringSizeAtIndex().

{
  auto string_offsets = reinterpret_cast<const StringOffsetT*>(index_data);
  auto current_index = index + 1;
  auto offset = string_offsets[current_index];
  CHECK(offset >= 0);
  int64_t last_offset = string_offsets[current_index - 1];
  CHECK(last_offset >= 0 && last_offset <= offset);
  return {offset, last_offset};
}

size_t StringNoneEncoder::getStringSizeAtIndex ( const int8_t *  index_data,
size_t  index 
)
static private

Definition at line 219 of file StringNoneEncoder.cpp.

References getStringOffsets().

Referenced by getNumElemsForBytesEncodedDataAtIndices().

{
  auto [offset, last_offset] = getStringOffsets(index_data, index);
  size_t string_byte_size = offset - last_offset;
  return string_byte_size;
}

void StringNoneEncoder::readMetadata ( FILE *  f)
inline override virtual

Implements Encoder.

Definition at line 113 of file StringNoneEncoder.h.

References CHECK_NE, has_nulls, and Encoder::num_elems_.

{
  // assumes pointer is already in right place
  CHECK_NE(fread((int8_t*)&num_elems_, sizeof(size_t), size_t(1), f), size_t(0));
  CHECK_NE(fread((int8_t*)&has_nulls, sizeof(bool), size_t(1), f), size_t(0));
}

void StringNoneEncoder::reduceStats ( const Encoder & )
inline override virtual

Implements Encoder.

Definition at line 105 of file StringNoneEncoder.h.

References CHECK.

{ CHECK(false); }

bool StringNoneEncoder::resetChunkStats ( const ChunkStats &  stats)
inline override virtual

Reset chunk-level stats (min, max, nulls) using new values from the argument.

Returns
True if an update occurred and the chunk needs to be flushed, false otherwise. Defaults to false if metadata update is unsupported. Chunk stats are reset only if the incoming stats differ from the current stats.

Reimplemented from Encoder.

Definition at line 127 of file StringNoneEncoder.h.

References ChunkStats::has_nulls, and has_nulls.

{
  if (has_nulls == stats.has_nulls) {
    return false;
  }
  has_nulls = stats.has_nulls;
  return true;
}

void StringNoneEncoder::resetChunkStats ( )
inline override virtual

Resets chunk metadata stats to their default values.

Implements Encoder.

Definition at line 135 of file StringNoneEncoder.h.

References has_nulls.

{ has_nulls = false; }

void StringNoneEncoder::setIndexBuffer ( AbstractBuffer *  buf)
inline

Definition at line 125 of file StringNoneEncoder.h.

References index_buf.

Referenced by Chunk_NS::Chunk::initEncoder().

{ index_buf = buf; }

template<typename StringType >
void StringNoneEncoder::update_elem_stats ( const StringType &  elem)
private

Definition at line 201 of file StringNoneEncoder.cpp.

References has_nulls.

Referenced by appendData(), and updateStats().

{
  if (!has_nulls && elem.empty()) {
    has_nulls = true;
  }
}
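One consequence of this check is that an empty element is what sets the null flag, so an empty string and a null are not distinguished at the statistics level. The sketch below restates the check as free functions to make that visible; `note_elem` and `scan_for_nulls` are hypothetical names introduced for this example, and the early exit mirrors the loop in the `std::vector< std::string >` overload of updateStats().

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical free-function version of the per-element check.
template <typename StringType>
void note_elem(bool& has_nulls, const StringType& elem) {
  if (!has_nulls && elem.empty()) {
    has_nulls = true;
  }
}

// Scans [start_idx, start_idx + num_elements) and stops at the first empty element.
inline bool scan_for_nulls(const std::vector<std::string>& src,
                           size_t start_idx,
                           size_t num_elements) {
  assert(start_idx + num_elements <= src.size());
  bool has_nulls = false;
  for (size_t n = start_idx; n < start_idx + num_elements && !has_nulls; ++n) {
    note_elem(has_nulls, src[n]);
  }
  return has_nulls;
}
```

Since the flag only ever moves from false to true here, scanning can stop as soon as one empty element is seen.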

void StringNoneEncoder::updateStats ( const int64_t,
const bool
)
inline override virtual

Implements Encoder.

Definition at line 87 of file StringNoneEncoder.h.

References CHECK.

{ CHECK(false); }

void StringNoneEncoder::updateStats ( const double,
const bool
)
inline override virtual

Implements Encoder.

Definition at line 89 of file StringNoneEncoder.h.

References CHECK.

{ CHECK(false); }

void StringNoneEncoder::updateStats ( const int8_t *const  src_data,
const size_t  num_elements 
)
inline override virtual

Update statistics for data without appending.

Parameters
src_data- the data with which to update statistics
num_elements- the number of elements to scan in the data

Implements Encoder.

Definition at line 91 of file StringNoneEncoder.h.

References UNREACHABLE.

{
  UNREACHABLE();
}

void StringNoneEncoder::updateStats ( const std::vector< std::string > *const  src_data,
const size_t  start_idx,
const size_t  num_elements 
)
override virtual

Update statistics for string data without appending.

Parameters
src_data- the string data with which to update statistics
start_idx- the offset into src_data to start the update
num_elements- the number of elements to scan in the string data

Implements Encoder.

Definition at line 189 of file StringNoneEncoder.cpp.

References has_nulls, and update_elem_stats().

{
  for (size_t n = start_idx; n < start_idx + num_elements; n++) {
    update_elem_stats((*src_data)[n]);
    if (has_nulls) {
      break;
    }
  }
}

void StringNoneEncoder::updateStats ( const std::vector< ArrayDatum > *const  src_data,
const size_t  start_idx,
const size_t  num_elements 
)
inline override virtual

Update statistics for array data without appending.

Parameters
src_data- the array data with which to update statistics
start_idx- the offset into src_data to start the update
num_elements- the number of elements to scan in the array data

Implements Encoder.

Definition at line 99 of file StringNoneEncoder.h.

References UNREACHABLE.

{
  UNREACHABLE();
}

void StringNoneEncoder::writeMetadata ( FILE *  f)
inline override virtual

Implements Encoder.

Definition at line 107 of file StringNoneEncoder.h.

References has_nulls, and Encoder::num_elems_.

{
  // assumes pointer is already in right place
  fwrite((int8_t*)&num_elems_, sizeof(size_t), 1, f);
  fwrite((int8_t*)&has_nulls, sizeof(bool), 1, f);
}
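The on-disk metadata record written here is two raw fields, the element count followed by the null flag, and readMetadata() reads them back in the same order. The sketch below models that round trip with the C stdio calls used in the listings. `VarlenMetadata`, `write_metadata`, and `read_metadata` are hypothetical names for this example, and checking the return value of both `fwrite` and `fread` is a choice made in the sketch rather than something the listing above does for writes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical plain struct holding the two persisted fields.
struct VarlenMetadata {
  size_t num_elems = 0;
  bool has_nulls = false;
};

// Writes num_elems then has_nulls at the current file position.
inline bool write_metadata(std::FILE* f, const VarlenMetadata& m) {
  assert(f != nullptr);
  return std::fwrite(&m.num_elems, sizeof(size_t), 1, f) == 1 &&
         std::fwrite(&m.has_nulls, sizeof(bool), 1, f) == 1;
}

// Reads the fields back in the same order; the caller positions the file first.
inline bool read_metadata(std::FILE* f, VarlenMetadata& m) {
  assert(f != nullptr);
  return std::fread(&m.num_elems, sizeof(size_t), 1, f) == 1 &&
         std::fread(&m.has_nulls, sizeof(bool), 1, f) == 1;
}
```

Because the fields are written as raw bytes, a record produced this way is only readable on a platform with the same `size_t` width and byte order.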

Member Data Documentation

AbstractBuffer* StringNoneEncoder::index_buf
private

Definition at line 148 of file StringNoneEncoder.h.

Referenced by appendData(), getIndexBuf(), and setIndexBuffer().

StringOffsetT StringNoneEncoder::last_offset
private

The documentation for this class was generated from the following files: