OmniSciDB  c1a53651b2
FlatBuffer.h
1 #pragma once
2 
3 /*
4  * Copyright 2022 HEAVY.AI, Inc.
5  *
6  * Licensed under the Apache License, Version 2.0 (the "License");
7  * you may not use this file except in compliance with the License.
8  * You may obtain a copy of the License at
9  *
10  * http://www.apache.org/licenses/LICENSE-2.0
11  *
12  * Unless required by applicable law or agreed to in writing, software
13  * distributed under the License is distributed on an "AS IS" BASIS,
14  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
15  * See the License for the specific language governing permissions and
16  * limitations under the License.
17  */
18 
19 // clang-format off
20 /*
21  FlatBufferManager provides storage for a collection of buffers
22  (columns of arrays, columns of strings, etc) that are collected into
23  a single flat buffer so that copying FlatBuffer instances becomes a
24  single buffer copy operation. Flat buffers that store no pointer
25  values can be straightforwardly copied between different devices.
26 
27  FlatBuffer memory layout specification
28  --------------------------------------
29 
30  The first 8 bytes of the buffer contain a FlatBuffer storage format
31  id (see FlatBufferFormat below) that determines how the rest of
32  the bytes in the flat buffer are interpreted. The next 8 bytes
33  of the buffer contain the total size of the flat buffer, which
34  allows flat buffers to be passed around by a single pointer value
35  without explicitly specifying the size of the buffer.
36 
37  The memory layout of a flatbuffer is (using units of 8 bytes):
38 
39  | <format id> | <flatbuffer size> | <data> ... | <format id> |
40  =
41  |<-- 8-bytes-->|<-- 8-bytes ------>|<-- flatbuffer size minus 24 bytes -->|<-- 8-bytes -->|
42  |<------------------------ flatbuffer size ---------------------------------------------->|
43 
44  where flatbuffer size is specified in bytes.
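
 As a rough illustration of the layout above, the following standalone
 sketch writes a format id into the first and last 8 bytes of a buffer,
 the total size into the second 8 bytes, and then checks all three. It is
 not part of FlatBuffer.h: kDemoFormatId, make_demo_flatbuffer, read_unit,
 looks_like_demo_flatbuffer and demo_data_size are hypothetical names used
 only for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical format id for this sketch only; real ids come from FlatBufferFormat.
constexpr int64_t kDemoFormatId = 0x0123456789abcdef;

// Allocate total_size bytes and fill in the header format id, the total
// size, and the footer format id. The <data> part is left zeroed.
std::vector<int8_t> make_demo_flatbuffer(int64_t total_size) {
  assert(total_size >= 24 && total_size % 8 == 0);
  std::vector<int8_t> buf(static_cast<size_t>(total_size), 0);
  std::memcpy(buf.data(), &kDemoFormatId, sizeof(int64_t));
  std::memcpy(buf.data() + 8, &total_size, sizeof(int64_t));
  std::memcpy(buf.data() + total_size - 8, &kDemoFormatId, sizeof(int64_t));
  return buf;
}

// Read the i-th 8-byte unit of the buffer.
int64_t read_unit(const int8_t* buf, int64_t i) {
  int64_t v;
  std::memcpy(&v, buf + i * 8, sizeof(int64_t));
  return v;
}

// Header id, plausible size, and footer id must all agree.
bool looks_like_demo_flatbuffer(const int8_t* buf) {
  if (buf == nullptr || read_unit(buf, 0) != kDemoFormatId) {
    return false;
  }
  const int64_t size = read_unit(buf, 1);
  if (size < 24 || size % 8 != 0) {
    return false;
  }
  return read_unit(buf, size / 8 - 1) == kDemoFormatId;
}

// Size of the <data> part: total size minus header id, size field, footer id.
int64_t demo_data_size(const int8_t* buf) {
  return read_unit(buf, 1) - 24;
}
```

 For a 64-byte buffer the <data> part is 40 bytes, and corrupting the
 footer makes the check fail, which is the purpose of repeating the id.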
45 
46  VarlenArray format specification
47  --------------------------------
48 
49  If a FlatBuffer instance uses the VarlenArray format (format id
50  0x7661726c65634152) then the <data> part of the FlatBuffer is defined as follows:
51 
52  | <items count> | <dtype metadata kind/buffer size> | <max nof values> | <num specified items> | <offsets> | <varlen array data> |
53  =
54  |<--- 8-bytes -->|<--16-bytes ---------------------->|<-- 8-bytes ----->|<-- 8-bytes ---------->|<--32-bytes-->|<--flatbuffer size minus 96 bytes-->|
55  |<------------------------- flatbuffer size minus 24 bytes ---------------------------------------------------------------------------------------->|
56 
57  where
58 
59  <items count> is the number of items (e.g. varlen arrays) that the
60  flat buffer instance is holding. In the context of columns of
61  arrays, the items count is the number of rows in a column. This
62  is a user-specified parameter.
63 
64  <dtype metadata kind> defines the set of parameters that the dtype
65  of a value depends on. For instance, dtype typically is
66  characterized by its byte size. In the case of text encoding
67  dict, dtype also is parameterized by the string dictionary id.
68 
69  <dtype metadata buffer size> is the byte size of dtype metadata
70  buffer.
71 
72  <max nof values> is the maximum total number of elements in all
73  items that the flat buffer instance can hold. The value defines
74  the size of the values buffer. This is a user-specified parameter.
75 
76  <num specified items> is the number of items specified so far. It
77  starts at 0 and is incremented by one on each setItem or
78  setNull call. The flat buffer is completely filled with items
79  when <num specified items> becomes equal to <items count>. Used
80  internally.
81 
82  <offsets> are precomputed offsets of data buffers. Used internally.
83 
84  | <dtype metadata offset> | <values offset> | <compressed_indices offset> | <storage indices offset> |
85  |<-- 8-bytes ------------>|<-- 8-bytes ---->|<-- 8-bytes ---------------->|<-- 8-bytes ------------->|
86 
87  <varlen array data> is
88 
89  | <dtype metadata buffer> | <values> | <compressed indices> | <storage indices> |
90  =
91  |<-- dtype metadata buffer size ->|<-- (max nof values) * (dtype size) bytes, aligned to 8 -->|<-- (items count + 1) * 8 bytes-->|<-- (items count) * 8 bytes-->|
92  |<------------------------ flatbuffer size minus 96 bytes ------------------------------------------------------------------------>|
93 
94  and
95 
96  - values stores the elements of all items (e.g. the values of all
97  varlen arrays). Item elements are contiguous within an item,
98  however, the storage order of items can be arbitrary.
99 
100  - compressed indices contains the cumulative sum of item lengths
101  (in elements), taken in storage order. Negative entries indicate null items.
102 
103  - storage indices maps each item index to its position in the order in
104  which items were specified; -1 marks an item that is not yet specified.
105 
106  For the detailed description of values, compressed_indices, and
107  storage_indices, as well as how empty arrays and null arrays are
108  represented, see https://pearu.github.io/variable_length_arrays.html .
109 
110  FlatBuffer usage
111  ----------------
112 
113  FlatBuffer implements various methods for accessing its content for
114  retrieving or storing data. These methods are usually provided as
115  pairs of safe and unsafe methods. Safe methods validate their
116  inputs and return a success or error status depending on the
117  validation results. Unsafe methods (whose names have the "NoValidation"
118  suffix) perform no validation of the inputs and (almost) always
119  return the success status. Use unsafe methods (for efficiency) only
120  when one can prove that the inputs are always valid (e.g. indices
121  are in the expected range, buffer memory allocation is sufficient
122  for storing data, etc). Otherwise, use safe methods, which lead
123  to predictable and non-server-crashing behaviour in the case of
124  possibly invalid input data.
125 */
126 // clang-format on
127 
128 #ifdef HAVE_TOSTRING
129 #include <ostream>
130 #endif
131 #include <string.h>
132 
133 #include "../../Shared/funcannotations.h"
134 
135 // Notice that the format value is used to recognize whether a memory
136 // buffer uses some flat buffer format or not. To minimize the chance of
137 // false positive test results, use a non-trivial integer value when
138 // introducing new formats.
140  VarlenArray = 0x7661726c65634152, // ASCII bytes 'varlecAR' (close to 'varlenAR'; 0x63 is 'c')
141 };
142 
143 inline int64_t _align_to_int64(int64_t addr) {
144  addr += sizeof(int64_t) - 1;
145  return (int64_t)(((uint64_t)addr >> 3) << 3);
146 }
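
A small worked example of the rounding performed by `_align_to_int64` may help here. The sketch below restates the same arithmetic under the hypothetical name `align_up_to_8` so that it compiles on its own, and adds a hypothetical helper `padded_values_bytes` showing how a values buffer size would be padded to a multiple of 8 bytes.

```cpp
#include <cassert>
#include <cstdint>

// Same arithmetic as _align_to_int64: add 7, then clear the low three bits.
inline int64_t align_up_to_8(int64_t n) {
  n += static_cast<int64_t>(sizeof(int64_t)) - 1;
  return static_cast<int64_t>((static_cast<uint64_t>(n) >> 3) << 3);
}

// Bytes reserved for `count` values of `dtype_size` bytes each, padded to 8.
inline int64_t padded_values_bytes(int64_t dtype_size, int64_t count) {
  return align_up_to_8(dtype_size * count);
}
```

Multiples of 8 are left unchanged (0 stays 0, 8 stays 8), while 1 rounds up to 8 and 9 rounds up to 16. Three 4-byte values occupy 12 bytes and are therefore padded to 16.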
147 
149  enum Status {
150  Success = 0,
159  };
160 
162  FORMAT_ID = 0, // storage format id
163  FLATBUFFER_SIZE, // in bytes
164  ITEMS_COUNT, /* the number of items
165  (e.g. the number of
166  rows in a column) */
167  DTYPE_METADATA_KIND, /* the kind of dtype metadata */
168  DTYPE_METADATA_BUFFER_SIZE, /* the size of dtype metadata buffer */
169  MAX_NOF_VALUES, /* the upper bound to the
170  total number of values
171  in all items */
172  STORAGE_COUNT, /* the number of specified
173  items, incremented by
174  one on each
175  setNull/setItem call */
181  };
182 
183  int8_t* buffer;
184 
185  // Check if a buffer contains FlatBuffer formatted data
186  HOST DEVICE static bool isFlatBuffer(const void* buffer) {
187  if (buffer) {
188  // warning: assume that buffer size is at least 8 bytes
189  FlatBufferFormat header_format =
190  static_cast<FlatBufferFormat>(((int64_t*)buffer)[VarlenArrayHeader::FORMAT_ID]);
191  if (header_format == FlatBufferFormat::VarlenArray) {
192  int64_t flatbuffer_size = ((int64_t*)buffer)[VarlenArrayHeader::FLATBUFFER_SIZE];
193  if (flatbuffer_size > 0) {
194  FlatBufferFormat footer_format = static_cast<FlatBufferFormat>(
195  ((int64_t*)buffer)[flatbuffer_size / sizeof(int64_t) - 1]);
196  return footer_format == header_format;
197  }
198  }
199  }
200  return false;
201  }
202 
203  // Return the allocation size of the FlatBuffer storage, in bytes
204  static int64_t getBufferSize(const void* buffer) {
205  if (isFlatBuffer(buffer)) {
206  return ((const int64_t*)(buffer))[VarlenArrayHeader::FLATBUFFER_SIZE];
207  } else {
208  return -1;
209  }
210  }
211 
212  // Return the format of FlatBuffer
213  HOST DEVICE FlatBufferFormat format() const {
214  return static_cast<FlatBufferFormat>(
215  ((int64_t*)buffer)[VarlenArrayHeader::FORMAT_ID]);
216  }
217 
218  // Return the allocation size of the FlatBuffer storage, in bytes
219  inline int64_t flatbufferSize() const {
220  return ((const int64_t*)(buffer))[VarlenArrayHeader::FLATBUFFER_SIZE];
221  }
222 
223  /* DType MetaData support */
224 
226  SIZE = 1,
228  };
229 
231  int64_t size;
232  };
233 
235  int64_t size;
236  int32_t db_id;
237  int32_t dict_id;
238  };
239 
240  static int64_t getDTypeMetadataBufferSize(DTypeMetadataKind kind) {
241  switch (kind) {
242  case DTypeMetadataKind::SIZE:
243  return _align_to_int64(sizeof(DTypeMetadataSize));
244  case DTypeMetadataKind::SIZE_DICTID:
246  default:
247  return 0;
248  }
249  }
250 
251  HOST DEVICE DTypeMetadataKind getDTypeMetadataKind() const {
252  return static_cast<DTypeMetadataKind>(
253  ((int64_t*)buffer)[VarlenArrayHeader::DTYPE_METADATA_KIND]);
254  }
255 
256  inline int8_t* getDTypeMetadataBuffer() {
257  return buffer + ((int64_t*)buffer)[VarlenArrayHeader::DTYPE_METADATA_BUFFER_OFFSET];
258  }
259 
260  HOST DEVICE inline const int8_t* getDTypeMetadataBuffer() const {
261  return buffer +
262  ((const int64_t*)buffer)[VarlenArrayHeader::DTYPE_METADATA_BUFFER_OFFSET];
263  }
264 
265  // dtype size accessors:
266  void setDTypeMetadataSize(int64_t size) {
267  switch (getDTypeMetadataKind()) {
268  case DTypeMetadataKind::SIZE: {
269  DTypeMetadataSize* metadata =
270  reinterpret_cast<DTypeMetadataSize*>(getDTypeMetadataBuffer());
271  metadata->size = size;
272  } break;
273  case DTypeMetadataKind::SIZE_DICTID: {
274  DTypeMetadataSizeDictId* metadata =
275  reinterpret_cast<DTypeMetadataSizeDictId*>(getDTypeMetadataBuffer());
276  metadata->size = size;
277  } break;
278  default:;
279  }
280  }
281 
282  HOST DEVICE int64_t getDTypeMetadataSize() const {
283  switch (getDTypeMetadataKind()) {
284  case DTypeMetadataKind::SIZE: {
285  const DTypeMetadataSize* metadata =
286  reinterpret_cast<const DTypeMetadataSize*>(getDTypeMetadataBuffer());
287  return metadata->size;
288  }
289  case DTypeMetadataKind::SIZE_DICTID: {
290  const DTypeMetadataSizeDictId* metadata =
291  reinterpret_cast<const DTypeMetadataSizeDictId*>(getDTypeMetadataBuffer());
292  return metadata->size;
293  }
294  default:;
295  }
296  return 0;
297  }
298 
299  // dtype dict id accessors:
300  void setDTypeMetadataDictKey(int32_t db_id, int32_t dict_id) {
301  switch (getDTypeMetadataKind()) {
302  case DTypeMetadataKind::SIZE:
303  break;
304  case DTypeMetadataKind::SIZE_DICTID: {
305  DTypeMetadataSizeDictId* metadata =
306  reinterpret_cast<DTypeMetadataSizeDictId*>(getDTypeMetadataBuffer());
307  metadata->db_id = db_id;
308  metadata->dict_id = dict_id;
309  } break;
310  default:;
311  }
312  }
313 
314  int32_t getDTypeMetadataDictDbId() const {
315  switch (getDTypeMetadataKind()) {
316  case DTypeMetadataKind::SIZE:
317  break;
318  case DTypeMetadataKind::SIZE_DICTID: {
319  const DTypeMetadataSizeDictId* metadata =
320  reinterpret_cast<const DTypeMetadataSizeDictId*>(getDTypeMetadataBuffer());
321  return metadata->db_id;
322  }
323  default:;
324  }
325  return {};
326  }
327 
328  int32_t getDTypeMetadataDictId() const {
329  switch (getDTypeMetadataKind()) {
330  case DTypeMetadataKind::SIZE:
331  break;
332  case DTypeMetadataKind::SIZE_DICTID: {
333  const DTypeMetadataSizeDictId* metadata =
334  reinterpret_cast<const DTypeMetadataSizeDictId*>(getDTypeMetadataBuffer());
335  return metadata->dict_id;
336  }
337  default:;
338  }
339  return {};
340  }
341 
342  // VarlenArray support:
343  static int64_t get_VarlenArray_flatbuffer_size(int64_t items_count,
344  int64_t max_nof_values,
345  int64_t dtype_size,
346  DTypeMetadataKind dtype_metadata_kind) {
347  const int64_t VarlenArray_buffer_header_size =
348  VarlenArrayHeader::EOFHEADER * sizeof(int64_t);
349  const int64_t dtype_metadata_buffer_size =
350  getDTypeMetadataBufferSize(dtype_metadata_kind);
351  const int64_t values_buffer_size = _align_to_int64(dtype_size * max_nof_values);
352  return (VarlenArray_buffer_header_size // see above
353  + dtype_metadata_buffer_size // size of dtype metadata buffer in bytes,
354  // aligned to int64
355  + values_buffer_size // size of values buffer, aligned to int64
356  + (items_count + 1 // size of compressed_indices buffer
357  + items_count // size of storage_indices buffer
358  + 1 // footer format id
359  ) * sizeof(int64_t));
360  }
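
As a worked example of the size formula above, the following standalone sketch recomputes it with plain integers. It rests on two assumptions that are not shown in this listing: that the header spans 11 eight-byte fields (inferred from the layout comment at the top of the file), and that the caller passes the dtype metadata buffer size directly (8 bytes would correspond to a metadata struct holding a single `int64_t`). The names `kAssumedHeaderFields`, `round_up_to_8` and `sketch_varlen_flatbuffer_size` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 11 header fields of 8 bytes each, i.e. an 88-byte header.
constexpr int64_t kAssumedHeaderFields = 11;

constexpr int64_t round_up_to_8(int64_t n) {
  return (n + 7) / 8 * 8;
}

constexpr int64_t sketch_varlen_flatbuffer_size(int64_t items_count,
                                                int64_t max_nof_values,
                                                int64_t dtype_size,
                                                int64_t dtype_metadata_buffer_size) {
  return kAssumedHeaderFields * 8                      // header
         + dtype_metadata_buffer_size                  // dtype metadata buffer
         + round_up_to_8(dtype_size * max_nof_values)  // values buffer, padded
         + ((items_count + 1)                          // compressed indices
            + items_count                              // storage indices
            + 1                                        // footer format id
            ) * 8;
}

// 3 items, up to 10 four-byte values, 8 bytes of metadata:
// 88 + 8 + 40 + (4 + 3 + 1) * 8 = 200 bytes.
static_assert(sketch_varlen_flatbuffer_size(3, 10, 4, 8) == 200,
              "worked example should total 200 bytes");
```

Under these assumptions even an empty column (0 items, 0 values) needs 112 bytes: the header, the metadata buffer, one compressed index and the footer.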
361 
362  // Initialize FlatBuffer for VarlenArray storage
363  void initializeVarlenArray(int64_t items_count,
364  int64_t max_nof_values,
365  int64_t dtype_size,
366  DTypeMetadataKind dtype_metadata_kind) {
367  const int64_t VarlenArray_buffer_header_size =
368  VarlenArrayHeader::EOFHEADER * sizeof(int64_t);
369  const int64_t values_buffer_size = _align_to_int64(dtype_size * max_nof_values);
370  const int64_t compressed_indices_buffer_size = (items_count + 1) * sizeof(int64_t);
371  const int64_t dtype_metadata_buffer_size =
372  getDTypeMetadataBufferSize(dtype_metadata_kind);
373  const int64_t flatbuffer_size = get_VarlenArray_flatbuffer_size(
374  items_count, max_nof_values, dtype_size, dtype_metadata_kind);
375  ((int64_t*)buffer)[VarlenArrayHeader::FORMAT_ID] =
376  static_cast<int64_t>(FlatBufferFormat::VarlenArray);
377  ((int64_t*)buffer)[VarlenArrayHeader::FLATBUFFER_SIZE] = flatbuffer_size;
378  ((int64_t*)buffer)[VarlenArrayHeader::ITEMS_COUNT] = items_count;
379  ((int64_t*)buffer)[VarlenArrayHeader::MAX_NOF_VALUES] = max_nof_values;
380  ((int64_t*)buffer)[VarlenArrayHeader::DTYPE_METADATA_KIND] =
381  static_cast<int64_t>(dtype_metadata_kind);
382  ((int64_t*)buffer)[VarlenArrayHeader::DTYPE_METADATA_BUFFER_SIZE] =
383  dtype_metadata_buffer_size;
384  ((int64_t*)buffer)[VarlenArrayHeader::STORAGE_COUNT] = 0;
385  ((int64_t*)buffer)[VarlenArrayHeader::DTYPE_METADATA_BUFFER_OFFSET] =
386  VarlenArray_buffer_header_size;
387  ((int64_t*)buffer)[VarlenArrayHeader::VALUES_OFFSET] =
388  ((const int64_t*)buffer)[VarlenArrayHeader::DTYPE_METADATA_BUFFER_OFFSET] +
389  dtype_metadata_buffer_size;
390  ((int64_t*)buffer)[VarlenArrayHeader::COMPRESSED_INDICES_OFFSET] =
391  ((const int64_t*)buffer)[VarlenArrayHeader::VALUES_OFFSET] + values_buffer_size;
392  ((int64_t*)buffer)[VarlenArrayHeader::STORAGE_INDICES_OFFSET] =
393  ((const int64_t*)buffer)[VarlenArrayHeader::COMPRESSED_INDICES_OFFSET] +
394  compressed_indices_buffer_size;
395 
396  setDTypeMetadataSize(dtype_size);
397 
398  // initialize indices buffers
399  int64_t* compressed_indices = VarlenArray_compressed_indices();
400  int64_t* storage_indices = VarlenArray_storage_indices();
401  for (int64_t i = 0; i < items_count; i++) {
402  compressed_indices[i] = 0;
403  storage_indices[i] = -1;
404  }
405  compressed_indices[items_count] = 0;
406  // store footer format id in the last 8-bytes of the flatbuffer:
407  ((int64_t*)buffer)[flatbuffer_size / sizeof(int64_t) - 1] =
408  static_cast<int64_t>(FlatBufferFormat::VarlenArray);
409  }
410 
411  // Return the number of items
412  HOST DEVICE inline int64_t itemsCount() const {
414  return ((int64_t*)buffer)[VarlenArrayHeader::ITEMS_COUNT];
415  }
416  return -1; // invalid value
417  }
418 
419  // Return the size of dtype of the item elements, in bytes
420  HOST DEVICE inline int64_t dtypeSize() const { return getDTypeMetadataSize(); }
421 
422  // Return the upper bound to the total number of values in all items
423  inline int64_t VarlenArray_max_nof_values() const {
424  return ((int64_t*)buffer)[VarlenArrayHeader::MAX_NOF_VALUES];
425  }
426 
427  // Return the total number of values in all specified items
428  inline int64_t VarlenArray_nof_values() const {
429  const int64_t storage_count = VarlenArray_storage_count();
430  const int64_t* compressed_indices = VarlenArray_compressed_indices();
431  return compressed_indices[storage_count];
432  }
433 
434  // Return the size of values buffer in bytes
435  inline int64_t VarlenArray_values_buffer_size() const {
437  }
438 
439  // Return the number of specified items
440  inline int64_t& VarlenArray_storage_count() {
441  return ((int64_t*)buffer)[VarlenArrayHeader::STORAGE_COUNT];
442  }
443 
444  inline int64_t& VarlenArray_storage_count() const {
445  return ((int64_t*)buffer)[VarlenArrayHeader::STORAGE_COUNT];
446  }
447 
448  // Return the pointer to values buffer
449  HOST DEVICE inline int8_t* VarlenArray_values() {
450  return buffer + ((const int64_t*)buffer)[VarlenArrayHeader::VALUES_OFFSET];
451  }
452 
453  inline const int8_t* VarlenArray_values() const {
454  return buffer + ((const int64_t*)buffer)[VarlenArrayHeader::VALUES_OFFSET];
455  }
456 
457  // Return the pointer to compressed indices buffer
458  HOST DEVICE int64_t* VarlenArray_compressed_indices() {
459  return reinterpret_cast<int64_t*>(
460  buffer + ((const int64_t*)buffer)[VarlenArrayHeader::COMPRESSED_INDICES_OFFSET]);
461  }
462  inline const int64_t* VarlenArray_compressed_indices() const {
463  return reinterpret_cast<const int64_t*>(
464  buffer + ((const int64_t*)buffer)[VarlenArrayHeader::COMPRESSED_INDICES_OFFSET]);
465  }
466 
467  // Return the pointer to storage indices buffer
468  HOST DEVICE int64_t* VarlenArray_storage_indices() {
469  return reinterpret_cast<int64_t*>(
470  buffer + ((const int64_t*)buffer)[VarlenArrayHeader::STORAGE_INDICES_OFFSET]);
471  }
472  inline const int64_t* VarlenArray_storage_indices() const {
473  return reinterpret_cast<const int64_t*>(
474  buffer + ((const int64_t*)buffer)[VarlenArrayHeader::STORAGE_INDICES_OFFSET]);
475  }
476 
477  // Set a new item with index and size (in bytes) and initialize its
478  // elements from source buffer. The item values will be
479  // uninitialized when source buffer is nullptr. If dest != nullptr
480  // then the item's buffer pointer will be stored in *dest.
481  Status setItem(int64_t index,
482  const int8_t* src,
483  int64_t size,
484  int8_t** dest = nullptr) {
486  if (index < 0 || index >= itemsCount()) {
487  return IndexError;
488  }
489  int64_t& storage_count = VarlenArray_storage_count();
490  int64_t* compressed_indices = VarlenArray_compressed_indices();
491  int64_t* storage_indices = VarlenArray_storage_indices();
492  const int64_t itemsize = dtypeSize();
493  if (size % itemsize != 0) {
494  return SizeError; // size must be multiple of itemsize. Perhaps size is not in
495  // bytes?
496  }
497  if (storage_indices[index] >= 0) {
499  }
500  const int64_t cindex = compressed_indices[storage_count];
501  const int64_t values_buffer_size = VarlenArray_values_buffer_size();
502  const int64_t csize = cindex * itemsize;
503  if (csize + size > values_buffer_size) {
505  }
506  return setItemNoValidation(index, src, size, dest);
507  }
508  return UnknownFormatError;
509  }
510 
511  // Same as setItem but performs no input validation
512  Status setItemNoValidation(int64_t index,
513  const int8_t* src,
514  int64_t size,
515  int8_t** dest) {
516  int64_t& storage_count = VarlenArray_storage_count();
517  int64_t* storage_indices = VarlenArray_storage_indices();
518  int64_t* compressed_indices = VarlenArray_compressed_indices();
519  int8_t* values = VarlenArray_values();
520  const int64_t itemsize = dtypeSize();
521  const int64_t values_count = size / itemsize;
522  const int64_t cindex = compressed_indices[storage_count];
523  const int64_t csize = cindex * itemsize;
524  storage_indices[index] = storage_count;
525  compressed_indices[storage_count + 1] = cindex + values_count;
526  if (size > 0 && src != nullptr && memcpy(values + csize, src, size) == nullptr) {
527  return MemoryError;
528  }
529  if (dest != nullptr) {
530  *dest = values + csize;
531  }
532  storage_count++;
533  return Success;
534  }
535 
536  // Set a new item with index and size but without initializing item
537  // elements. The buffer pointer of the new item will be stored in
538  // *dest if dest != nullptr. Inputs are not validated!
539  Status setEmptyItemNoValidation(int64_t index, int64_t size, int8_t** dest) {
540  return setItemNoValidation(index, nullptr, size, dest);
541  }
542 
543  Status concatItem(int64_t index, const int8_t* src, int64_t size) {
545  if (index < 0 || index >= itemsCount()) {
546  return IndexError;
547  }
548  int64_t next_storage_count = VarlenArray_storage_count();
549  int64_t storage_count = next_storage_count - 1;
550  int64_t* compressed_indices = VarlenArray_compressed_indices();
551  int64_t* storage_indices = VarlenArray_storage_indices();
552  int8_t* values = VarlenArray_values();
553  int64_t itemsize = dtypeSize();
554  int64_t storage_index = storage_indices[index];
555 
556  if (storage_index == -1) { // unspecified, so setting the item
557  return setItem(index, src, size, nullptr);
558  }
559  if (size % itemsize != 0) {
560  return SizeError;
561  }
562  if (storage_index != storage_count) {
563  return IndexError; // index does not correspond to the last set
564  // item, only the last item can be
565  // concatenated
566  }
567  if (compressed_indices[storage_index] < 0) {
568  return NotImplemnentedError; // todo: support concat to null when last
569  }
570  int64_t values_count =
571  compressed_indices[next_storage_count] - compressed_indices[storage_index];
572  int64_t extra_values_count = size / itemsize;
573  compressed_indices[next_storage_count] += extra_values_count;
574  int8_t* ptr = values + compressed_indices[storage_index] * itemsize;
575  if (size > 0 && src != nullptr &&
576  memcpy(ptr + values_count * itemsize, src, size) == nullptr) {
577  return MemoryError;
578  }
579  return Success;
580  }
581  return UnknownFormatError;
582  }
583 
584  // Set item with index as a null item
585  Status setNull(int64_t index) {
587  if (index < 0 || index >= itemsCount()) {
588  return IndexError;
589  }
590  int64_t* storage_indices = VarlenArray_storage_indices();
591  if (storage_indices[index] >= 0) {
593  }
594  return setNullNoValidation(index);
595  }
596  return UnknownFormatError;
597  }
598 
599  // Same as setNull but performs no input validation
600  Status setNullNoValidation(int64_t index) {
601  int64_t& storage_count = VarlenArray_storage_count();
602  int64_t* storage_indices = VarlenArray_storage_indices();
603  int64_t* compressed_indices = VarlenArray_compressed_indices();
604  const int64_t cindex = compressed_indices[storage_count];
605  storage_indices[index] = storage_count;
606  compressed_indices[storage_count] = -(cindex + 1);
607  compressed_indices[storage_count + 1] = cindex;
608  storage_count++;
609  return Success;
610  }
611 
612  // Check if the item is null. Returns ItemUnspecifiedError if the item is unspecified.
613  Status isNull(int64_t index, bool& is_null) const {
615  if (index < 0 || index >= itemsCount()) {
616  return IndexError;
617  }
618  const int64_t* compressed_indices = VarlenArray_compressed_indices();
619  const int64_t* storage_indices = VarlenArray_storage_indices();
620  const int64_t storage_index = storage_indices[index];
621  if (storage_index < 0) {
622  return ItemUnspecifiedError;
623  }
624  is_null = (compressed_indices[storage_index] < 0);
625  return Success;
626  }
627  return UnknownFormatError;
628  }
629 
630  // Get item at index by storing its size (in bytes), values buffer,
631  // and nullity information in the corresponding reference
632  // arguments.
633  HOST DEVICE Status getItem(int64_t index, int64_t& size, int8_t*& dest, bool& is_null) {
635  if (index < 0 || index >= itemsCount()) {
636  return IndexError;
637  }
638  int8_t* values = VarlenArray_values();
639  const int64_t* compressed_indices = VarlenArray_compressed_indices();
640  const int64_t* storage_indices = VarlenArray_storage_indices();
641  const int64_t storage_index = storage_indices[index];
642  if (storage_index < 0) {
643  return ItemUnspecifiedError;
644  }
645  const int64_t cindex = compressed_indices[storage_index];
646  if (cindex < 0) {
647  // null varlen array
648  size = 0;
649  dest = nullptr;
650  is_null = true;
651  } else {
652  const int64_t dtypesize = dtypeSize();
653  const int64_t next_cindex = compressed_indices[storage_index + 1];
654  const int64_t length =
655  (next_cindex < 0 ? -(next_cindex + 1) - cindex : next_cindex - cindex);
656  size = length * dtypesize;
657  dest = values + cindex * dtypesize;
658  is_null = false;
659  }
660  return Success;
661  } else {
662  return UnknownFormatError;
663  }
664  }
665 
666  HOST DEVICE Status getItem(int64_t index, size_t& size, int8_t*& dest, bool& is_null) {
667  int64_t sz{0};
668  Status status = getItem(index, sz, dest, is_null);
669  size = sz;
670  return status;
671  }
672 
673 #ifdef HAVE_TOSTRING
674  std::string toString() const {
675  if (buffer == nullptr) {
676  return ::typeName(this) + "[UNINITIALIZED]";
677  }
678  switch (format()) {
680  std::string result = typeName(this) + "(";
681  int64_t numvalues = VarlenArray_nof_values();
682  int64_t numitems = itemsCount();
683  int64_t itemsize = getDTypeMetadataSize();
684 
685  const int64_t* buf = reinterpret_cast<const int64_t*>(buffer);
686  std::vector<int64_t> header(buf, buf + 11);
687  result += "header=" + ::toString(header);
688 
689  const int64_t* metadata_buf =
690  reinterpret_cast<const int64_t*>(getDTypeMetadataBuffer());
691  int64_t metadata_buf_size = getDTypeMetadataBufferSize(getDTypeMetadataKind());
692  std::vector<int64_t> metadata(metadata_buf, metadata_buf + metadata_buf_size);
693  result += ", metadata=" + ::toString(metadata);
694 
695  result += ", values=";
696  switch (itemsize) {
697  case 1: {
698  const int8_t* values_buf = VarlenArray_values();
699  std::vector<int8_t> values(values_buf, values_buf + numvalues);
700  result += ::toString(values);
701  } break;
702  case 2: {
703  const int16_t* values_buf =
704  reinterpret_cast<const int16_t*>(VarlenArray_values());
705  std::vector<int16_t> values(values_buf, values_buf + numvalues);
706  result += ::toString(values);
707  } break;
708  case 4: {
709  const int32_t* values_buf =
710  reinterpret_cast<const int32_t*>(VarlenArray_values());
711  std::vector<int32_t> values(values_buf, values_buf + numvalues);
712  result += ::toString(values);
713  } break;
714  case 8: {
715  const int64_t* values_buf =
716  reinterpret_cast<const int64_t*>(VarlenArray_values());
717  std::vector<int64_t> values(values_buf, values_buf + numvalues);
718  result += ::toString(values);
719  } break;
720  default:
721  result += "[UNEXPECTED ITEMSIZE:" + std::to_string(itemsize) + "]";
722  }
723 
724  const int64_t* compressed_indices_buf = VarlenArray_compressed_indices();
725  std::vector<int64_t> compressed_indices(compressed_indices_buf,
726  compressed_indices_buf + numitems + 1);
727  result += ", compressed_indices=" + ::toString(compressed_indices);
728 
729  const int64_t* storage_indices_buf = VarlenArray_storage_indices();
730  std::vector<int64_t> storage_indices(storage_indices_buf,
731  storage_indices_buf + numitems);
732  result += ", storage_indices=" + ::toString(storage_indices);
733 
734  result += ")";
735  return result;
736  }
737  default:;
738  }
739  return ::typeName(this) + "[UNKNOWN FORMAT]";
740  }
741 #endif
742 };
743 
744 #ifdef HAVE_TOSTRING
745 inline std::ostream& operator<<(std::ostream& os,
746  FlatBufferManager::Status const status) {
747  switch (status) {
749  os << "Success";
750  break;
752  os << "IndexError";
753  break;
755  os << "SizeError";
756  break;
758  os << "ItemAlreadySpecifiedError";
759  break;
761  os << "ItemUnspecifiedError";
762  break;
764  os << "ValuesBufferTooSmallError";
765  break;
767  os << "MemoryError";
768  break;
770  os << "UnknownFormatError";
771  break;
773  os << "NotImplemnentedError";
774  break;
775  default:
776  os << "[Unknown FlatBufferManager::Status value]";
777  }
778  return os;
779 }
780 
781 inline std::string toString(const FlatBufferManager::Status& status) {
782  std::ostringstream ss;
783  ss << status;
784  return ss.str();
785 }
786 #endif