OmniSciDB  72c90bc290
FlatBuffer.h
1 #pragma once
2 
3 /*
4  * Copyright 2022 HEAVY.AI, Inc.
5  *
6  * Licensed under the Apache License, Version 2.0 (the "License");
7  * you may not use this file except in compliance with the License.
8  * You may obtain a copy of the License at
9  *
10  * http://www.apache.org/licenses/LICENSE-2.0
11  *
12  * Unless required by applicable law or agreed to in writing, software
13  * distributed under the License is distributed on an "AS IS" BASIS,
14  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
15  * See the License for the specific language governing permissions and
16  * limitations under the License.
17  */
18 
19 // clang-format off
20 /*
21  FlatBufferManager provides storage for a collection of buffers
22  (columns of arrays, columns of strings, etc.) that are collected into
23  a single "flatbuffer" so that copying FlatBuffer instances becomes a
24  single buffer copy operation. Flat buffers that store no pointer
25  values can be copied directly between different devices.
26 
27  FlatBuffer memory layout specification
28  --------------------------------------
29 
30  The first and the last 8 bytes of the buffer contain a FlatBuffer
31  storage format id (see FlatBufferFormat below) that determines
32  how the rest of the memory in the flatbuffer is interpreted.
33 
34  The next 8 bytes contain the total size of the flatbuffer. This
35  allows flatbuffers to be passed around by a single pointer value
36  without explicitly specifying the buffer size, and allows checking
37  whether an arbitrary memory buffer uses the flatbuffer format.
38 
39  After the base header (see the BaseWorker struct below) starts the
40  data format metadata buffer that holds a format-specific struct
41  instance with the user-specified parameters of the storage format.
42  The size of the data format metadata buffer depends on the format id.
43 
44  After the data format metadata buffer starts the data format worker
45  buffer that holds various pre-computed and state parameters that
46  describe the internal data layout of the format. The size of the
47  data format worker buffer depends on the format id.
48 
49  After the data format worker buffer starts the raw data buffer, whose
50  interpretation depends on the format parameters specified above. The
51  size of the raw data buffer depends on the format id and the
52  user-specified parameters in the data format metadata.
53  All buffers above are aligned to 64-bit boundaries.
54 
55  In summary, the memory layout of a flatbuffer is:
56 
57  | <format id> | <flatbuffer size> | <format metadata buffer> | <format worker buffer> | <raw data buffer> | <format id> |
58  =
59  |<-- 8-bytes-->|<-- 8-bytes ------>|<-- metadata buffer size -->|<-- worker buffer size -->|<-- raw data size -->|<-- 8-bytes -->|
60  |<------------------------ flatbuffer size ------------------------------------------------------------------------------------->|
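
The layout summary above can be sketched in a few lines. The example below is
only an illustration under simplifying assumptions: it uses the 16-byte header
shown in the diagram (format id and total size), and the names kDemoFormatId,
align8, demo_total_size, demo_make and demo_is_flatbuffer are hypothetical and
are not part of this header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical 8-byte tag (hex repr of 'demofmt1'); any non-trivial value works.
constexpr int64_t kDemoFormatId = 0x64656d6f666d7431;

// Round a non-negative byte count up to the next multiple of 8.
inline int64_t align8(int64_t n) {
  assert(n >= 0);
  return (n + 7) & ~static_cast<int64_t>(7);
}

// Total size of the simplified layout: 16-byte header, three sub-buffers
// padded to 64-bit boundaries, and an 8-byte footer repeating the format id.
inline int64_t demo_total_size(int64_t metadata_size,
                               int64_t worker_size,
                               int64_t raw_size) {
  return 16 + align8(metadata_size) + align8(worker_size) + align8(raw_size) + 8;
}

// Allocate a zeroed buffer and write the header and the footer.
inline std::vector<int8_t> demo_make(int64_t metadata_size,
                                     int64_t worker_size,
                                     int64_t raw_size) {
  const int64_t total = demo_total_size(metadata_size, worker_size, raw_size);
  std::vector<int8_t> buf(static_cast<std::size_t>(total), 0);
  std::memcpy(buf.data(), &kDemoFormatId, 8);              // header format id
  std::memcpy(buf.data() + 8, &total, 8);                  // total size
  std::memcpy(buf.data() + total - 8, &kDemoFormatId, 8);  // footer format id
  return buf;
}

// Recognize the demo format: known header id, plausible size, matching footer.
inline bool demo_is_flatbuffer(const int8_t* buf) {
  int64_t header = 0, total = 0, footer = 0;
  std::memcpy(&header, buf, 8);
  if (header != kDemoFormatId) {
    return false;
  }
  std::memcpy(&total, buf + 8, 8);
  if (total < 24 || total % 8 != 0) {
    return false;
  }
  std::memcpy(&footer, buf + total - 8, 8);
  return footer == header;
}
```

Storing the format id at both ends is what lets a single pointer be tested: the
size read from the header locates the footer, and a mismatch there indicates
that the buffer is not (or is no longer) a valid flatbuffer.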
61 
62  GeoPoint format specification
63  -----------------------------
64 
65  The GeoPoint format metadata and worker buffers memory layout is
66  described by GeoPoint and GeoPointWorker struct definitions
67  below. The raw data buffer memory layout is as follows:
68 
69  | <point data> |
70  =
71  |<-- (num items) * 2 * (is_geoint ? 4 : 8) bytes -->|
72 
73  where <point data> stores point coordinates in a point-wise manner:
74  X0, Y0, X1, Y1, ...
75 
76  If is_geoint is true, point coordinates are stored as 32-bit integers,
77  otherwise as double-precision floating point numbers.
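
As a small illustration of the point-wise layout, the sketch below computes the
raw data size and interleaves coordinates. It follows valueByteSize() later in
this file in counting two coordinates per item. DemoPoint, geopoint_raw_size
and pack_points are hypothetical names used only here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes needed for the raw data buffer: two coordinates per point, 4 bytes
// each when is_geoint is true and 8 bytes each otherwise.
inline std::size_t geopoint_raw_size(std::size_t num_items, bool is_geoint) {
  return num_items * 2 * (is_geoint ? sizeof(int32_t) : sizeof(double));
}

// Hypothetical point type used only for this illustration.
struct DemoPoint {
  double x;
  double y;
};

// Store points in point-wise order: X0, Y0, X1, Y1, ...
inline std::vector<double> pack_points(const std::vector<DemoPoint>& points) {
  std::vector<double> out;
  out.reserve(points.size() * 2);
  for (const auto& p : points) {
    out.push_back(p.x);
    out.push_back(p.y);
  }
  return out;
}
```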
78 
79  NestedArray format specification
80  --------------------------------
81 
82  NestedArray represents a storage for zero, one, two, and three
83  dimensional ragged arrays. The storage format consists of sizes and
84  values buffers (plus offset buffers to optimize accessing
85  items). The sizes buffer stores the sizes of ragged arrays at
86  various levels and the values buffer stores the values of ragged
87  arrays.
88 
89  The NestedArray storage is used as a uniform storage schema for
90  different types (variable-length arrays, geotypes, etc) with
91  variable dimensionality. For example, a GeoMultiPolygon
92 
93  GeoMultiPolygon([
94    GeoPolygon([LineString([(x000, y000), (x001, y001), ...]),
95                LineString([(x010, y010), (x011, y011), ...]),
96                ...]),
97    GeoPolygon([LineString([(x100, y100), (x101, y101), ...]),
98                LineString([(x110, y110), (x111, y111), ...]),
99                ...]),
100    ...
101  ])
102 
103  is represented as a three dimensional ragged array where the sizes
104  buffer contains the number of polygons in the multi-polygon, all the
105  numbers of linestrings in polygons, all the numbers of points in
106  linestrings, and finally, the values buffer contains all the
107  coordinates. Note that a "value" is defined as a point with two
108  coordinates.
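
To make the three-level decomposition concrete, here is a minimal sketch that
flattens a multi-polygon held in nested std::vector containers into the counts
and values described above. All Demo* names and flatten are hypothetical. The
split into sizes and sizes_of_sizes follows the parameter names of the setItem
overloads listed below, but the in-buffer arrangement used by NestedArray may
differ.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One "value" is a point with two coordinates.
struct DemoPoint {
  double x;
  double y;
};
using DemoLineString = std::vector<DemoPoint>;
using DemoPolygon = std::vector<DemoLineString>;
using DemoMultiPolygon = std::vector<DemoPolygon>;

struct DemoFlat {
  int32_t npolygons = 0;                // number of polygons in the item
  std::vector<int32_t> sizes_of_sizes;  // number of linestrings per polygon
  std::vector<int32_t> sizes;           // number of points per linestring
  std::vector<DemoPoint> values;        // all points, in traversal order
};

// Walk the ragged structure once and record a count at every level.
inline DemoFlat flatten(const DemoMultiPolygon& mp) {
  DemoFlat flat;
  flat.npolygons = static_cast<int32_t>(mp.size());
  for (const auto& polygon : mp) {
    flat.sizes_of_sizes.push_back(static_cast<int32_t>(polygon.size()));
    for (const auto& linestring : polygon) {
      flat.sizes.push_back(static_cast<int32_t>(linestring.size()));
      flat.values.insert(flat.values.end(), linestring.begin(), linestring.end());
    }
  }
  return flat;
}
```

Because the counts are stored separately from the values, the values end up in
one contiguous run, which is what makes a single buffer copy possible.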
109 
110  The current implementation of NestedArray supports dimensionalities
111  up to 3 but the format can be extended to arbitrary dimensions.
112 
113  NestedArray API
114  ---------------
115 
116  To compute the flatbuffer size required to represent a nested array
117  with the given dimensionality, total items count, estimated total
118  sizes and values counts, value type, and user data buffer size,
119  use::
120 
121  int64_t computeBufferSizeNestedArray(ndims,
122  total_items_count,
123  total_sizes_count,
124  total_values_count,
125  value_type,
126  user_data_size)
127 
128  To initialize the provided buffer for nested array format, use::
129 
130  Status .initialize(ndims,
131  total_items_count,
132  total_sizes_count,
133  total_values_count,
134  value_type,
135  null_value_ptr,
136  user_data_ptr, user_data_size)
137 
138  To test if the provided buffer contains an initialized FlatBuffer::
139 
140  bool isFlatBuffer(buffer)
141 
142  To get the size of an initialized FlatBuffer::
143 
144  int64_t getBufferSize(buffer)
145  int64_t .getBufferSize()
146 
147  To get the size of the values buffer::
148 
149  size_t .getValuesBufferSize()
150 
151  To get the size of a value::
152 
153  size_t .getValueSize()
154 
155  To get the number of specified values::
156 
157  size_t .getValuesCount()
158 
159  To get the dimensionality of a nested array::
160 
161  size_t .getNDims()
162 
163  To get various buffers::
164 
165  int8_t* .get_user_data_buffer()
166  int8_t* .get_values_buffer()
167  int8_t* .getValuesBuffer()
168  sizes_t* .get_sizes_buffer()
169  offsets_t* .get_values_offsets()
170  offsets_t* .get_sizes_offsets()
171  int8_t* .getNullValuePtr()
172 
173  To test if a value equals the null value of the nested array::
174 
175  bool .containsNullValue(value_ptr)
176 
177  To get the item (and subitems) of a nested array::
178 
179  template <size_t NDIM>
180  Status getItemWorker(const int64_t index[NDIM],
181  const size_t n,
182  int8_t*& values,
183  int32_t& nof_values,
184  int32_t* sizes_buffers[NDIM],
185  int32_t sizes_lengths[NDIM],
186  int32_t& nof_sizes,
187  bool& is_null)
188 
189  template <size_t NDIM>
190  Status getItem(const int64_t index, NestedArrayItem<NDIM>& result)
191 
192  template <size_t NDIM>
193  Status getItem(const int64_t index[NDIM], const size_t n, NestedArrayItem<NDIM>& result)
194 
195  template <typename CT>
196  Status getItem(const int64_t index,
197  std::vector<CT>& values,
198  std::vector<int32_t>& sizes,
199  bool& is_null)
200 
201  template <typename CT>
202  Status getItem(const int64_t index,
203  std::vector<CT>& values,
204  std::vector<int32_t>& sizes,
205  std::vector<int32_t>& sizes_of_sizes,
206  bool& is_null)
207 
208 
209 
210  To set the item of a nested array::
211 
212  template <size_t NDIM, bool check_sizes=true>
213  Status setItemWorker(const int64_t index,
214  const int8_t* values,
215  const int32_t nof_values,
216  const int32_t* const sizes_buffers[NDIM],
217  const int32_t sizes_lengths[NDIM],
218  const int32_t nof_sizes)
219 
220  template <size_t NDIM=0, bool check_sizes=true>
221  Status setItem(const int64_t index,
222  const int8_t* values_buf,
223  const int32_t nof_values)
224 
225  template <size_t NDIM=1, bool check_sizes=true>
226  Status setItem(const int64_t index,
227  const int8_t* values_buf,
228  const int32_t nof_values,
229  const int32_t* sizes_buf,
230  const int32_t nof_sizes)
231 
232  template <size_t NDIM=2, bool check_sizes=true>
233  Status setItem(const int64_t index,
234  const int8_t* values_buf,
235  const int32_t nof_values,
236  const int32_t* sizes_buf,
237  const int32_t nof_sizes,
238  const int32_t* sizes_of_sizes_buf,
239  const int32_t nof_sizes_of_sizes)
240 
241  template <typename CT, size_t NDIM=0>
242  Status setItem(const int64_t index, const std::vector<CT>& arr)
243 
244  template <typename CT, size_t NDIM=1, bool check_sizes=false>
245  Status setItem(const int64_t index, const std::vector<std::vector<CT>>& item)
246 
247  template <typename CT, size_t NDIM=2, bool check_sizes=false>
248  Status setItem(const int64_t index,
249  const std::vector<std::vector<std::vector<CT>>>& item)
250 
251  template <typename CT, size_t NDIM=1, bool check_sizes=true>
252  Status setItem(const int64_t index,
253  const std::vector<CT>& values,
254  const std::vector<int32_t>& sizes)
255 
256  template <typename CT, size_t NDIM=2, bool check_sizes=true>
257  Status setItem(const int64_t index,
258  const std::vector<CT>& values,
259  const std::vector<int32_t>& sizes,
260  const std::vector<int32_t>& sizes_of_sizes)
261 
262  Status setNull(int64_t index)
263 
264  To test if the item is NULL::
265 
266  Status isNull(index, bool& is_null)
267 
268 
269 
270  FlatBuffer usage
271  ----------------
272 
273  FlatBuffer implements various methods for accessing its content for
274  retrieving or storing data. These methods are usually provided as
275  pairs of safe and unsafe methods. Safe methods validate method
276  inputs and return success or error status depending on the
277  validation results. Unsafe methods (whose names have the "NoValidation"
278  suffix) perform no validation on the inputs and (almost) always
279  return the success status. Use unsafe methods (for efficiency) only
280  when one can prove that the inputs are always valid (e.g. indices
281  are in the expected range, buffer memory allocation is sufficient
282  for storing data, etc). Otherwise, use safe methods that will lead
283  to predictable and non-server-crashing behaviour in the case of
284  possibly invalid input data.
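
The safe/unsafe pairing can be pictured with a toy container. This is a sketch
of the pattern only, not of any method in this header: DemoColumn, DemoStatus
and both getItem variants are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum DemoStatus { DemoSuccess = 0, DemoIndexError = 1 };

struct DemoColumn {
  std::vector<int32_t> data;

  // Unsafe variant: performs no checks. The caller must guarantee that
  // 0 <= index < data.size(), otherwise the behaviour is undefined.
  DemoStatus getItemNoValidation(int64_t index, int32_t& result) const {
    result = data[static_cast<std::size_t>(index)];
    return DemoSuccess;
  }

  // Safe variant: validates the index and reports an error status instead
  // of reading out of range, then delegates to the unsafe variant.
  DemoStatus getItem(int64_t index, int32_t& result) const {
    if (index < 0 || index >= static_cast<int64_t>(data.size())) {
      return DemoIndexError;
    }
    return getItemNoValidation(index, result);
  }
};
```

A loop whose bounds are already known to be valid can call the NoValidation
variant directly, while code handling externally supplied indices would go
through the safe variant and check the returned status.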
285 */
286 // clang-format on
287 
288 #include <cstring>
289 
290 #ifdef FLATBUFFER_ERROR_ABORTS
291 #include "../../Shared/toString.h"
292 #define RETURN_ERROR(exc) \
293  { \
294  PRINT(exc); \
295  abort(); \
296  return (exc); \
297  }
298 #else
299 #define RETURN_ERROR(exc) return (exc)
300 #endif
301 
302 #include "../../Shared/funcannotations.h"
303 
304 #include <float.h>
305 #ifdef HAVE_TOSTRING
306 #include <string.h>
307 #include <ostream>
308 #endif
309 #include <string>
310 #include <vector>
311 
312 #ifdef __CUDACC__
313 #define FLATBUFFER_UNREACHABLE() \
314  { asm("trap;"); }
315 #else
316 #define FLATBUFFER_UNREACHABLE() \
317  { abort(); }
318 #endif
319 
320 // Notice that the format value is used to recognize if a memory
321 // buffer uses some flat buffer format or not. To minimize chances for
322 // false positive test results, use a non-trivial integer value when
323 // introducing new formats.
324 enum FlatBufferFormat {
325  GeoPointFormatId = 0x67656f706f696e74, // hex repr of 'geopoint'
326  NestedArrayFormatId = 0x6e65737465644152 // hex repr of 'nestedAR'
327 };
328 
329 inline int64_t _align_to_int64(int64_t addr) {
330  addr += sizeof(int64_t) - 1;
331  return (int64_t)(((uint64_t)addr >> 3) << 3);
332 }
333 
334 struct FlatBufferManager {
335  enum ValueType {
349  };
350 
351 #ifdef HAVE_TOSTRING
352  static std::string toString(const ValueType& type);
353 #endif
354 
355  static size_t get_size(ValueType type) {
356  switch (type) {
357  case Bool8:
358  case Int8:
359  case UInt8:
360  return 1;
361  case Int16:
362  case UInt16:
363  return 2;
364  case Int32:
365  case UInt32:
366  case Float32:
367  return 4;
368  case Int64:
369  case UInt64:
370  case Float64:
371  case PointInt32:
372  return 8;
373  case PointFloat64:
374  return 16;
375  }
377  return 0;
378  }
379 
380  /*
381  sizes_t is the type of a container size. Here we use int32_t
382  because Geospatial uses it as the type for the vector of ring and
383  polygon sizes.
384 
385  offsets_t is the type of offsets that is used to locate
386  sub-buffers within the FlatBuffer main buffer. Because NULL items
387  are encoded as negative offset values, the offsets type must be a
388  signed type. Hence, we define offsets_t as int64_t.
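
A minimal sketch of how a signed offset can carry a NULL mark. The decoding
step mirrors getValuesCount() later in this file, which maps a negative
stored value v to -(v + 1). The encoding direction and the names
demo_offsets_t, encode_null, is_null_offset and decode_offset are assumptions
made for illustration.

```cpp
#include <cassert>
#include <cstdint>

using demo_offsets_t = int64_t;  // signed, so the sign can mark NULL

// Mark an offset as belonging to a NULL item. The "+ 1" keeps offset 0
// distinguishable from its NULL counterpart, which becomes -1.
inline demo_offsets_t encode_null(demo_offsets_t offset) {
  assert(offset >= 0);
  return -(offset + 1);
}

// A stored offset denotes a NULL item exactly when it is negative.
inline bool is_null_offset(demo_offsets_t stored) {
  return stored < 0;
}

// Recover the plain offset regardless of the NULL mark.
inline demo_offsets_t decode_offset(demo_offsets_t stored) {
  return stored < 0 ? -(stored + 1) : stored;
}
```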
389  */
390 
391  typedef int32_t sizes_t;
392  typedef int64_t offsets_t;
393 
394 #define FLATBUFFER_SIZES_T_VALUE_TYPE Int32
395 #define FLATBUFFER_OFFSETS_T_VALUE_TYPE UInt64
396 
397  struct BaseWorker {
400  offsets_t format_metadata_offset; // the offset of the data format metadata buffer
401  offsets_t format_worker_offset; // the offset of the data format worker buffer
402 #ifdef HAVE_TOSTRING
403  std::string toString() const {
404  std::string result = ::typeName(this) + "{";
405  result += "format_id=" + std::to_string(format_id);
406  result += ",\n flatbuffer_size=" + std::to_string(flatbuffer_size);
407  result += ",\n format_metadata_offset=" + std::to_string(format_metadata_offset);
408  result += ",\n format_worker_offset=" + std::to_string(format_worker_offset);
409  result += "}";
410  return result;
411  }
412 #endif
413  };
414 
415  struct GeoPoint {
416  int64_t total_items_count; // the total number of items
417  int32_t input_srid;
418  int32_t output_srid;
419  bool is_geoint;
420 #ifdef HAVE_TOSTRING
421  std::string toString() const {
422  std::string result = ::typeName(this) + "{";
423  result += "total_items_count=" + std::to_string(total_items_count);
424  result += ", input_srid=" + std::to_string(input_srid);
425  result += ", output_srid=" + std::to_string(output_srid);
426  result += ", is_geoint=" + std::to_string(is_geoint);
427  result += "}";
428  return result;
429  }
430 #endif
431  };
432 
433  struct GeoPointWorker {
434  int64_t values_offset; // the offset of values buffer within the flatbuffer memory
435 #ifdef HAVE_TOSTRING
436  std::string toString() const {
437  std::string result = ::typeName(this) + "{";
438  result += "values_offset=" + std::to_string(values_offset);
439  result += "}";
440  return result;
441  }
442 #endif
443  };
444 
447  // all offsets are in bytes
454  size_t value_size;
455 #ifdef HAVE_TOSTRING
456  std::string toString() const {
457  std::string result = ::typeName(this) + "{";
458  result += "specified_items_count=" + std::to_string(specified_items_count);
459  result += ",\n storage_indices_offset=" + std::to_string(storage_indices_offset);
460  result += ",\n sizes_offsets_offset=" + std::to_string(sizes_offsets_offset);
461  result += ",\n values_offsets_offset=" + std::to_string(values_offsets_offset);
462  result += ",\n sizes_buffer_offset=" + std::to_string(sizes_buffer_offset);
463  result += ",\n values_buffer_offset=" + std::to_string(values_buffer_offset);
464  result +=
465  ",\n user_data_buffer_offset=" + std::to_string(user_data_buffer_offset);
466  result += ",\n value_size=" + std::to_string(value_size);
467  result += "}";
468  return result;
469  }
470 #endif
471  };
472 
473  struct NestedArray {
474  size_t ndims;
480 #ifdef HAVE_TOSTRING
481  std::string toString() const {
482  std::string result = ::typeName(this) + "{";
483  result += "ndims=" + std::to_string(ndims);
484  result += ",\n total_items_count=" + std::to_string(total_items_count);
485  result += ",\n total_sizes_count=" + std::to_string(total_sizes_count);
486  result += ",\n total_values_count=" + std::to_string(total_values_count);
487  result += ",\n value_type=" + FlatBufferManager::toString(value_type);
488  result += ",\n user_data_size=" + std::to_string(user_data_size);
489  result += "}";
490  return result;
491  }
492 #endif
493  };
494 
495  enum Status {
496  Success = 0,
516  };
517 
518  // FlatBuffer main buffer. It is the only member of the FlatBuffer struct.
519  int8_t* buffer;
520 
521  // Check if a buffer contains FlatBuffer formatted data. Useful
522  // mostly for sanity checks and debugging.
523  //
524  // WARNING: calling this function on an uninitialized buffer will lead
525  // to a valgrind memcheck failure. Therefore, if possible, use this
526  // method only in contexts where SQLTypeInfo::usesFlatBuffer()
527  // returns true.
528  HOST DEVICE static bool isFlatBuffer(const void* buffer) {
529  if (buffer) {
530  // warning: assume that buffer size is at least 8 bytes
531  const auto* base = reinterpret_cast<const BaseWorker*>(buffer);
532  FlatBufferFormat header_format = base->format_id;
533  switch (header_format) {
534  case NestedArrayFormatId:
535  case GeoPointFormatId: {
536  int64_t flatbuffer_size = base->flatbuffer_size;
537  if (flatbuffer_size > 0) {
538  FlatBufferFormat footer_format = static_cast<FlatBufferFormat>(
539  ((int64_t*)buffer)[flatbuffer_size / sizeof(int64_t) - 1]);
540  return footer_format == header_format;
541  }
542  break;
543  }
544  default:
545  break;
546  }
547  }
548  return false;
549  }
550 
551  // Return the allocation size of the FlatBuffer storage, in bytes
552  // TODO?: return size_t value, 0 when not a flat buffer
553  static int64_t getBufferSize(const void* buffer) {
554  if (isFlatBuffer(buffer)) {
555  return reinterpret_cast<const BaseWorker*>(buffer)->flatbuffer_size;
556  } else {
557  return -1;
558  }
559  }
560 
561  // Return the allocation size of the FlatBuffer storage, in bytes
562  // TODO?: int64_t -> size_t
563  inline int64_t getBufferSize() const {
564  return reinterpret_cast<const BaseWorker*>(buffer)->flatbuffer_size;
565  }
566 
567  HOST DEVICE inline bool isNestedArray() const {
568  return format() == NestedArrayFormatId;
569  }
570 
571  HOST DEVICE inline size_t getValueSize() const {
572  return getNestedArrayWorker()->value_size;
573  }
574 
575  HOST DEVICE inline size_t getValuesBufferSize() const {
576  const auto* metadata = getNestedArrayMetadata();
577  const auto* worker = getNestedArrayWorker();
578  return worker->value_size * metadata->total_values_count;
579  }
580 
581  HOST DEVICE inline const int8_t* getValuesBuffer() const {
582  const auto offset = getNestedArrayWorker()->values_buffer_offset;
583  return reinterpret_cast<const int8_t*>(buffer + offset);
584  }
585 
586  inline size_t getValuesCount() const {
587  const auto* worker = getNestedArrayWorker();
588  const auto* values_offsets = get_values_offsets();
589  const auto storage_index = worker->specified_items_count;
590  const auto values_offset = values_offsets[storage_index];
591  if (values_offset < 0) {
592  return -(values_offset + 1);
593  }
594  return values_offset;
595  }
596 
597  // Return the format of FlatBuffer
599  const auto* base = reinterpret_cast<const BaseWorker*>(buffer);
600  return base->format_id;
601  }
602 
603  // Return the number of items
604  HOST DEVICE inline int64_t itemsCount() const {
605  switch (format()) {
606  case GeoPointFormatId:
607  return getGeoPointMetadata()->total_items_count;
608  case NestedArrayFormatId:
609  return getNestedArrayMetadata()->total_items_count;
610  default:
611  break;
612  }
613  return -1; // invalid value
614  }
615 
616  // To be deprecated in favor of NestedArray format
617  HOST DEVICE inline int64_t valueByteSize() const {
618  switch (format()) {
619  case GeoPointFormatId:
620  return 2 * (getGeoPointMetadata()->is_geoint ? sizeof(int32_t) : sizeof(double));
621  default:
622  break;
623  }
624  return -1;
625  }
626 
627  // To be deprecated in favor of NestedArray format
628  HOST DEVICE inline int64_t dtypeSize() const { // TODO: use valueByteSize instead
629  switch (format()) {
630  case GeoPointFormatId:
631  return 2 * (getGeoPointMetadata()->is_geoint ? sizeof(int32_t) : sizeof(double));
632  default:
633  break;
634  }
635  return -1;
636  }
637 
638  // To be deprecated in favor of NestedArray format
639  static int64_t compute_flatbuffer_size(FlatBufferFormat format_id,
640  const int8_t* format_metadata_ptr) {
641  int64_t flatbuffer_size = _align_to_int64(sizeof(FlatBufferManager::BaseWorker));
642  switch (format_id) {
643  case GeoPointFormatId: {
644  const auto format_metadata =
645  reinterpret_cast<const GeoPoint*>(format_metadata_ptr);
646  flatbuffer_size += _align_to_int64(sizeof(GeoPoint));
647  flatbuffer_size += _align_to_int64(sizeof(GeoPointWorker));
648  const auto itemsize =
649  2 * (format_metadata->is_geoint ? sizeof(int32_t) : sizeof(double));
650  flatbuffer_size += _align_to_int64(
651  itemsize * format_metadata->total_items_count); // values buffer size
652  break;
653  }
654  default:
656  }
657  flatbuffer_size += _align_to_int64(sizeof(int64_t)); // footer format id
658  return flatbuffer_size;
659  }
660 
662  return reinterpret_cast<FlatBufferManager::BaseWorker*>(buffer);
663  }
664  HOST DEVICE inline const BaseWorker* getBaseWorker() const {
665  return reinterpret_cast<const BaseWorker*>(buffer);
666  }
667 
668 #define FLATBUFFER_MANAGER_FORMAT_TOOLS(TYPENAME) \
669  HOST DEVICE inline TYPENAME##Worker* get##TYPENAME##Worker() { \
670  auto* base = getBaseWorker(); \
671  return reinterpret_cast<TYPENAME##Worker*>(buffer + base->format_worker_offset); \
672  } \
673  HOST DEVICE inline TYPENAME* get##TYPENAME##Metadata() { \
674  auto* base = getBaseWorker(); \
675  return reinterpret_cast<TYPENAME*>(buffer + base->format_metadata_offset); \
676  } \
677  HOST DEVICE inline const TYPENAME##Worker* get##TYPENAME##Worker() const { \
678  const auto* base = getBaseWorker(); \
679  return reinterpret_cast<const TYPENAME##Worker*>(buffer + \
680  base->format_worker_offset); \
681  } \
682  HOST DEVICE inline const TYPENAME* get##TYPENAME##Metadata() const { \
683  const auto* base = getBaseWorker(); \
684  return reinterpret_cast<const TYPENAME*>(buffer + base->format_metadata_offset); \
685  }
686 
687  // To be deprecated in favor of NestedArray format
688  FLATBUFFER_MANAGER_FORMAT_TOOLS(GeoPoint);
689 
690 #define FLATBUFFER_MANAGER_FORMAT_TOOLS_NEW(TYPENAME) \
691  HOST DEVICE inline NestedArrayWorker* get##TYPENAME##Worker() { \
692  auto* base = getBaseWorker(); \
693  return reinterpret_cast<NestedArrayWorker*>(buffer + base->format_worker_offset); \
694  } \
695  HOST DEVICE inline TYPENAME* get##TYPENAME##Metadata() { \
696  auto* base = getBaseWorker(); \
697  return reinterpret_cast<TYPENAME*>(buffer + base->format_metadata_offset); \
698  } \
699  HOST DEVICE inline const NestedArrayWorker* get##TYPENAME##Worker() const { \
700  const auto* base = getBaseWorker(); \
701  return reinterpret_cast<const NestedArrayWorker*>(buffer + \
702  base->format_worker_offset); \
703  } \
704  HOST DEVICE inline const TYPENAME* get##TYPENAME##Metadata() const { \
705  const auto* base = getBaseWorker(); \
706  return reinterpret_cast<const TYPENAME*>(buffer + base->format_metadata_offset); \
707  }
708 
709  FLATBUFFER_MANAGER_FORMAT_TOOLS(NestedArray);
710 
711 #undef FLATBUFFER_MANAGER_FORMAT_TOOLS
712 #undef FLATBUFFER_MANAGER_FORMAT_TOOLS_NEW
713 
714 #define FLATBUFFER_MANAGER_SET_OFFSET(OBJ, NAME, SIZE) \
715  offset = OBJ->NAME##_offset = offset + _align_to_int64(previous_size); \
716  previous_size = SIZE;
717 
718  static int64_t computeBufferSizeNestedArray(int64_t ndims,
719  int64_t total_items_count,
720  int64_t total_sizes_count,
721  int64_t total_values_count,
722  ValueType value_type,
723  size_t user_data_size) {
724  size_t value_size = get_size(value_type);
725  offsets_t flatbuffer_size = _align_to_int64(sizeof(FlatBufferManager::BaseWorker));
726  flatbuffer_size += _align_to_int64(sizeof(NestedArray));
727  flatbuffer_size += _align_to_int64(sizeof(NestedArrayWorker));
728  flatbuffer_size +=
729  _align_to_int64(value_size * (total_values_count + 1)); // values buffer
730  flatbuffer_size +=
731  _align_to_int64(sizeof(sizes_t) * total_sizes_count); // sizes buffer
732  flatbuffer_size +=
733  _align_to_int64(sizeof(offsets_t) * (total_items_count + 1)); // values offsets
734  flatbuffer_size += _align_to_int64(sizeof(offsets_t) *
735  (total_items_count * ndims + 1)); // sizes offsets
736  flatbuffer_size += _align_to_int64(
737  sizeof(sizes_t) * total_items_count); // storage indices, must use signed type
738  flatbuffer_size += _align_to_int64(user_data_size); // user data
739  flatbuffer_size += _align_to_int64(sizeof(int64_t)); // format id
740  return flatbuffer_size;
741  }
742 
744  int64_t total_items_count,
745  int64_t total_sizes_count,
746  int64_t total_values_count,
747  ValueType value_type,
748  const int8_t* null_value_ptr,
749  const int8_t* user_data_ptr,
750  size_t user_data_size) {
751  auto* base = getBaseWorker();
752  base->format_id = NestedArrayFormatId;
753  size_t value_size = get_size(value_type);
754  base->flatbuffer_size = computeBufferSizeNestedArray(ndims,
755  total_items_count,
756  total_sizes_count,
757  total_values_count,
758  value_type,
759  user_data_size);
760  offsets_t offset = 0;
761  size_t previous_size = sizeof(FlatBufferManager::BaseWorker);
762  FLATBUFFER_MANAGER_SET_OFFSET(base, format_metadata, sizeof(NestedArray));
763  FLATBUFFER_MANAGER_SET_OFFSET(base, format_worker, sizeof(NestedArrayWorker));
764 
765  auto* metadata = getNestedArrayMetadata();
766  metadata->ndims = ndims;
767  metadata->total_items_count = total_items_count;
768  metadata->total_sizes_count = total_sizes_count;
769  metadata->total_values_count = total_values_count;
770  metadata->value_type = value_type;
771  metadata->user_data_size = user_data_size;
772 
773  auto* worker = getNestedArrayWorker();
774  worker->specified_items_count = 0;
775  worker->value_size = value_size;
776 
777  FLATBUFFER_MANAGER_SET_OFFSET(
778  worker, values_buffer, value_size * (total_values_count + 1));
779  FLATBUFFER_MANAGER_SET_OFFSET(
780  worker, sizes_buffer, sizeof(sizes_t) * total_sizes_count);
781  FLATBUFFER_MANAGER_SET_OFFSET(
782  worker, values_offsets, sizeof(offsets_t) * (total_items_count + 1));
783  FLATBUFFER_MANAGER_SET_OFFSET(
784  worker, sizes_offsets, sizeof(offsets_t) * (total_items_count * ndims + 1));
785  FLATBUFFER_MANAGER_SET_OFFSET(
786  worker, storage_indices, sizeof(sizes_t) * total_items_count);
787  FLATBUFFER_MANAGER_SET_OFFSET(worker, user_data_buffer, user_data_size);
788 
789  if (base->flatbuffer_size !=
790  offset + _align_to_int64(previous_size) + _align_to_int64(sizeof(int64_t))) {
792  }
793 
794  offsets_t* values_offsets = get_values_offsets();
795  offsets_t* sizes_offsets = get_sizes_offsets();
796  values_offsets[0] = 0;
797  sizes_offsets[0] = 0;
798  sizes_t* storage_indices = get_storage_indices();
799  for (int i = 0; i < total_items_count; i++) {
800  storage_indices[i] = -1;
801  }
802 
803  // the last value in values_buffer stores a null value:
804  int8_t* null_value_buffer = get_values_buffer() + value_size * total_values_count;
805  if (null_value_ptr != nullptr) {
806  if (memcpy(null_value_buffer, null_value_ptr, value_size) == nullptr) {
808  }
809  } else {
810  if (memset(null_value_buffer, 0, value_size) == nullptr) {
812  }
813  }
814 
815  if (user_data_size > 0 && user_data_ptr != nullptr) {
816  int8_t* user_data_buffer = get_user_data_buffer();
817  if (memcpy(user_data_buffer, user_data_ptr, user_data_size) == nullptr) {
819  }
820  }
821 
822  ((int64_t*)buffer)[base->flatbuffer_size / sizeof(int64_t) - 1] =
823  static_cast<int64_t>(base->format_id);
824 
825  if (isFlatBuffer(buffer)) {
826  // make sure that initialization leads to a valid FlatBuffer
827  return Success;
828  }
830  }
831 
832  // To be deprecated in favor of NestedArray format
833  void initialize(FlatBufferFormat format_id, const int8_t* format_metadata_ptr) {
834  auto* base = getBaseWorker();
835  base->format_id = format_id;
836  base->flatbuffer_size = compute_flatbuffer_size(format_id, format_metadata_ptr);
837  base->format_metadata_offset = _align_to_int64(sizeof(FlatBufferManager::BaseWorker));
838  switch (format_id) {
839  case NestedArrayFormatId:
841  break;
842  case GeoPointFormatId: {
843  base->format_worker_offset =
844  base->format_metadata_offset + _align_to_int64(sizeof(GeoPoint));
845 
846  const auto* format_metadata =
847  reinterpret_cast<const GeoPoint*>(format_metadata_ptr);
848  auto* this_metadata = getGeoPointMetadata();
849  this_metadata->total_items_count = format_metadata->total_items_count;
850  this_metadata->input_srid = format_metadata->input_srid;
851  this_metadata->output_srid = format_metadata->output_srid;
852  this_metadata->is_geoint = format_metadata->is_geoint;
853 
854  auto* this_worker = getGeoPointWorker();
855  this_worker->values_offset =
856  base->format_worker_offset + _align_to_int64(sizeof(GeoPointWorker));
857  break;
858  }
859  }
860  ((int64_t*)buffer)[base->flatbuffer_size / sizeof(int64_t) - 1] =
861  static_cast<int64_t>(format_id);
862  }
863 
864  // Low-level API
865 
866  HOST DEVICE inline size_t getNDims() const {
867  if (isNestedArray()) {
868  return getNestedArrayMetadata()->ndims;
869  }
871  return 0;
872  }
873 
874  // Return the upper bound to the total number of points in all items
875  // To be deprecated in favor of NestedArray format
876  inline int64_t get_max_nof_values() const {
877  switch (format()) {
878  case GeoPointFormatId:
879  return getGeoPointMetadata()->total_items_count;
880  default:
881  break;
882  }
883  return -1;
884  }
885 
886  // Return the number of specified items
887  // To be deprecated in favor of NestedArray format
888  HOST DEVICE inline int64_t& get_storage_count() {
889  switch (format()) {
890  case GeoPointFormatId:
891  return getGeoPointMetadata()->total_items_count;
892  default:
893  break;
894  }
895  static int64_t dummy_storage_count = -1;
896  return dummy_storage_count;
897  }
898 
899  // To be deprecated in favor of NestedArray format
900  inline const int64_t& get_storage_count() const {
901  switch (format()) {
902  case GeoPointFormatId:
903  return getGeoPointMetadata()->total_items_count;
904  default:
905  break;
906  }
907  static int64_t dummy = -1;
908  return dummy;
909  }
910 
911  // Return the size of values buffer in bytes
912  // To be deprecated in favor of NestedArray format
913  inline int64_t get_values_buffer_size() const {
914  switch (format()) {
915  case GeoPointFormatId: {
916  const auto* metadata = getGeoPointMetadata();
917  const auto itemsize =
918  2 * (metadata->is_geoint ? sizeof(int32_t) : sizeof(double));
919  return _align_to_int64(itemsize * metadata->total_items_count);
920  }
921  default:
922  break;
923  }
924  static int64_t dummy = -1;
925  return dummy;
926  }
927 
928  // Return the pointer to values buffer
929  // To be deprecated in favor of NestedArray format
930  HOST DEVICE inline int8_t* get_values() {
931  int64_t offset = 0;
932  switch (format()) {
933  case GeoPointFormatId:
934  offset = getGeoPointWorker()->values_offset;
935  break;
936  default:
937  return nullptr;
938  }
939  return buffer + offset;
940  }
941 
942  // To be deprecated in favor of NestedArray format
943  HOST DEVICE inline const int8_t* get_values() const {
944  int64_t offset = 0;
945  switch (format()) {
946  case GeoPointFormatId:
947  offset = getGeoPointWorker()->values_offset;
948  break;
949  default:
950  return nullptr;
951  }
952  return buffer + offset;
953  }
954 
955 #define FLATBUFFER_GET_BUFFER_METHODS(BUFFERNAME, BUFFERTYPE) \
956  HOST DEVICE inline BUFFERTYPE* get_##BUFFERNAME() { \
957  int64_t offset = 0; \
958  switch (format()) { \
959  case NestedArrayFormatId: \
960  offset = getNestedArrayWorker()->BUFFERNAME##_offset; \
961  break; \
962  default: \
963  return nullptr; \
964  } \
965  return reinterpret_cast<BUFFERTYPE*>(buffer + offset); \
966  } \
967  HOST DEVICE inline const BUFFERTYPE* get_##BUFFERNAME() const { \
968  int64_t offset = 0; \
969  switch (format()) { \
970  case NestedArrayFormatId: \
971  offset = getNestedArrayWorker()->BUFFERNAME##_offset; \
972  break; \
973  default: \
974  return nullptr; \
975  } \
976  return reinterpret_cast<BUFFERTYPE*>(buffer + offset); \
977  }
978 
979  FLATBUFFER_GET_BUFFER_METHODS(user_data_buffer, int8_t);
980  FLATBUFFER_GET_BUFFER_METHODS(values_buffer, int8_t);
981  FLATBUFFER_GET_BUFFER_METHODS(sizes_buffer, sizes_t);
982  FLATBUFFER_GET_BUFFER_METHODS(values_offsets, offsets_t);
984 
985 #undef FLATBUFFER_GET_BUFFER_METHODS
986 
987  HOST DEVICE inline const int8_t* getNullValuePtr() const {
988  if (isNestedArray()) {
989  return get_values_buffer() + getValuesBufferSize();
990  }
991  return nullptr;
992  }
993 
994  HOST DEVICE inline bool containsNullValue(const int8_t* value_ptr) const {
995  const int8_t* null_value_ptr = getNullValuePtr();
996  if (null_value_ptr != nullptr) {
997  switch (getValueSize()) {
998  case 1:
999  return *null_value_ptr == *value_ptr;
1000  case 2:
1001  return *reinterpret_cast<const int16_t*>(null_value_ptr) ==
1002  *reinterpret_cast<const int16_t*>(value_ptr);
1003  case 4:
1004  return *reinterpret_cast<const int32_t*>(null_value_ptr) ==
1005  *reinterpret_cast<const int32_t*>(value_ptr);
1006  case 8:
1007  return *reinterpret_cast<const int64_t*>(null_value_ptr) ==
1008  *reinterpret_cast<const int64_t*>(value_ptr);
1009  case 16:
1010  return (*reinterpret_cast<const int64_t*>(null_value_ptr) ==
1011  *reinterpret_cast<const int64_t*>(value_ptr) &&
1012  *(reinterpret_cast<const int64_t*>(null_value_ptr) + 1) ==
1013  *(reinterpret_cast<const int64_t*>(value_ptr) + 1));
1014  default:
1015  break;
1016  }
1017  }
1018  return false;
1019  }
1020 
1021  HOST DEVICE inline sizes_t* get_storage_indices() {
1022  offsets_t offset = 0;
1023  switch (format()) {
1024  case NestedArrayFormatId:
1025  offset = getNestedArrayWorker()->storage_indices_offset;
1026  break;
1027  default:
1028  return nullptr;
1029  }
1030  return reinterpret_cast<sizes_t*>(buffer + offset);
1031  }
1032 
1033  HOST DEVICE inline const sizes_t* get_storage_indices() const {
1034  offsets_t offset = 0;
1035  switch (format()) {
1036  case NestedArrayFormatId:
1037  offset = getNestedArrayWorker()->storage_indices_offset;
1038  break;
1039  default:
1040  return nullptr;
1041  }
1042  return reinterpret_cast<sizes_t*>(buffer + offset);
1043  }
1044 
1045  HOST DEVICE inline sizes_t get_storage_index(const int64_t index) const {
1046  return get_storage_indices()[index];
1047  }
1048 
1049  // High-level API
1050 
1051  template <size_t NDIM>
1052  HOST DEVICE Status isNull(const int64_t index[NDIM], const size_t n, bool& is_null) {
1053  if (isNestedArray()) {
1054  NestedArrayItem<NDIM> item;
1055  auto status = getItem<NDIM>(index, n, item);
1056  if (status != Success) {
1057  RETURN_ERROR(status);
1058  }
1059  is_null = item.is_null;
1060  return Success;
1061  }
1063  }
1064 
1065  template <size_t NDIM = 1>
1066  HOST DEVICE Status getLength(const int64_t index, size_t& length) {
1067  const int64_t index_[NDIM] = {index};
1068  return getLength<NDIM>(index_, 1, length);
1069  }
1070 
1071  // This getLength overload is the worker method for reading item
1072  // lengths from the flatbuffer content.
1073  template <size_t NDIM>
1074  HOST DEVICE Status getLength(const int64_t index[NDIM],
1075  const size_t n,
1076  size_t& length) const {
1077  if (!isNestedArray()) {
1079  }
1080  const size_t ndims = getNDims();
1081  if (n == 0) {
1082  length = itemsCount();
1083  return Success;
1084  }
1085  if (n > ndims + 1) {
1087  }
1088  if (index[0] < 0 || index[0] >= itemsCount()) {
1089  // Callers may interpret the IndexError as a NULL value return
1090  return IndexError;
1091  }
1092  const auto storage_index = get_storage_index(index[0]);
1093  if (storage_index < 0) {
1094  return ItemUnspecifiedError;
1095  }
1096  const auto* values_offsets = get_values_offsets();
1097  const auto values_offset = values_offsets[storage_index];
1098  if (values_offset < 0) { // NULL item
1099  length = 0;
1100  return Success;
1101  }
1102  const auto* sizes_offsets = get_sizes_offsets();
1103  const auto* sizes_buffer = get_sizes_buffer();
1104  const auto sizes_offset = sizes_offsets[storage_index * ndims];
1105  switch (n) {
1106  case 1: {
1107  length = sizes_buffer[sizes_offset];
1108  } break;
1109  case 2: {
1110  const auto sizes2_offset = sizes_offsets[storage_index * ndims + 1];
1111  if (index[1] < 0 || index[1] >= sizes_buffer[sizes_offset]) {
1113  }
1114  length = sizes_buffer[sizes2_offset + index[1]];
1115  } break;
1116  case 3: {
1117  const auto sizes2_offset = sizes_offsets[storage_index * ndims + 1];
1118  const auto sizes3_offset = sizes_offsets[storage_index * ndims + 2];
1119  if (index[1] < 0 || index[1] >= sizes_buffer[sizes_offset]) {
1121  }
1122  if (index[2] < 0 || index[2] >= sizes_buffer[sizes2_offset + index[1]]) {
1124  }
1125  offsets_t soffset = 0;
1126  for (int64_t i = 0; i < index[1]; i++) {
1127  soffset += sizes_buffer[sizes2_offset + i];
1128  }
1129  length = sizes_buffer[sizes3_offset + soffset + index[2]];
1130  } break;
1131  default:
1133  break;
1134  }
1135  return Success;
1136  }
1137 
1138  // getItemWorker is the worker method for accessing the
1139  // flatbuffer content.
1140  template <size_t NDIM>
1141  HOST DEVICE Status getItemWorker(const int64_t index[NDIM],
1142  const size_t n,
1143  int8_t*& values,
1144  int32_t& nof_values,
1145  int32_t* sizes_buffers[NDIM],
1146  int32_t sizes_lengths[NDIM],
1147  int32_t& nof_sizes,
1148  bool& is_null) {
1149  values = nullptr;
1150  nof_values = 0;
1151  nof_sizes = 0;
1152  std::memset(sizes_buffers, 0, NDIM * sizeof(int32_t*));
1153  std::memset(sizes_lengths, 0, NDIM * sizeof(int32_t));
1154  is_null = true;
1155 
1156  if (!isNestedArray()) {
1158  }
1159 
1160  const size_t ndims = getNDims();
1161  if (n == 0 || n > ndims + 1) {
1163  }
1164  // clang-format off
1165  /*
1166  multipolygon (ndims == 3):
1167 
1168  n == 0 means return a column of multipolygons: flatbuffer, getLength returns
1169  itemsCount()
1170 
1171  n == 1 means return a multipolygon: values, sizes(=sizes_buffers[1]),
1172  sizes_of_sizes(=sizes_buffers[0]), getLength returns
1173  len(sizes_of_sizes)(=sizes_lengths[0])
1174 
1175  n == 2 means return a polygon: values, sizes, getLength
1176  returns len(sizes)
1177 
1178  n == 3 means return a linestring: values, getLength returns
1179  len(values)
1180 
1181  n == 4 means return a point: value, getLength returns 0 [NOTIMPL]
1182 
1183  polygon/multilinestring (ndims == 2):
1184 
1185  n == 0 means return a column of polygons/multilinestring:
1186  flatbuffer, getLength returns itemsCount()
1187 
1188  n == 1 means return a polygon/multilinestring: values, sizes,
1189  getLength returns len(sizes)
1190 
1191  n == 2 means return a linestring: values, getLength
1192  returns len(values)
1193 
1194  n == 3 means return a point: value, getLength returns 0 [NOTIMPL]
1195 
1196  linestring/multipoint/array of scalars/textencodingnone (ndims == 1):
1197 
1198  n == 0 means return a column of linestring/multipoint:
1199  flatbuffer, getLength returns itemsCount()
1200 
1201  n == 1 means return a linestring: values, getLength returns
1202  len(values)
1203 
1204  n == 2 means return a point: value, getLength returns 0 [NOTIMPL]
1205  */
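      /*
        Worked example (illustrative values derived from the index arithmetic
        in this method; not taken from the original source). Assume
        ndims == 3 and a single multipolygon whose sizes_offset == 0:

          polygon 0: 1 ring  with 4 points
          polygon 1: 2 rings with 3 and 5 points

          sizes_buffer[0]    == 2          number of polygons (sizes_offset == 0)
          sizes_buffer[1..2] == {1, 2}     rings per polygon  (sizes2_offset == 1)
          sizes_buffer[3..5] == {4, 3, 5}  points per ring    (sizes3_offset == 3)

        For n == 3 and index == {i, 1, 1} (second ring of the second polygon),
        the n == 3 branch below accumulates soffset == 1 (rings in polygon 0)
        and voffset == 4 + 3 == 7 (points that precede the requested ring).
        Hence values points (values_offset + 7) * valuesize bytes into the
        values buffer and nof_values == sizes_buffer[3 + 1 + 1] == 5.
      */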
1206  // clang-format on
1207  if (index[0] < 0 || index[0] >= itemsCount()) {
1208  // Callers may interpret the IndexError as a NULL value return
1209  return IndexError;
1210  }
1211  const auto storage_index = get_storage_index(index[0]);
1212  if (storage_index < 0) {
1213  return ItemUnspecifiedError;
1214  }
1215  const auto* values_offsets = get_values_offsets();
1216  const auto values_offset = values_offsets[storage_index];
1217  if (values_offset < 0) {
1218  return Success;
1219  }
1220  is_null = false;
1221  const auto* sizes_offsets = get_sizes_offsets();
1222  auto* sizes_buffer = get_sizes_buffer();
1223  auto* values_buffer = get_values_buffer();
1224  const auto valuesize = getValueSize();
1225  const auto next_values_offset = values_offsets[storage_index + 1];
1226 
1227  const auto sizes_offset = sizes_offsets[storage_index * ndims];
1228  nof_sizes = ndims - n;
1229  switch (n) {
1230  case 1: {
1231  if (next_values_offset < 0) {
1232  nof_values = -(next_values_offset + 1) - values_offset;
1233  } else {
1234  nof_values = next_values_offset - values_offset;
1235  }
1236  values = values_buffer + values_offset * valuesize;
1237  switch (ndims) {
1238  case 3: {
1239  const auto sizes2_offset = sizes_offsets[storage_index * ndims + 1];
1240  const auto sizes3_offset = sizes_offsets[storage_index * ndims + 2];
1241  sizes_buffers[0] = sizes_buffer + sizes2_offset;
1242  sizes_buffers[1] = sizes_buffer + sizes3_offset;
1243  sizes_lengths[0] = sizes_buffer[sizes_offset];
1244  sizes_lengths[1] = sizes_offsets[storage_index * ndims + 3] - sizes3_offset;
1245  break;
1246  }
1247  case 2: {
1248  const auto sizes2_offset = sizes_offsets[storage_index * ndims + 1];
1249  sizes_buffers[0] = sizes_buffer + sizes2_offset;
1250  sizes_lengths[0] = sizes_buffer[sizes_offset];
1251  break;
1252  }
1253  case 1:
1254  break;
1255  default:
1257  break;
1258  }
1259  break;
1260  }
1261  case 2: {
1262  const auto sizes2_offset = sizes_offsets[storage_index * ndims + 1];
1263  if (index[1] < 0 || index[1] >= sizes_buffer[sizes_offset]) {
1265  }
1266  offsets_t soffset = 0;
1267  for (int64_t i = 0; i < index[1]; i++) {
1268  soffset += sizes_buffer[sizes2_offset + i];
1269  }
1270  values = values_buffer + (values_offset + soffset) * valuesize;
1271  switch (ndims) {
1272  case 3: {
1273  const auto sizes3_offset = sizes_offsets[storage_index * ndims + 2];
1274  const sizes_t nsizes = sizes_buffer[sizes2_offset + index[1]];
1275  auto sizes_buf = sizes_buffer + sizes3_offset + soffset;
1276  sizes_buffers[0] = sizes_buf;
1277  sizes_lengths[0] = nsizes;
1278  nof_values = 0;
1279  for (int64_t i = 0; i < nsizes; i++) {
1280  nof_values += sizes_buf[i];
1281  }
1282  break;
1283  }
1284  case 2: {
1285  nof_values = sizes_buffer[sizes2_offset + index[1]];
1286  break;
1287  }
1288  case 1: {
1289  nof_values = 1;
1290  is_null = containsNullValue(values);
1291  break;
1292  }
1293  default:
1295  break;
1296  }
1297  break;
1298  }
1299  case 3: {
1300  if (ndims != 3) {
1302  }
1303  const auto sizes2_offset = sizes_offsets[storage_index * ndims + 1];
1304  const auto sizes3_offset = sizes_offsets[storage_index * ndims + 2];
1305  if (index[1] < 0 || index[1] >= sizes_buffer[sizes_offset]) {
1307  }
1308  if (index[2] < 0 || index[2] >= sizes_buffer[sizes2_offset + index[1]]) {
1310  }
1311 
1312  int64_t i3 = 0;
1313  int64_t soffset = 0;
1314  int64_t voffset = 0;
1315  for (int64_t i = 0; i < index[1]; i++) {
1316  auto size2 = sizes_buffer[sizes2_offset + i];
1317  soffset += size2;
1318  for (int64_t j = 0; j < size2; j++) {
1319  voffset += sizes_buffer[sizes3_offset + i3];
1320  i3++;
1321  }
1322  }
1323  for (int64_t j = 0; j < index[2]; j++) {
1324  voffset += sizes_buffer[sizes3_offset + i3];
1325  i3++;
1326  }
1327  values = values_buffer + (values_offset + voffset) * valuesize;
1328  nof_values = sizes_buffer[sizes3_offset + soffset + index[2]];
1329  break;
1330  }
1331  default:
1333  break;
1334  }
1335  return Success;
1336  }
1337 
1338  template <size_t NDIM>
1339  struct NestedArrayItem {
1340  int8_t* values;
1341  int32_t nof_values;
1342  int32_t* sizes_buffers[NDIM];
1343  int32_t sizes_lengths[NDIM];
1344  int32_t nof_sizes;
1345  bool is_null;
1346  };
1347 
1348  template <size_t NDIM>
1349  HOST DEVICE Status getItem(const int64_t index, NestedArrayItem<NDIM>& result) {
1350  const int64_t index_[NDIM] = {index};
1351  return getItem<NDIM>(index_, 1, result);
1352  }
1353 
1354  template <size_t NDIM>
1355  HOST DEVICE Status getItem(const int64_t index[NDIM], const size_t n, NestedArrayItem<NDIM>& result) {
1356  return getItemWorker<NDIM>(index, n,
1357  result.values,
1358  result.nof_values,
1359  result.sizes_buffers,
1360  result.sizes_lengths,
1361  result.nof_sizes,
1362  result.is_null);
1363  }
1364 
1365  // setItemWorker is the worker method for initializing the
1366  // flatbuffer content. It can be called only once per index value.
1367  template <size_t NDIM, bool check_sizes=true>
1368  HOST DEVICE Status setItemWorker(const int64_t index,
1369  const int8_t* values,
1370  const int32_t nof_values,
1371  const int32_t* const sizes_buffers[NDIM],
1372  const int32_t sizes_lengths[NDIM],
1373  const int32_t nof_sizes) {
1374  if (format() != NestedArrayFormatId) {
1376  }
1377  if (index < 0 || index >= itemsCount()) {
1379  }
1380  const int32_t ndims = getNDims();
1381  if (nof_sizes + 1 != ndims) {
1383  }
1384 
1385  auto* storage_indices = get_storage_indices();
1386  if (storage_indices[index] >= 0) {
1388  }
1389  auto* worker = getNestedArrayWorker();
1390  const auto storage_index = worker->specified_items_count;
1391  storage_indices[index] = storage_index;
1392  worker->specified_items_count++;
1393 
1394  auto* values_offsets = get_values_offsets();
1395  auto* sizes_offsets = get_sizes_offsets();
1396  auto* sizes_buffer = get_sizes_buffer();
1397  auto* values_buffer = get_values_buffer();
1398  const auto* metadata = getNestedArrayMetadata();
1399  const auto valuesize = getValueSize();
1400 
1401  auto values_offset = values_offsets[storage_index];
1402  const auto sizes_offset = sizes_offsets[storage_index * ndims];
1403  if (values_offset + nof_values > metadata->total_values_count) {
1405  }
1406 
1407  switch (ndims) {
1408  case 1: {
1409  sizes_buffer[sizes_offset] = nof_values;
1410  sizes_offsets[storage_index * ndims + 1] = sizes_offset + 1;
1411  } break;
1412  case 2: {
1413  const auto sizes2_offset = sizes_offset + 1;
1414  if (sizes2_offset + sizes_lengths[0] > metadata->total_sizes_count) {
1416  }
1417  sizes_buffer[sizes_offset] = sizes_lengths[0];
1418  if constexpr (check_sizes) {
1419  // check consistency of sizes and nof_values
1420  int32_t sum_of_sizes = 0;
1421  for (int32_t i=0; i < sizes_lengths[0]; i++) {
1422  sum_of_sizes += sizes_buffers[0][i];
1423  }
1424  if (nof_values != sum_of_sizes) {
1426  }
1427  }
1428  if (memcpy(sizes_buffer + sizes2_offset,
1429  sizes_buffers[0],
1430  sizes_lengths[0] * sizeof(sizes_t)) == nullptr) {
1432  }
1433  sizes_offsets[storage_index * ndims + 1] = sizes2_offset;
1434  sizes_offsets[storage_index * ndims + 2] = sizes2_offset + sizes_lengths[0];
1435  } break;
1436  case 3: {
1437  const auto sizes2_offset = sizes_offset + 1;
1438  const auto sizes3_offset = sizes2_offset + sizes_lengths[0];
1439  if (sizes2_offset + sizes_lengths[0] + sizes_lengths[1] >
1440  metadata->total_sizes_count) {
1442  }
1443  sizes_buffer[sizes_offset] = sizes_lengths[0];
1444  if constexpr (check_sizes) {
1445  // check consistency of sizes of sizes and nof_sizes
1446  int32_t sum_of_sizes_of_sizes = 0;
1447  for (int32_t i=0; i < sizes_lengths[0]; i++) {
1448  sum_of_sizes_of_sizes += sizes_buffers[0][i];
1449  }
1450  if (sizes_lengths[1] != sum_of_sizes_of_sizes) {
1452  }
1453  }
1454  if (memcpy(sizes_buffer + sizes2_offset,
1455  sizes_buffers[0],
1456  sizes_lengths[0] * sizeof(sizes_t)) == nullptr) {
1458  }
1459  if constexpr (check_sizes) {
1460  // check consistency of sizes and nof_values
1461  int32_t sum_of_sizes = 0;
1462  for (int32_t i=0; i < sizes_lengths[1]; i++) {
1463  sum_of_sizes += sizes_buffers[1][i];
1464  }
1465  if (nof_values != sum_of_sizes) {
1467  }
1468  }
1469  if (memcpy(sizes_buffer + sizes3_offset,
1470  sizes_buffers[1],
1471  sizes_lengths[1] * sizeof(sizes_t)) == nullptr) {
1473  }
1474  sizes_offsets[storage_index * ndims + 1] = sizes2_offset;
1475  sizes_offsets[storage_index * ndims + 2] = sizes3_offset;
1476  sizes_offsets[storage_index * ndims + 3] = sizes3_offset + sizes_lengths[1];
1477  } break;
1478  default:
1480  break;
1481  }
1482  if (values != nullptr) {
1483  if (memcpy(values_buffer + values_offset * valuesize,
1484  values,
1485  nof_values * valuesize) == nullptr) {
1487  }
1488  }
1489  values_offsets[storage_index + 1] = values_offset + nof_values;
1490  return Success;
1491  }
1492 
1493  // concatItemWorker is the worker method for appending to the
1494  // flatbuffer content. It can be called once per index value, or
1495  // multiple times provided that the index refers to the last
1496  // specified item.
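      //
      // Worked example (illustrative values, assuming ndims == 1): suppose
      // the last specified item has storage index s with
      // values_offsets[s] == 10, values_offsets[s + 1] == 13 and
      // sizes_buffer[sizes_offset] == 3. Concatenating nof_values == 2
      // copies the new values to value position 13, updates
      // sizes_buffer[sizes_offset] to 5 and values_offsets[s + 1] to 15.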
1497  template <size_t NDIM, bool check_sizes=true>
1498  HOST DEVICE Status concatItemWorker(const int64_t index,
1499  const int8_t* values,
1500  const int32_t nof_values,
1501  const int32_t* const sizes_buffers[NDIM],
1502  const int32_t sizes_lengths[NDIM],
1503  const int32_t nof_sizes) {
1504  if (format() != NestedArrayFormatId) {
1506  }
1507  if (index < 0 || index >= itemsCount()) {
1509  }
1510  const int32_t ndims = getNDims();
1511  if (nof_sizes + 1 != ndims) {
1513  }
1514  auto* storage_indices = get_storage_indices();
1515  auto storage_index = storage_indices[index];
1516  if (storage_index == -1) { // unspecified, so setting the item
1517  return setItemWorker<NDIM, check_sizes>(index,
1518  values,
1519  nof_values,
1520  sizes_buffers,
1521  sizes_lengths,
1522  nof_sizes);
1523  }
1524  auto* worker = getNestedArrayWorker();
1525  if (storage_index != worker->specified_items_count - 1) {
1526  // index does not correspond to the last item; only the last
1527  // item can be concatenated at the moment.
1528  // TODO: to support intermediate item concatenation, we'll need
1529  // to move the content of sizes and values buffers of the
1530  // following items to make extra room for the to-be concatenated
1531  // item buffers.
1533  }
1534  auto* values_offsets = get_values_offsets();
1535  auto values_offset = values_offsets[storage_index];
1536  if (values_offset < 0) {
1537  // TODO: support concat to null
1539  }
1540  auto* values_buffer = get_values_buffer();
1541  auto* sizes_offsets = get_sizes_offsets();
1542  auto* sizes_buffer = get_sizes_buffer();
1543  const auto sizes_offset = sizes_offsets[storage_index * ndims];
1544  const auto* metadata = getNestedArrayMetadata();
1545  const auto valuesize = getValueSize();
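1546  values_offset = values_offsets[storage_index + 1];  // append after the values already stored (for ndims == 2, sizes_buffer[sizes_offset] counts sub-arrays, not values)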
1547  if (values_offset + nof_values > metadata->total_values_count) {
1549  }
1550  switch (ndims) {
1551  case 1:
1552  sizes_buffer[sizes_offset] += nof_values;
1553  break;
1554  case 2: {
1555  const auto sizes2_offset = sizes_offset + 1;
1556  if (sizes2_offset + sizes_buffer[sizes_offset] + sizes_lengths[0] > metadata->total_sizes_count) {
1558  }
1559  if constexpr (check_sizes) {
1560  // check consistency of sizes and nof_values
1561  int32_t sum_of_sizes = 0;
1562  for (int32_t i=0; i < sizes_lengths[0]; i++) {
1563  sum_of_sizes += sizes_buffers[0][i];
1564  }
1565  if (nof_values != sum_of_sizes) {
1567  }
1568  }
1569  if (memcpy(sizes_buffer + sizes2_offset + sizes_buffer[sizes_offset],
1570  sizes_buffers[0],
1571  sizes_lengths[0] * sizeof(sizes_t)) == nullptr) {
1573  }
1574  sizes_buffer[sizes_offset] += sizes_lengths[0];
1575  sizes_offsets[storage_index * ndims + 2] += sizes_lengths[0];
1576  }
1577  break;
1578  case 3: // TODO: concat multipolygons
1580  default:
1582  break;
1583  }
1584  if (nof_values > 0 && values != nullptr &&
1585  memcpy(values_buffer + values_offset * valuesize, values, nof_values * valuesize) == nullptr) {
1587  }
1588  values_offsets[storage_index + 1] += nof_values;
1589  return Success;
1590  }
1591 
1592  template <size_t NDIM=1, bool check_sizes=true>
1593  HOST DEVICE Status setItem(const int64_t index,
1594  const int8_t* values_buf,
1595  const int32_t nof_values) {
1596  const int32_t* const sizes_buffers[1] = {nullptr};
1597  int32_t sizes_lengths[1] = {0};
1598  return setItemWorker<1, check_sizes>(index,
1599  values_buf,
1600  nof_values,
1601  sizes_buffers,
1602  sizes_lengths,
1603  0);
1604  }
1605 
1606  template <size_t NDIM=1, bool check_sizes=true>
1607  HOST DEVICE Status setItem(const int64_t index,
1608  const int8_t* values_buf,
1609  const int32_t nof_values,
1610  const int32_t* sizes_buf,
1611  const int32_t nof_sizes) {
1612  const int32_t* const sizes_buffers[NDIM] = {sizes_buf};
1613  int32_t sizes_lengths[NDIM] = {nof_sizes};
1614  return setItemWorker<NDIM, check_sizes>(index,
1615  values_buf,
1616  nof_values,
1617  sizes_buffers,
1618  sizes_lengths,
1619  static_cast<int32_t>(NDIM));
1620  }
1621 
1622  template <size_t NDIM=2, bool check_sizes=true>
1623  HOST DEVICE Status setItem(const int64_t index,
1624  const int8_t* values_buf,
1625  const int32_t nof_values,
1626  const int32_t* sizes_buf,
1627  const int32_t nof_sizes,
1628  const int32_t* sizes_of_sizes_buf,
1629  const int32_t nof_sizes_of_sizes) {
1630  const int32_t* const sizes_buffers[NDIM] = {sizes_of_sizes_buf, sizes_buf};
1631  int32_t sizes_lengths[NDIM] = {nof_sizes_of_sizes, nof_sizes};
1632  return setItemWorker<NDIM, check_sizes>(index,
1633  values_buf,
1634  nof_values,
1635  sizes_buffers,
1636  sizes_lengths,
1637  static_cast<int32_t>(NDIM));
1638  }
1639 
1640  template <size_t NDIM=1, bool check_sizes=true>
1641  HOST DEVICE Status setItem(const int64_t index,
1642  const NestedArrayItem<NDIM>& item) {
1643  if (item.is_null) {
1644  return Success;
1645  }
1646  return setItemWorker<NDIM, check_sizes>(index,
1647  item.values,
1648  item.nof_values,
1649  item.sizes_buffers,
1650  item.sizes_lengths,
1651  item.nof_sizes);
1652  }
1653 
1654  template <size_t NDIM=0, bool check_sizes=true>
1655  HOST DEVICE Status concatItem(const int64_t index,
1656  const int8_t* values_buf,
1657  const int32_t nof_values) {
1658  const int32_t* const sizes_buffers[1] = {nullptr};
1659  int32_t sizes_lengths[1] = {0};
1660  return concatItemWorker<1, check_sizes>(index,
1661  values_buf,
1662  nof_values,
1663  sizes_buffers,
1664  sizes_lengths,
1665  0);
1666  }
1667 
1668  template <size_t NDIM=1, bool check_sizes=true>
1669  HOST DEVICE Status concatItem(const int64_t index,
1670  const NestedArrayItem<NDIM>& item) {
1671  if (item.is_null) {
1672  return Success;
1673  }
1674  return concatItemWorker<NDIM, check_sizes>(index,
1675  item.values,
1676  item.nof_values,
1677  item.sizes_buffers,
1678  item.sizes_lengths,
1679  item.nof_sizes);
1680  }
1681 
1682  Status getItem(const int64_t index, std::string& s, bool& is_null) {
1683  is_null = true;
1684  const auto* metadata = getNestedArrayMetadata();
1685  if (metadata->value_type != Int8) {
1687  }
1688  NestedArrayItem<1> item;
1689  Status status = getItem(index, item);
1690  if (status != Success) {
1691  return status;
1692  }
1693  if (item.nof_sizes != 0) {
1695  }
1696  if (!item.is_null) {
1697  s.assign(reinterpret_cast<const char*>(item.values), static_cast<size_t>(item.nof_values));
1698  is_null = false;
1699  }
1700  return status;
1701  }
1702 
1703  template <typename CT>
1704  Status getItem(const int64_t index,
1705  std::vector<CT>& values,
1706  std::vector<int32_t>& sizes,
1707  bool& is_null) {
1708  if constexpr (!std::is_same<CT, uint8_t>::value) {
1709  if constexpr (std::is_same<CT, double>::value) {
1710  const auto* metadata = getNestedArrayMetadata();
1711  if (metadata->value_type != PointFloat64) {
1713  }
1714  } else if constexpr (std::is_same<CT, int32_t>::value) {
1715  const auto* metadata = getNestedArrayMetadata();
1716  if (metadata->value_type != PointInt32) {
1718  }
1719  } else {
1721  }
1722  }
1723  NestedArrayItem<2> item;
1724  Status status = getItem(index, item);
1725  if (status != Success) {
1726  return status;
1727  }
1728  is_null = item.is_null;  // report the null state through the output parameter
1729  if (is_null) { return Success; }
1730 
1731  if (item.nof_sizes != 1) {
1733  }
1734  const auto valuesize = getValueSize();
1735  const auto values_count = item.nof_values * valuesize / sizeof(CT);
1736  values.reserve(values_count);
1737  values.insert(values.end(),
1738  reinterpret_cast<CT*>(item.values),
1739  reinterpret_cast<CT*>(item.values) + values_count);
1740 
1741  sizes.reserve(item.sizes_lengths[0]);
1742  sizes.insert(sizes.end(),
1743  reinterpret_cast<sizes_t*>(item.sizes_buffers[0]),
1744  reinterpret_cast<sizes_t*>(item.sizes_buffers[0]) + item.sizes_lengths[0]);
1745  return Success;
1746  }
1747 
1748  template <typename CT>
1749  Status getItem(const int64_t index,
1750  std::vector<CT>& values,
1751  std::vector<int32_t>& sizes,
1752  std::vector<int32_t>& sizes_of_sizes,
1753  bool& is_null) {
1754  if constexpr (!std::is_same<CT, uint8_t>::value) {
1755  if constexpr (std::is_same<CT, double>::value) {
1756  const auto* metadata = getNestedArrayMetadata();
1757  if (metadata->value_type != PointFloat64) {
1759  }
1760  } else if constexpr (std::is_same<CT, int32_t>::value) {
1761  const auto* metadata = getNestedArrayMetadata();
1762  if (metadata->value_type != PointInt32) {
1764  }
1765  } else {
1767  }
1768  }
1769  NestedArrayItem<3> item;
1770  Status status = getItem(index, item);
1771  if (status != Success) {
1772  return status;
1773  }
1774  is_null = item.is_null;  // report the null state through the output parameter
1775  if (is_null) { return Success; }
1776 
1777  if (item.nof_sizes != 2) {
1779  }
1780  const auto valuesize = getValueSize();
1781  const auto values_count = item.nof_values * valuesize / sizeof(CT);
1782  values.reserve(values_count);
1783  values.insert(values.end(),
1784  reinterpret_cast<CT*>(item.values),
1785  reinterpret_cast<CT*>(item.values) + values_count);
1786 
1787  sizes.reserve(item.sizes_lengths[1]);
1788  sizes.insert(sizes.end(),
1789  reinterpret_cast<sizes_t*>(item.sizes_buffers[1]),
1790  reinterpret_cast<sizes_t*>(item.sizes_buffers[1]) + item.sizes_lengths[1]);
1791 
1792  sizes_of_sizes.reserve(item.sizes_lengths[0]);
1793  sizes_of_sizes.insert(sizes_of_sizes.end(),
1794  reinterpret_cast<sizes_t*>(item.sizes_buffers[0]),
1795  reinterpret_cast<sizes_t*>(item.sizes_buffers[0]) + item.sizes_lengths[0]);
1796  return Success;
1797 
1798  }
1799 
1800  Status setItem(const int64_t index, const std::string& s) {
1801  const auto* metadata = getNestedArrayMetadata();
1802  if (metadata->value_type != Int8) {
1804  }
1805  return setItem<1, false>(index, reinterpret_cast<const int8_t*>(s.data()), s.size());
1806  }
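      //
      // Usage sketch (illustrative, not from the original source). It assumes
      // a hypothetical instance `fb` of this class that has already been
      // initialized elsewhere in NestedArray format with ndims == 1,
      // value_type == Int8, and enough room in the values buffer:
      //
      //   fb.setItem(0, std::string("abc"));
      //   std::string s;
      //   bool is_null = true;
      //   auto status = fb.getItem(0, s, is_null);
      //   // When status == Success and is_null == false, s holds "abc".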
1807 
1808  template <typename CT, size_t NDIM=0>
1809  Status setItem(const int64_t index, const std::vector<CT>& arr) {
1810  if (getNDims() != NDIM + 1) {
1812  }
1813  if constexpr (!std::is_same<CT, uint8_t>::value) {
1814  if constexpr (std::is_same<CT, double>::value) {
1815  const auto* metadata = getNestedArrayMetadata();
1816  if (metadata->value_type != PointFloat64) {
1818  }
1819  } else if constexpr (std::is_same<CT, int32_t>::value) {
1820  const auto* metadata = getNestedArrayMetadata();
1821  if (metadata->value_type != PointInt32) {
1823  }
1824  } else {
1826  }
1827  }
1828  const auto valuesize = getValueSize();
1829  auto sz = (arr.size() * sizeof(CT)) / valuesize;
1830  const int32_t* const sizes_buffers[1] = {nullptr};
1831  int32_t sizes_lengths[1] = {0};
1832  return setItemWorker<1, false>(index,
1833  reinterpret_cast<const int8_t*>(arr.data()),
1834  sz,
1835  sizes_buffers,
1836  sizes_lengths,
1837  0);
1838  }
1839 
1840  template <typename CT, size_t NDIM=1, bool check_sizes=false>
1841  Status setItem(const int64_t index, const std::vector<std::vector<CT>>& item) {
1842  const auto valuesize = getValueSize();
1843  std::vector<int32_t> sizes;
1844  sizes.reserve(item.size());
1845  int32_t nof_values = 0;
1846  size_t nof_elements = 0;
1847  for (const auto& subitem: item) {
1848  const auto sz = (subitem.size() * sizeof(CT)) / valuesize;
1849  sizes.push_back(sz);
1850  nof_values += sz;
1851  nof_elements += subitem.size();
1852  }
1853  std::vector<CT> flatitem;
1854  flatitem.reserve(nof_elements);
1855  for (const auto& subitem: item) {
1856  flatitem.insert(flatitem.end(), subitem.begin(), subitem.end());
1857  }
1858  return setItem<CT, NDIM, check_sizes>(index, flatitem, sizes);
1859  }
1860 
1861  template <typename CT, size_t NDIM=2, bool check_sizes=false>
1862  Status setItem(const int64_t index,
1863  const std::vector<std::vector<std::vector<CT>>>& item) {
1864  const auto valuesize = getValueSize();
1865  std::vector<int32_t> sizes_of_sizes;
1866  std::vector<int32_t> sizes;
1867  std::vector<CT> flatitem;
1868  sizes_of_sizes.reserve(item.size());
1869  size_t nof_sizes_of_sizes = 0;
1870  for (const auto& subitem: item) {
1871  sizes_of_sizes.push_back(subitem.size());
1872  nof_sizes_of_sizes += subitem.size();
1873  }
1874  sizes.reserve(nof_sizes_of_sizes);
1875  int32_t nof_values = 0;
1876  size_t nof_elements = 0;
1877  for (const auto& subitem: item) {
1878  for (const auto& subitem1: subitem) {
1879  const auto sz = (subitem1.size() * sizeof(CT)) / valuesize;
1880  sizes.push_back(sz);
1881  nof_values += sz;
1882  nof_elements += subitem1.size();
1883  }
1884  }
1885  flatitem.reserve(nof_elements);
1886  for (const auto& subitem: item) {
1887  for (const auto& subitem1: subitem) {
1888  flatitem.insert(flatitem.end(), subitem1.begin(), subitem1.end());
1889  }
1890  }
1891  return setItem<CT, NDIM, check_sizes>(index, flatitem, sizes, sizes_of_sizes);
1892  }
1893 
1894  template <typename CT, size_t NDIM=1, bool check_sizes=true>
1895  Status setItem(const int64_t index,
1896  const std::vector<CT>& values,
1897  const std::vector<int32_t>& sizes) {
1898  if (getNDims() != NDIM + 1) {
1900  }
1901  const auto* metadata = getNestedArrayMetadata();
1902  if constexpr (!std::is_same<CT, uint8_t>::value) {
1903  if constexpr (std::is_same<CT, double>::value) {
1904  if (metadata->value_type != PointFloat64) {
1906  }
1907  } else if constexpr (std::is_same<CT, int32_t>::value) {
1908  if (metadata->value_type != PointInt32) {
1910  }
1911  } else {
1913  }
1914  }
1915  const auto valuesize = getValueSize();
1916  const int32_t nof_values = (values.size() * sizeof(CT)) / valuesize;
1917  return setItem<NDIM, check_sizes>(index,
1918  reinterpret_cast<const int8_t*>(values.data()),
1919  nof_values,
1920  sizes.data(),
1921  sizes.size());
1922  }
1923 
1924  template <typename CT, size_t NDIM=2, bool check_sizes=true>
1925  Status setItem(const int64_t index,
1926  const std::vector<CT>& values,
1927  const std::vector<int32_t>& sizes,
1928  const std::vector<int32_t>& sizes_of_sizes) {
1929  if (getNDims() != NDIM + 1) {
1931  }
1932  const auto* metadata = getNestedArrayMetadata();
1933  if constexpr (!std::is_same<CT, uint8_t>::value) {
1934  if constexpr (std::is_same<CT, double>::value) {
1935  if (metadata->value_type != PointFloat64) {
1937  }
1938  } else if constexpr (std::is_same<CT, int32_t>::value) {
1939  if (metadata->value_type != PointInt32) {
1941  }
1942  } else {
1944  }
1945  }
1946  const auto valuesize = getValueSize();
1947  const auto nof_values = (values.size() * sizeof(CT)) / valuesize;
1948  return setItem<NDIM, check_sizes>(index,
1949  reinterpret_cast<const int8_t*>(values.data()),
1950  nof_values,
1951  sizes.data(),
1952  sizes.size(),
1953  sizes_of_sizes.data(),
1954  sizes_of_sizes.size()
1955  );
1956  }
1957 
1958  // Set a new item with index and size (in bytes) and initialize its
1959  // elements from source buffer. The item values will be
1960  // uninitialized when source buffer is nullptr. If dest != nullptr
1961  // then the item's buffer pointer will be stored in *dest.
1962  // To be deprecated in favor of NestedArray format
1963  Status setItemOld(const int64_t index,
1964  const int8_t* src,
1965  const int64_t size,
1966  int8_t** dest = nullptr) {
1967  if (index < 0 || index >= itemsCount()) {
1969  }
1970  switch (format()) {
1971  case GeoPointFormatId: {
1972  const int64_t itemsize = dtypeSize();
1973  if (size != itemsize) {
1975  }
1976  break;
1977  }
1978  default:
1980  }
1981  return setItemNoValidation(index, src, size, dest);
1982  }
1983 
1984  // Same as setItemOld but performs no input validation
1985  // To be deprecated in favor of NestedArray format
1986  Status setItemNoValidation(const int64_t index,
1987  const int8_t* src,
1988  const int64_t size,
1989  int8_t** dest) {
1990  switch (format()) {
1991  case GeoPointFormatId: {
1992  int8_t* values = get_values();
1993  const int64_t itemsize = dtypeSize();
1994  const int64_t csize = index * itemsize;
1995  if (src != nullptr && memcpy(values + csize, src, size) == nullptr) {
1997  }
1998  if (dest != nullptr) {
1999  *dest = values + csize;
2000  }
2001  break;
2002  }
2003  default:
2005  }
2006 
2007  return Success;
2008  }
2009 
2010  // Set a new item with index and size but without initializing item
2011  // elements. The buffer pointer of the new item will be stored in
2012  // *dest if dest != nullptr. Inputs are not validated!
2013  Status setEmptyItemNoValidation(int64_t index, int64_t size, int8_t** dest) {
2014  return setItemNoValidation(index, nullptr, size, dest);
2015  }
2016 
2017  HOST DEVICE bool isSpecified(int64_t index) const {
2018  if (index < 0 || index >= itemsCount()) {
2019  return false;
2020  }
2021  if (isNestedArray()) {
2022  auto* storage_indices = get_storage_indices();
2023  return storage_indices[index] >= 0;
2024  }
2025  return false;
2026  }
2027 
2028  // Set item with index as a null item
2029  HOST DEVICE Status setNull(int64_t index) {
2030  if (index < 0 || index >= itemsCount()) {
2032  }
2033  if (isNestedArray()) {
2034  auto* storage_indices = get_storage_indices();
2035  if (storage_indices[index] >= 0) {
2037  }
2038  auto* worker = getNestedArrayWorker();
2039  const auto storage_index = worker->specified_items_count;
2040  worker->specified_items_count++;
2041  storage_indices[index] = storage_index;
2042  const size_t ndims = getNDims();
2043  auto* sizes_buffer = get_sizes_buffer();
2044  auto* values_offsets = get_values_offsets();
2045  auto* sizes_offsets = get_sizes_offsets();
2046  const auto values_offset = values_offsets[storage_index];
2047  const auto sizes_offset = sizes_offsets[storage_index * ndims];
2048  sizes_buffer[sizes_offset] = 0;
2049  for (size_t i = 0; i < ndims; i++) {
2050  sizes_offsets[storage_index * ndims + i + 1] = sizes_offset + 1;
2051  }
2052  values_offsets[storage_index] = -(values_offset + 1);
2053  values_offsets[storage_index + 1] = values_offset;
2054  return Success;
2055  }
2056  // To be deprecated in favor of NestedArray format:
2057  if (index < 0 || index >= itemsCount()) {
2059  }
2060 
2061  switch (format()) {
2062  case GeoPointFormatId: {
2063  return setNullNoValidation(index);
2064  }
2065  default:
2066  break;
2067  }
2069  }
2070 
2071  // Same as setNull but performs no input validation
2072  // To be deprecated in favor of NestedArray format
2073  HOST DEVICE Status setNullNoValidation(int64_t index) {
2074  switch (format()) {
2075  case GeoPointFormatId: {
2076  int8_t* values = get_values();
2077  int64_t itemsize = dtypeSize();
2078  const auto* metadata = getGeoPointMetadata();
2079  if (metadata->is_geoint) {
2080  // per Geospatial/CompressionRuntime.h:is_null_point_longitude_geoint32
2081  *reinterpret_cast<uint32_t*>(values + index * itemsize) = 0x80000000U;
2082  *reinterpret_cast<uint32_t*>(values + index * itemsize + sizeof(int32_t)) =
2083  0x80000000U;
2084  } else {
2085  // per Shared/InlineNullValues.h:NULL_ARRAY_DOUBLE
2086  *reinterpret_cast<double*>(values + index * itemsize) = 2 * DBL_MIN;
2087  *reinterpret_cast<double*>(values + index * itemsize + sizeof(double)) =
2088  2 * DBL_MIN;
2089  }
2090  break;
2091  }
2092  default:
2094  }
2095  return Success;
2096  }
2097 
2098  // Check if the item is unspecified or null.
2099  HOST DEVICE Status isNull(int64_t index, bool& is_null) const {
2100  if (index < 0 || index >= itemsCount()) {
2102  }
2103  if (isNestedArray()) {
2104  const auto storage_index = get_storage_index(index);
2105  const auto* values_offsets = get_values_offsets();
2106  const auto values_offset = values_offsets[storage_index];
2107  is_null = values_offset < 0;
2108  return Success;
2109  }
2110  if (index < 0 || index >= itemsCount()) {
2112  }
2113  // To be deprecated in favor of NestedArray format:
2114  switch (format()) {
2115  case GeoPointFormatId: {
2116  const int8_t* values = get_values();
2117  const auto* metadata = getGeoPointMetadata();
2118  int64_t itemsize = dtypeSize();
2119  if (metadata->is_geoint) {
2120  // per Geospatial/CompressionRuntime.h:is_null_point_longitude_geoint32
2121  is_null = (*reinterpret_cast<const uint32_t*>(values + index * itemsize)) ==
2122  0x80000000U;
2123  } else {
2124  // per Shared/InlineNullValues.h:NULL_ARRAY_DOUBLE
2125  is_null = (*reinterpret_cast<const double*>(values + index * itemsize)) ==
2126  2 * DBL_MIN;
2127  }
2128  return Success;
2129  }
2130  default:
2131  break;
2132  }
2134  }
2135 
2136  // Get item at index by storing its size (in bytes), values buffer,
2137  // and nullity information to the corresponding pointer
2138  // arguments.
2139  // To be deprecated in favor of NestedArray format
2140  HOST DEVICE Status getItemOld(int64_t index, int64_t& size, int8_t*& dest, bool& is_null) {
2141  if (index < 0 || index >= itemsCount()) {
2143  }
2144  switch (format()) {
2145  case GeoPointFormatId: {
2146  int8_t* values = get_values();
2147  int64_t itemsize = dtypeSize();
2148  size = itemsize;
2149  dest = values + index * itemsize;
2150  is_null = false;
2151  return Success;
2152  }
2153  default:
2154  break;
2155  }
2157  }
2158 
2159  // To be deprecated in favor of NestedArray format
2160  HOST DEVICE Status getItemOld(int64_t index, size_t& size, int8_t*& dest, bool& is_null) {
2161  int64_t sz{0};
2162  Status status = getItemOld(index, sz, dest, is_null);
2163  size = sz;
2164  return status;
2165  }
2166 
2167 #ifdef HAVE_TOSTRING
2168 #define HAVE_FLATBUFFER_TOSTRING
2169  std::string bufferToString(const int8_t* buffer,
2170  const size_t size,
2171  ValueType value_type) const {
2172  size_t value_size = get_size(value_type);
2173  size_t count = size / value_size;
2174  std::string result = "";
2175  for (size_t i = 0; i < count; i++) {
2176  if (i > 0) {
2177  result += ", ";
2178  }
2179  switch (value_type) {
2180  case Bool8:
2181  result += (buffer[i] ? "true" : "false");
2182  break;
2183  case Int8:
2184  result += std::to_string(buffer[i]);
2185  break;
2186  case Int16:
2187  result += std::to_string(reinterpret_cast<const int16_t*>(buffer)[i]);
2188  break;
2189  case Int32:
2190  result += std::to_string(reinterpret_cast<const int32_t*>(buffer)[i]);
2191  break;
2192  case Int64:
2193  result += std::to_string(reinterpret_cast<const int64_t*>(buffer)[i]);
2194  break;
2195  case UInt8:
2196  result += std::to_string(reinterpret_cast<const uint8_t*>(buffer)[i]);
2197  break;
2198  case UInt16:
2199  result += std::to_string(reinterpret_cast<const uint16_t*>(buffer)[i]);
2200  break;
2201  case UInt32:
2202  result += std::to_string(reinterpret_cast<const uint32_t*>(buffer)[i]);
2203  break;
2204  case UInt64:
2205  result += std::to_string(reinterpret_cast<const uint64_t*>(buffer)[i]);
2206  break;
2207  case Float32:
2208  result += std::to_string(reinterpret_cast<const float*>(buffer)[i]);
2209  break;
2210  case Float64:
2211  result += std::to_string(reinterpret_cast<const double*>(buffer)[i]);
2212  break;
2213  case PointInt32:
2214  result += "(";
2215  if (containsNullValue(buffer + 2 * i * sizeof(int32_t))) {
2216  result += "NULL";
2217  } else {
2218  result += std::to_string(reinterpret_cast<const int32_t*>(buffer)[2 * i]);
2219  result += ", ";
2220  result += std::to_string(reinterpret_cast<const int32_t*>(buffer)[2 * i + 1]);
2221  }
2222  result += ")";
2223  break;
2224  case PointFloat64:
2225  result += "(";
2226  if (containsNullValue(buffer + 2 * i * sizeof(double))) {
2227  result += "NULL";
2228  } else {
2229  result += std::to_string(reinterpret_cast<const double*>(buffer)[2 * i]);
2230  result += ", ";
2231  result += std::to_string(reinterpret_cast<const double*>(buffer)[2 * i + 1]);
2232  }
2233  result += ")";
2234  break;
2235  }
2236  }
2237  return result;
2238  }
2239 
2240  std::string toString() const {
2241  if (buffer == nullptr) {
2242  return ::typeName(this) + "[UNINITIALIZED]";
2243  }
2244  std::string result = typeName(this) + "@" + ::toString((void*)buffer) + "(";
2245  result += "" + getBaseWorker()->toString();
2246 
2247  if (isNestedArray()) {
2248  const auto* metadata = getNestedArrayMetadata();
2249  const auto* worker = getNestedArrayWorker();
2250  result += ",\n " + metadata->toString();
2251  result += ",\n " + worker->toString();
2252  result += ",\n values_buffer=[" +
2253  bufferToString(
2254  get_values_buffer(), getValuesBufferSize(), metadata->value_type) +
2255  "]";
2256  result += ",\n sizes_buffer=[" +
2257  bufferToString(
2258  reinterpret_cast<const int8_t*>(get_sizes_buffer()),
2259  metadata->total_sizes_count * get_size(FLATBUFFER_SIZES_T_VALUE_TYPE),
2260  FLATBUFFER_SIZES_T_VALUE_TYPE) +
2261  "]";
2262  result += ",\n values_offsets=[" +
2263  bufferToString(reinterpret_cast<const int8_t*>(get_values_offsets()),
2264  (metadata->total_items_count + 1) *
2265  get_size(FLATBUFFER_OFFSETS_T_VALUE_TYPE),
2266  FLATBUFFER_OFFSETS_T_VALUE_TYPE) +
2267  "]";
2268  result += ",\n sizes_offsets=[" +
2269  bufferToString(reinterpret_cast<const int8_t*>(get_sizes_offsets()),
2270  (metadata->total_items_count * metadata->ndims + 1) *
2271  get_size(FLATBUFFER_OFFSETS_T_VALUE_TYPE),
2272  FLATBUFFER_OFFSETS_T_VALUE_TYPE) +
2273  "]";
2274  result += ",\n storage_indices=[" +
2275  bufferToString(
2276  reinterpret_cast<const int8_t*>(get_storage_indices()),
2277  metadata->total_items_count * get_size(FLATBUFFER_SIZES_T_VALUE_TYPE),
2278  FLATBUFFER_SIZES_T_VALUE_TYPE) +
2279  "]";
2280  result += ",\n user_data_buffer=[" +
2281  bufferToString(get_user_data_buffer(), metadata->user_data_size, Int8) +
2282  "]";
2283  result += ")";
2284  return result;
2285  }
2286 
2287  // To be deprecated in favor of NestedArray format:
2288  const FlatBufferFormat fmt = format();
2289 
2290  std::cout << "fmt=" << static_cast<int64_t>(fmt) << ", " << sizeof(fmt) << std::endl;
2291  switch (fmt) {
2292  case GeoPointFormatId: {
2293  result += ", " + getGeoPointMetadata()->toString();
2294  result += ", " + getGeoPointWorker()->toString();
2295  break;
2296  }
2297  default:
2298  break;
2299  }
2300 
2301  switch (fmt) {
2302  case GeoPointFormatId: {
2303  const auto* metadata = getGeoPointMetadata();
2304  result += ", point data=";
2305  int64_t numitems = itemsCount();
2306  if (metadata->is_geoint) {
2307  const int32_t* values_buf = reinterpret_cast<const int32_t*>(get_values());
2308  std::vector<int32_t> values(values_buf, values_buf + numitems * 2);
2309  result += ::toString(values);
2310  } else {
2311  const double* values_buf = reinterpret_cast<const double*>(get_values());
2312  std::vector<double> values(values_buf, values_buf + numitems * 2);
2313  result += ::toString(values);
2314  }
2315  return result + ")";
2316  }
2317  default:
2318  break;
2319  }
2320  return ::typeName(this) + "[UNKNOWN FORMAT]";
2321  }
2322 #endif
2323 };
2324 
2325 #ifdef HAVE_TOSTRING
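2326 inline std::ostream& operator<<(std::ostream& os,
2327  FlatBufferManager::ValueType const type) {
2328  switch (type) {
2329  case FlatBufferManager::Bool8:
2330  os << "Bool8";
2331  break;
2332  case FlatBufferManager::Int8:
2333  os << "Int8";
2334  break;
2335  case FlatBufferManager::Int16:
2336  os << "Int16";
2337  break;
2338  case FlatBufferManager::Int32:
2339  os << "Int32";
2340  break;
2341  case FlatBufferManager::Int64:
2342  os << "Int64";
2343  break;
2344  case FlatBufferManager::UInt8:
2345  os << "UInt8";
2346  break;
2347  case FlatBufferManager::UInt16:
2348  os << "UInt16";
2349  break;
2350  case FlatBufferManager::UInt32:
2351  os << "UInt32";
2352  break;
2353  case FlatBufferManager::UInt64:
2354  os << "UInt64";
2355  break;
2356  case FlatBufferManager::Float32:
2357  os << "Float32";
2358  break;
2359  case FlatBufferManager::Float64:
2360  os << "Float64";
2361  break;
2362  case FlatBufferManager::PointInt32:
2363  os << "PointInt32";
2364  break;
2365  case FlatBufferManager::PointFloat64:
2366  os << "PointFloat64";
2367  break;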
2368  }
2369  return os;
2370 }
2371 
2372 inline std::string FlatBufferManager::toString(const FlatBufferManager::ValueType& type) {
2373  std::ostringstream ss;
2374  ss << type;
2375  return ss.str();
2376 }
2377 
2378 inline std::string toString(const FlatBufferManager::ValueType& type) {
2379  std::ostringstream ss;
2380  ss << type;
2381  return ss.str();
2382 }
2383 
2384 inline std::ostream& operator<<(std::ostream& os,
2385  FlatBufferManager::Status const status) {
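2386  switch (status) {
2387  case FlatBufferManager::Success:
2388  os << "Success";
2389  break;
2390  case FlatBufferManager::IndexError:
2391  os << "IndexError";
2392  break;
2393  case FlatBufferManager::SubIndexError:
2394  os << "SubIndexError";
2395  break;
2396  case FlatBufferManager::SizeError:
2397  os << "SizeError";
2398  break;
2399  case FlatBufferManager::FlatbufferSizeError:
2400  os << "FlatbufferSizeError";
2401  break;
2402  case FlatBufferManager::ItemAlreadySpecifiedError:
2403  os << "ItemAlreadySpecifiedError";
2404  break;
2405  case FlatBufferManager::ItemUnspecifiedError:
2406  os << "ItemUnspecifiedError";
2407  break;
2408  case FlatBufferManager::UnexpectedNullItemError:
2409  os << "UnexpectedNullItemError";
2410  break;
2411  case FlatBufferManager::ValuesBufferTooSmallError:
2412  os << "ValuesBufferTooSmallError";
2413  break;
2414  case FlatBufferManager::SizesBufferTooSmallError:
2415  os << "SizesBufferTooSmallError";
2416  break;
2417  case FlatBufferManager::CompressedIndices2BufferTooSmallError:
2418  os << "CompressedIndices2BufferTooSmallError";
2419  break;
2420  case FlatBufferManager::MemoryError:
2421  os << "MemoryError";
2422  break;
2423  case FlatBufferManager::UnknownFormatError:
2424  os << "UnknownFormatError";
2425  break;
2426  case FlatBufferManager::NotSupportedFormatError:
2427  os << "NotSupportedFormatError";
2428  break;
2429  case FlatBufferManager::NotImplementedError:
2430  os << "NotImplementedError";
2431  break;
2432  case FlatBufferManager::InvalidUserDataError:
2433  os << "InvalidUserDataError";
2434  break;
2435  case FlatBufferManager::DimensionalityError:
2436  os << "DimensionalityError";
2437  break;
2438  case FlatBufferManager::UserDataError:
2439  os << "UserDataError";
2440  break;
2441  case FlatBufferManager::TypeError:
2442  os << "TypeError";
2443  break;
2444  case FlatBufferManager::InconsistentSizesError:
2445  os << "InconsistentSizesError";
2446  break;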
2447  default:
2448  os << "[Unknown FlatBufferManager::Status value]";
2449  }
2450  return os;
2451 }
2452 
2453 inline std::string toString(const FlatBufferManager::Status& status) {
2454  std::ostringstream ss;
2455  ss << status;
2456  return ss.str();
2457 }
2458 #endif
2459 
2460 #undef RETURN_ERROR
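
The NestedArray branches of setNull, isNull and isSpecified above rely on two conventions: a negative storage index means "unspecified", and a negative entry in values_offsets means "null". The following is a minimal sketch of that bookkeeping only. NullEncodingSketch, its setItem, decode, startOffset and nullEncodingSelfCheck are hypothetical names invented for illustration; they are not part of FlatBufferManager, and the sizes buffers are left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified model of the NestedArray null bookkeeping.
// It is NOT the FlatBufferManager API and it ignores sizes_buffer and
// sizes_offsets; only values_offsets and storage_indices are modelled.
struct NullEncodingSketch {
  std::vector<int64_t> values_offsets;   // items + 1 entries
  std::vector<int64_t> storage_indices;  // -1 means "unspecified"
  int64_t specified_items_count = 0;

  explicit NullEncodingSketch(std::size_t items)
      : values_offsets(items + 1, 0), storage_indices(items, -1) {}

  // Undo the null marking: a stored value v < 0 encodes offset -(v + 1).
  static int64_t decode(int64_t v) { return v < 0 ? -(v + 1) : v; }

  bool isSpecified(int64_t index) const { return storage_indices[index] >= 0; }

  // Assumed (simplified) non-null insertion: the item takes the next
  // storage slot and occupies nof_values values.
  void setItem(int64_t index, int64_t nof_values) {
    const int64_t s = specified_items_count++;
    storage_indices[index] = s;
    values_offsets[s + 1] = values_offsets[s] + nof_values;
  }

  // Mirrors the offset handling in setNull: the start offset is stored as
  // -(offset + 1) and the next slot starts at the same offset, so a null
  // item consumes no values.
  void setNull(int64_t index) {
    const int64_t s = specified_items_count++;
    storage_indices[index] = s;
    const int64_t offset = values_offsets[s];
    values_offsets[s] = -(offset + 1);
    values_offsets[s + 1] = offset;
  }

  // Mirrors isNull: a negative offset marks a null item.
  bool isNull(int64_t index) const {
    return values_offsets[storage_indices[index]] < 0;
  }

  int64_t startOffset(int64_t index) const {
    return decode(values_offsets[storage_indices[index]]);
  }
};

// Small scenario: item 0 holds 4 values, item 2 is null, item 1 holds 2.
inline bool nullEncodingSelfCheck() {
  NullEncodingSketch fb(3);
  fb.setItem(0, 4);
  fb.setNull(2);
  fb.setItem(1, 2);
  return fb.isSpecified(0) && fb.isSpecified(1) && fb.isSpecified(2) &&
         !fb.isNull(0) && fb.isNull(2) && !fb.isNull(1) &&
         fb.startOffset(2) == 4 && fb.startOffset(1) == 4 &&
         fb.values_offsets[3] == 6;
}
```

Storing -(offset + 1) rather than -offset keeps offset 0 distinguishable, since negating 0 would not produce a negative value. Items are stored in insertion order, which is why storage_indices is needed to map an item index to its storage slot.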