OmniSciDB  04ee39c94c
PopulateTableRandom.h File Reference

Populate a table with random data.

#include <cstdlib>
#include <string>
#include <vector>
#include "../Catalog/Catalog.h"

Functions

std::vector< size_t > populate_table_random (const std::string &table_name, const size_t num_rows, const Catalog_Namespace::Catalog &cat)
 

Detailed Description

Populate a table with random data.

Author
Wei Hong wei@map-d.com
Copyright (c) 2014 MapD Technologies, Inc. All rights reserved.

Definition in file PopulateTableRandom.h.

Function Documentation

◆ populate_table_random()

std::vector<size_t> populate_table_random ( const std::string &  table_name,
const size_t  num_rows,
const Catalog_Namespace::Catalog &  cat 
)
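
The returned vector holds one hash per column of the generated data, as the comment in the definition notes. How random_fill() derives each hash is not shown on this page, so the following is only a minimal sketch under stated assumptions: `combine`, `fill_and_hash`, `hash_column`, and `columns_match` are hypothetical helpers, not OmniSciDB functions, and the hash-combine step is a common idiom chosen for illustration. It shows how a caller might recompute hashes over data read back later and compare them with the returned ones.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <random>
#include <vector>

// Hypothetical hash-combine step; an assumption, not the logic used by random_fill().
size_t combine(size_t seed, int32_t value) {
  return seed ^ (std::hash<int32_t>{}(value) + 0x9e3779b9 + (seed << 6) + (seed >> 2));
}

// Hypothetical helper: fills a column with pseudo-random 32-bit integers
// and returns a hash of the generated values.
size_t fill_and_hash(std::vector<int32_t>& column, uint32_t seed) {
  std::mt19937 gen(seed);
  std::uniform_int_distribution<int32_t> dist;
  size_t hash = 0;
  for (auto& value : column) {
    value = dist(gen);
    hash = combine(hash, value);
  }
  return hash;
}

// Recomputes the hash of an already filled column with the same combine step.
size_t hash_column(const std::vector<int32_t>& column) {
  size_t hash = 0;
  for (const auto value : column) {
    hash = combine(hash, value);
  }
  return hash;
}

// Returns true when every column still hashes to the value recorded at fill time.
bool columns_match(const std::vector<size_t>& expected,
                   const std::vector<std::vector<int32_t>>& columns) {
  assert(expected.size() == columns.size());
  for (size_t i = 0; i < columns.size(); ++i) {
    if (hash_column(columns[i]) != expected[i]) {
      return false;
    }
  }
  return true;
}
```

Seeding the generator makes the fill reproducible in this sketch; whether random_fill() is seeded the same way is not stated here.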

Definition at line 279 of file PopulateTableRandom.cpp.

References CHECK, Fragmenter_Namespace::InsertData::columnIds, Fragmenter_Namespace::InsertData::data, Fragmenter_Namespace::InsertData::databaseId, Catalog_Namespace::DBMetadata::dbId, measure< TimeT >::execution(), TableDescriptor::fragmenter, Catalog_Namespace::Catalog::getAllColumnMetadataForTable(), Catalog_Namespace::Catalog::getCurrentDB(), Catalog_Namespace::Catalog::getMetadataForTable(), Fragmenter_Namespace::AbstractFragmenter::insertData(), kENCODING_NONE, num_rows, Fragmenter_Namespace::InsertData::numRows, random_fill(), DataBlockPtr::stringsPtr, TableDescriptor::tableId, and Fragmenter_Namespace::InsertData::tableId.

Referenced by anonymous_namespace{StoragePerfTest.cpp}::load_data_for_thread_test_2(), anonymous_namespace{StoragePerfTest.cpp}::load_data_test(), anonymous_namespace{StorageTest.cpp}::simple_thread_wrapper(), and anonymous_namespace{StorageTest.cpp}::storage_test().

281  {
282    const TableDescriptor* td = cat.getMetadataForTable(table_name);
283    const auto cds = cat.getAllColumnMetadataForTable(td->tableId, false, false, false);
284    InsertData insert_data;
285    insert_data.databaseId = cat.getCurrentDB().dbId;
286    insert_data.tableId = td->tableId;
287    for (const auto& cd : cds) {
288      insert_data.columnIds.push_back(cd->columnId);
289    }
290    insert_data.numRows = num_rows;
291    std::vector<std::vector<int8_t>> numbers_vec;
292    std::vector<std::unique_ptr<std::vector<std::string>>> strings_vec;
293 
294    DataBlockPtr p{0};
295    // now allocate space for insert data
296    for (auto cd : cds) {
297      if (cd->columnType.is_varlen()) {
298        if (cd->columnType.get_compression() == kENCODING_NONE) {
299          strings_vec.push_back(std::make_unique<std::vector<std::string>>(num_rows));
300          p.stringsPtr = strings_vec.back().get();
301        } else {
302          CHECK(false);
303        }
304      } else {
305        numbers_vec.emplace_back(num_rows * cd->columnType.get_logical_size());
306        p.numbersPtr = numbers_vec.back().data();
307      }
308      insert_data.data.push_back(p);
309    }
310 
311    // fill InsertData with random data
312    std::vector<size_t> col_hashs(
313        cds.size());  // compute one hash per column for the generated data
314    int i = 0;
315    size_t data_volumn = 0;
316    for (auto cd : cds) {
317      col_hashs[i] = random_fill(cd, insert_data.data[i], num_rows, data_volumn);
318      i++;
319    }
320 
321    // now load the data into table
322    auto ms = measure<>::execution([&]() { td->fragmenter->insertData(insert_data); });
323    std::cout << "Loaded " << num_rows << " rows " << data_volumn << " bytes in " << ms
324              << " ms. at " << (double)data_volumn / (ms / 1000.0) / 1e6 << " MB/sec."
325              << std::endl;
326 
327    return col_hashs;
328  }
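
The allocation loop in the listing gives each fixed-width column a byte buffer of num_rows times the column's logical size, and each unencoded variable-length column a vector of num_rows strings. The sketch below isolates that sizing rule so it can be checked without a catalog. `ColumnSpec`, `ColumnBuffers`, and `allocate_buffers` are hypothetical stand-ins, not OmniSciDB types, and the encoded variable-length case (which the listing rejects with CHECK(false)) is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for the parts of ColumnDescriptor used by the loop.
struct ColumnSpec {
  bool is_varlen;       // true for variable-length (string) columns
  size_t logical_size;  // bytes per value, used for fixed-width columns only
};

// Owns the storage; mirrors numbers_vec and strings_vec from the listing.
struct ColumnBuffers {
  std::vector<std::vector<int8_t>> numbers;
  std::vector<std::unique_ptr<std::vector<std::string>>> strings;
};

// Allocates one buffer per column, sized for num_rows values.
ColumnBuffers allocate_buffers(const std::vector<ColumnSpec>& cols, size_t num_rows) {
  ColumnBuffers buffers;
  for (const auto& col : cols) {
    if (col.is_varlen) {
      // One (initially empty) string per row.
      buffers.strings.push_back(std::make_unique<std::vector<std::string>>(num_rows));
    } else {
      assert(col.logical_size > 0);
      // Zero-initialized byte buffer: num_rows values of logical_size bytes each.
      buffers.numbers.emplace_back(num_rows * col.logical_size);
    }
  }
  return buffers;
}
```

For example, 10 rows of a 4-byte column need a 40-byte buffer. Holding each string vector through a std::unique_ptr keeps the raw pointer stored in the data block valid while the outer vector grows.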
std::vector< std::string > * stringsPtr
Definition: sqltypes.h:138

const TableDescriptor * getMetadataForTable(const std::string &tableName, const bool populateFragmenter=true) const
Returns a pointer to a const TableDescriptor struct matching the provided tableName.

size_t random_fill(const ColumnDescriptor *cd, DataBlockPtr p, size_t num_elems, size_t &data_volumn)

Fragmenter_Namespace::InsertData
The data to be inserted using the fragment manager.
Definition: Fragmenter.h:59

Fragmenter_Namespace::InsertData::databaseId
identifies the database into which the data is being inserted

int tableId
identifies the table into which the data is being inserted
Definition: Fragmenter.h:61

std::vector< int > columnIds
a vector of column ids for the row(s) being inserted
Definition: Fragmenter.h:62

size_t numRows
the number of rows being inserted
Definition: Fragmenter.h:63

std::vector< DataBlockPtr > data
Definition: Fragmenter.h:64

const DBMetadata & getCurrentDB() const
Definition: Catalog.h:176

std::list< const ColumnDescriptor * > getAllColumnMetadataForTable(const int tableId, const bool fetchSystemColumns, const bool fetchVirtualColumns, const bool fetchPhysicalColumns) const
Returns a list of pointers to constant ColumnDescriptor structs for all the columns from a particular...
Definition: Catalog.cpp:1579

#define CHECK(condition)
Definition: Logger.h:187

static TimeT::rep execution(F func, Args &&... args)
Definition: sample.cpp:29

TableDescriptor
specifies the content in-memory of a row in the table metadata table

Fragmenter_Namespace::AbstractFragmenter * fragmenter

virtual void insertData(InsertData &insertDataStruct)=0
Given data wrapped in an InsertData struct, inserts it into the correct partitions with locks and che...
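
The definition times the insertData() call with measure<>::execution() and reports throughput as bytes / (ms / 1000.0) / 1e6 MB/sec. The following is a minimal sketch of that arithmetic using std::chrono. `time_ms` and `throughput_mb_per_sec` are hypothetical names, not part of OmniSciDB, and the guard for a zero elapsed time is an addition of this sketch: without it, a load that finishes in under one millisecond divides by zero.

```cpp
#include <chrono>
#include <cstddef>
#include <utility>

// Hypothetical helper: runs func once and returns the elapsed wall time in milliseconds.
template <typename F>
long long time_ms(F&& func) {
  const auto start = std::chrono::steady_clock::now();
  std::forward<F>(func)();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}

// Converts a byte count and an elapsed time in milliseconds to MB/sec (1 MB = 1e6 bytes).
// Returns 0.0 when the elapsed time is not positive, instead of dividing by zero.
double throughput_mb_per_sec(size_t bytes, long long elapsed_ms) {
  if (elapsed_ms <= 0) {
    return 0.0;
  }
  return static_cast<double>(bytes) / (elapsed_ms / 1000.0) / 1e6;
}
```

As a worked case, 50,000,000 bytes loaded in 2000 ms gives 50,000,000 / 2.0 / 1e6 = 25 MB/sec.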