OmniSciDB  ba1bac9284
RelAlgDagBuilder Class Reference

#include <RelAlgDagBuilder.h>


Public Member Functions

 RelAlgDagBuilder ()=delete
 
 RelAlgDagBuilder (const std::string &query_ra, const Catalog_Namespace::Catalog &cat, const RenderInfo *render_info)
 
 RelAlgDagBuilder (RelAlgDagBuilder &root_dag_builder, const rapidjson::Value &query_ast, const Catalog_Namespace::Catalog &cat, const RenderInfo *render_opts)
 
void eachNode (std::function< void(RelAlgNode const *)> const &) const
 
const RelAlgNode & getRootNode () const
 
std::shared_ptr< const RelAlgNode > getRootNodeShPtr () const
 
void registerSubquery (std::shared_ptr< RexSubQuery > subquery)
 
const std::vector< std::shared_ptr< RexSubQuery > > & getSubqueries () const
 
void registerQueryHints (Hints *hints_delivered)
 
const RegisteredQueryHint getQueryHints () const
 
void resetQueryExecutionState ()
 

Private Member Functions

void build (const rapidjson::Value &query_ast, RelAlgDagBuilder &root_dag_builder)
 

Private Attributes

const Catalog_Namespace::Catalog & cat_
 
std::vector< std::shared_ptr< RelAlgNode > > nodes_
 
std::vector< std::shared_ptr< RexSubQuery > > subqueries_
 
const RenderInfo * render_info_
 
RegisteredQueryHint query_hint_
 

Detailed Description

Builder class that creates an in-memory, easy-to-navigate relational algebra DAG from the JSON representation produced by Calcite. It also applies high-level optimizations that can be expressed in relational algebra extended with RelCompound. A RelCompound node is an equivalent representation of a sequence of RelFilter, RelProject and RelAggregate nodes; this coalescing minimizes the number of intermediate buffers required to evaluate a query. Lower-level optimizations are handled by later stages, mainly RelAlgTranslator and the IR code generation.

Definition at line 1721 of file RelAlgDagBuilder.h.
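
The coalescing idea in the description can be illustrated with a minimal, self-contained sketch. Everything in it is an assumption made for illustration: the Kind enum and coalesce_linear_plan() are hypothetical stand-ins, not OmniSciDB types or functions. The sketch merges any run of two or more Filter/Project/Aggregate steps in a linear plan into one Compound step. The real pass is more selective; for example, the build() listing notes that a filter starting a left-deep join must not be coalesced.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for relational algebra node kinds (not OmniSciDB types).
enum class Kind { Scan, Filter, Project, Aggregate, Compound, Sort };

// Replaces each maximal run of two or more consecutive Filter/Project/Aggregate
// steps with a single Compound step. Runs of length one are left unchanged.
std::vector<Kind> coalesce_linear_plan(const std::vector<Kind>& plan) {
  auto coalescable = [](Kind k) {
    return k == Kind::Filter || k == Kind::Project || k == Kind::Aggregate;
  };
  std::vector<Kind> out;
  std::size_t i = 0;
  while (i < plan.size()) {
    if (!coalescable(plan[i])) {
      out.push_back(plan[i]);
      ++i;
      continue;
    }
    // Find the end of the current run of coalescable steps.
    std::size_t j = i;
    while (j < plan.size() && coalescable(plan[j])) {
      ++j;
    }
    out.push_back(j - i >= 2 ? Kind::Compound : plan[i]);
    i = j;
  }
  return out;
}
```

With this sketch, a plan of Scan, Filter, Project, Aggregate, Sort shrinks to Scan, Compound, Sort, so the intermediate results between the three merged steps never need their own buffers.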

Constructor & Destructor Documentation

RelAlgDagBuilder::RelAlgDagBuilder ( )
delete

RelAlgDagBuilder::RelAlgDagBuilder ( const std::string & query_ra,
const Catalog_Namespace::Catalog & cat,
const RenderInfo * render_info
)

Constructs a RelAlg DAG from a JSON representation.

Parameters
query_ra: A JSON string representation of an RA tree from Calcite.
cat: DB catalog for the current user.
render_info: Additional build options for render queries.

Definition at line 2607 of file RelAlgDagBuilder.cpp.

References build(), CHECK, logger::ERROR, LOG, RelAlgNode::resetRelAlgFirstId(), and VLOG.

2610  : cat_(cat), render_info_(render_info), query_hint_(RegisteredQueryHint::defaults()) {
2611  rapidjson::Document query_ast;
2612  query_ast.Parse(query_ra.c_str());
2613  VLOG(2) << "Parsing query RA JSON: " << query_ra;
2614  if (query_ast.HasParseError()) {
2615  query_ast.GetParseError();
2616  LOG(ERROR) << "Failed to parse RA tree from Calcite (offset "
2617  << query_ast.GetErrorOffset() << "):\n"
2618  << rapidjson::GetParseError_En(query_ast.GetParseError());
2619  VLOG(1) << "Failed to parse query RA: " << query_ra;
2620  throw std::runtime_error(
2621  "Failed to parse relational algebra tree. Possible query syntax error.");
2622  }
2623  CHECK(query_ast.IsObject());
2625  build(query_ast, *this);
2626 }

RelAlgDagBuilder::RelAlgDagBuilder ( RelAlgDagBuilder & root_dag_builder,
const rapidjson::Value & query_ast,
const Catalog_Namespace::Catalog & cat,
const RenderInfo * render_opts
)

Constructs a sub-DAG for any subqueries. Should only be called during DAG building.

Parameters
root_dag_builder: The root DAG builder. The root stores pointers to all subqueries.
query_ast: The current JSON node to build a DAG for.
cat: DB catalog for the current user.
render_opts: Additional build options for render queries.

Definition at line 2628 of file RelAlgDagBuilder.cpp.

References build().

2632  : cat_(cat), render_info_(render_info), query_hint_(RegisteredQueryHint::defaults()) {
2633  build(query_ast, root_dag_builder);
2634 }

Member Function Documentation

void RelAlgDagBuilder::build ( const rapidjson::Value & query_ast,
RelAlgDagBuilder & root_dag_builder
)
private

Definition at line 2636 of file RelAlgDagBuilder.cpp.

References anonymous_namespace{RelAlgDagBuilder.cpp}::add_window_function_pre_project(), alterRAForRender(), anonymous_namespace{RelAlgDagBuilder.cpp}::bind_inputs(), cat_, CHECK, anonymous_namespace{RelAlgDagBuilder.cpp}::coalesce_nodes(), anonymous_namespace{RelLeftDeepInnerJoin.cpp}::create_left_deep_join(), eliminate_dead_columns(), eliminate_dead_subqueries(), eliminate_identical_copy(), field(), fold_filters(), g_cluster, get_left_deep_join_root(), anonymous_namespace{RelAlgDagBuilder.cpp}::handleQueryHint(), hoist_filter_cond_to_cross_join(), anonymous_namespace{RelAlgDagBuilder.cpp}::mark_nops(), nodes_, render_info_, details::RelAlgDispatcher::run(), anonymous_namespace{RelAlgDagBuilder.cpp}::separate_window_function_expressions(), simplify_sort(), sink_projected_boolean_expr_to_join(), and subqueries_.

Referenced by RelAlgDagBuilder().

2637  {
2638  const auto& rels = field(query_ast, "rels");
2639  CHECK(rels.IsArray());
2640  try {
2641  nodes_ = details::RelAlgDispatcher(cat_).run(rels, lead_dag_builder);
2642  } catch (const QueryNotSupported&) {
2643  throw;
2644  }
2645  CHECK(!nodes_.empty());
2647 
2648  if (render_info_) {
2649  // Alter the RA for render. Do this before any flattening/optimizations are done to
2650  // the tree.
2652  }
2653 
2654  handleQueryHint(nodes_, this);
2655  mark_nops(nodes_);
2660  std::vector<const RelAlgNode*> filtered_left_deep_joins;
2661  std::vector<const RelAlgNode*> left_deep_joins;
2662  for (const auto& node : nodes_) {
2663  const auto left_deep_join_root = get_left_deep_join_root(node);
2664  // The filter which starts a left-deep join pattern must not be coalesced
2665  // since it contains (part of) the join condition.
2666  if (left_deep_join_root) {
2667  left_deep_joins.push_back(left_deep_join_root.get());
2668  if (std::dynamic_pointer_cast<const RelFilter>(left_deep_join_root)) {
2669  filtered_left_deep_joins.push_back(left_deep_join_root.get());
2670  }
2671  }
2672  }
2673  if (filtered_left_deep_joins.empty()) {
2675  }
2676  eliminate_dead_columns(nodes_);
2677  eliminate_dead_subqueries(subqueries_, nodes_.back().get());
2679  if (g_cluster) {
2681  }
2682  coalesce_nodes(nodes_, left_deep_joins);
2683  CHECK(nodes_.back().use_count() == 1);
2684  create_left_deep_join(nodes_);
2685 }

void RelAlgDagBuilder::eachNode ( std::function< void(RelAlgNode const *)> const & callback ) const

Definition at line 2687 of file RelAlgDagBuilder.cpp.

References nodes_.

2688  {
2689  for (auto const& node : nodes_) {
2690  if (node) {
2691  callback(node.get());
2692  }
2693  }
2694 }
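
A short usage sketch may help show how a caller could drive eachNode(). The Node and DagSketch types below are hypothetical stand-ins that reproduce only the loop shown in the listing, so the example compiles on its own; collect_names() is likewise an illustrative helper, not part of OmniSciDB.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for RelAlgNode.
struct Node {
  std::string name;
};

// Hypothetical stand-in for the builder; only eachNode() is reproduced.
class DagSketch {
 public:
  explicit DagSketch(std::vector<std::shared_ptr<Node>> nodes)
      : nodes_(std::move(nodes)) {}

  // Same shape as the documented method: null entries are skipped and the
  // callback receives a raw, non-owning pointer.
  void eachNode(std::function<void(Node const*)> const& callback) const {
    for (auto const& node : nodes_) {
      if (node) {
        callback(node.get());
      }
    }
  }

 private:
  std::vector<std::shared_ptr<Node>> nodes_;
};

// Collects node names in visitation order.
std::vector<std::string> collect_names(const DagSketch& dag) {
  std::vector<std::string> names;
  dag.eachNode([&names](Node const* node) { names.push_back(node->name); });
  return names;
}
```

Because the callback only borrows each node, it should not keep the pointer beyond the lifetime of the builder that owns the nodes.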
const RegisteredQueryHint RelAlgDagBuilder::getQueryHints ( ) const
inline

Definition at line 1851 of file RelAlgDagBuilder.h.

References query_hint_.

1851 { return query_hint_; }
const RelAlgNode& RelAlgDagBuilder::getRootNode ( ) const
inline

Returns the root node of the DAG.

Definition at line 1754 of file RelAlgDagBuilder.h.

References CHECK, and nodes_.

1754  {
1755  CHECK(nodes_.size());
1756  const auto& last_ptr = nodes_.back();
1757  CHECK(last_ptr);
1758  return *last_ptr;
1759  }
std::shared_ptr<const RelAlgNode> RelAlgDagBuilder::getRootNodeShPtr ( ) const
inline

Definition at line 1761 of file RelAlgDagBuilder.h.

References CHECK, and nodes_.

Referenced by anonymous_namespace{RelAlgDagBuilder.cpp}::parse_subquery().

1761  {
1762  CHECK(nodes_.size());
1763  return nodes_.back();
1764  }

const std::vector<std::shared_ptr<RexSubQuery> >& RelAlgDagBuilder::getSubqueries ( ) const
inline

Gets all registered subqueries. Only the root DAG can contain subqueries.

Definition at line 1777 of file RelAlgDagBuilder.h.

References subqueries_.

1777  {
1778  return subqueries_;
1779  }
void RelAlgDagBuilder::registerQueryHints ( Hints * hints_delivered )
inline

Definition at line 1781 of file RelAlgDagBuilder.h.

References CHECK, RegisteredQueryHint::cpu_mode, kCpuMode, kOverlapsAllowGpuBuild, kOverlapsBucketThreshold, kOverlapsKeysPerBin, kOverlapsMaxSize, kOverlapsNoCache, RegisteredQueryHint::overlaps_allow_gpu_build, RegisteredQueryHint::overlaps_bucket_threshold, RegisteredQueryHint::overlaps_keys_per_bin, RegisteredQueryHint::overlaps_max_size, RegisteredQueryHint::overlaps_no_cache, query_hint_, RegisteredQueryHint::registerHint(), and VLOG.

1781  {
1782  for (auto it = hints_delivered->begin(); it != hints_delivered->end(); it++) {
1783  auto target = it->second;
1784  auto hint_type = it->first;
1785  switch (hint_type) {
1786  case QueryHint::kCpuMode: {
1788  query_hint_.cpu_mode = true;
1789  VLOG(1) << "A user forces to run the query on the CPU execution mode";
1790  break;
1791  }
1793  CHECK(target.getListOptions().size() == 1);
1794  double overlaps_bucket_threshold = std::stod(target.getListOptions()[0]);
1795  if (overlaps_bucket_threshold >= 0.0 && overlaps_bucket_threshold <= 90.0) {
1797  query_hint_.overlaps_bucket_threshold = overlaps_bucket_threshold;
1798  } else {
1799  VLOG(1) << "Skip the given query hint \"overlaps_bucket_threshold\" ("
1800  << overlaps_bucket_threshold
1801  << ") : the hint value should be within 0.0 ~ 90.0";
1802  }
1803  break;
1804  }
1806  CHECK(target.getListOptions().size() == 1);
1807  std::stringstream ss(target.getListOptions()[0]);
1808  int overlaps_max_size;
1809  ss >> overlaps_max_size;
1810  if (overlaps_max_size >= 0) {
1812  query_hint_.overlaps_max_size = (size_t)overlaps_max_size;
1813  } else {
1814  VLOG(1) << "Skip the query hint \"overlaps_max_size\" (" << overlaps_max_size
1815  << ") : the hint value should be larger than or equal to zero";
1816  }
1817  break;
1818  }
1822  VLOG(1) << "Allowing GPU hash table build for overlaps join.";
1823  break;
1824  }
1828  VLOG(1) << "Skip auto tuner and hashtable caching for overlaps join.";
1829  break;
1830  }
1832  CHECK(target.getListOptions().size() == 1);
1833  double overlaps_keys_per_bin = std::stod(target.getListOptions()[0]);
1834  if (overlaps_keys_per_bin > 0.0 &&
1835  overlaps_keys_per_bin < std::numeric_limits<double>::max()) {
1837  query_hint_.overlaps_keys_per_bin = overlaps_keys_per_bin;
1838  } else {
1839  VLOG(1) << "Skip the given query hint \"overlaps_keys_per_bin\" ("
1840  << overlaps_keys_per_bin
1841  << ") : the hint value should be larger than zero";
1842  }
1843  break;
1844  }
1845  default:
1846  break;
1847  }
1848  }
1849  }
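
The validate-then-apply pattern used for the numeric hints can be sketched in isolation. The helper below is hypothetical (parse_bucket_threshold() does not exist in OmniSciDB); it mirrors the 0.0 to 90.0 range check from the listing. Catching the exceptions thrown by std::stod is an addition of this sketch and is not something the listing does.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: parses a hint option and accepts it only if it lies in
// the closed range [0.0, 90.0]. On success the value is written to `out`.
bool parse_bucket_threshold(const std::string& option, double& out) {
  double value = 0.0;
  try {
    value = std::stod(option);
  } catch (const std::invalid_argument&) {
    return false;  // not a number at all
  } catch (const std::out_of_range&) {
    return false;  // does not fit in a double
  }
  if (value >= 0.0 && value <= 90.0) {
    out = value;
    return true;
  }
  return false;  // out-of-range hints are skipped rather than applied
}
```

Returning a flag instead of throwing keeps the behavior seen in the listing, where an invalid hint value is logged and ignored so that the query still runs with the default setting.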

void RelAlgDagBuilder::registerSubquery ( std::shared_ptr< RexSubQuery > subquery )
inline

Registers a subquery with a root DAG builder. Should only be called during DAG building and registration should only occur on the root.

Definition at line 1770 of file RelAlgDagBuilder.h.

References subqueries_.

Referenced by anonymous_namespace{RelAlgDagBuilder.cpp}::parse_subquery().

1770  {
1771  subqueries_.push_back(subquery);
1772  }
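
The rule that registration happens only on the root can be illustrated with a small sketch. All names here (QuerySpec, SubQuery, BuilderSketch) are hypothetical stand-ins. The sketch assumes, as the two documented constructors suggest, that a nested builder receives a reference to the root and forwards every subquery to it. The recursion order is a choice made for the example, not a claim about parse_subquery().

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of a query and its nested subqueries.
struct QuerySpec {
  std::string label;
  std::vector<QuerySpec> subqueries;
};

// Hypothetical stand-in for RexSubQuery.
struct SubQuery {
  std::string label;
};

class BuilderSketch {
 public:
  // Root builder: registers everything on itself.
  explicit BuilderSketch(const QuerySpec& spec) { build(spec, *this); }

  // Sub-DAG builder: forwards the root so registration lands there.
  BuilderSketch(BuilderSketch& root, const QuerySpec& spec) { build(spec, root); }

  void registerSubquery(std::shared_ptr<SubQuery> subquery) {
    subqueries_.push_back(std::move(subquery));
  }

  const std::vector<std::shared_ptr<SubQuery>>& getSubqueries() const {
    return subqueries_;
  }

 private:
  void build(const QuerySpec& spec, BuilderSketch& root) {
    for (const auto& sub_spec : spec.subqueries) {
      // The nested builder shares the root; its own subqueries_ stays empty.
      BuilderSketch sub(root, sub_spec);
      (void)sub;
      root.registerSubquery(std::make_shared<SubQuery>(SubQuery{sub_spec.label}));
    }
  }

  std::vector<std::shared_ptr<SubQuery>> subqueries_;
};
```

Keeping a single registry on the root means later passes can inspect all subqueries in one place, whatever their nesting depth.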

void RelAlgDagBuilder::resetQueryExecutionState ( )

Resets the query execution state of every node in the DAG.

Definition at line 2696 of file RelAlgDagBuilder.cpp.

References nodes_.

2696  {
2697  for (auto& node : nodes_) {
2698  if (node) {
2699  node->resetQueryExecutionState();
2700  }
2701  }
2702 }

Member Data Documentation

const Catalog_Namespace::Catalog& RelAlgDagBuilder::cat_
private

Definition at line 1861 of file RelAlgDagBuilder.h.

Referenced by build().

std::vector<std::shared_ptr<RelAlgNode> > RelAlgDagBuilder::nodes_
private

RegisteredQueryHint RelAlgDagBuilder::query_hint_
private

Definition at line 1865 of file RelAlgDagBuilder.h.

Referenced by getQueryHints(), and registerQueryHints().

const RenderInfo* RelAlgDagBuilder::render_info_
private

Definition at line 1864 of file RelAlgDagBuilder.h.

Referenced by build().

std::vector<std::shared_ptr<RexSubQuery> > RelAlgDagBuilder::subqueries_
private

Definition at line 1863 of file RelAlgDagBuilder.h.

Referenced by build(), getSubqueries(), and registerSubquery().


The documentation for this class was generated from the following files: