OmniSciDB  8fa3bf436f
 All Classes Namespaces Files Functions Variables Typedefs Enumerations Enumerator Friends Macros Pages
JoinFilterPushDown.h File Reference
#include <cstddef>
#include <numeric>
#include "QueryEngine/InputMetadata.h"
#include "QueryEngine/RangeTableIndexVisitor.h"
+ Include dependency graph for JoinFilterPushDown.h:
+ This graph shows which files directly or indirectly include this file:

Go to the source code of this file.

Classes

struct  PushedDownFilterInfo
 
struct  FilterSelectivity
 

Functions

bool to_gather_info_for_filter_selectivity (const std::vector< InputTableInfo > &table_infos)
 
std::vector< PushedDownFilterInfofind_push_down_filters (const RelAlgExecutionUnit &ra_exe_unit, const std::vector< size_t > &input_permutation, const std::vector< size_t > &left_deep_join_input_sizes)
 

Variables

bool g_enable_filter_push_down
 
float g_filter_push_down_low_frac
 
float g_filter_push_down_high_frac
 
size_t g_filter_push_down_passing_row_ubound
 

Function Documentation

std::vector<PushedDownFilterInfo> find_push_down_filters ( const RelAlgExecutionUnit ra_exe_unit,
const std::vector< size_t > &  input_permutation,
const std::vector< size_t > &  left_deep_join_input_sizes 
)

Go through all tables involved in the relational algebra plan, and select potential candidates to be pushed down by calcite. For each filter we store a set of intermediate indices (previous, current, and next table) based on the column indices in their query string.

Definition at line 234 of file JoinFilterPushDown.cpp.

References CHECK_EQ, CHECK_GE, CHECK_LT, i, RelAlgExecutionUnit::input_descs, gpu_enabled::iota(), RelAlgExecutionUnit::join_quals, gpu_enabled::partial_sum(), run_benchmark_import::result, and ScalarExprVisitor< T >::visit().

Referenced by RelAlgExecutor::selectFiltersToBePushedDown().

237  {
238  std::vector<PushedDownFilterInfo> result;
239  if (left_deep_join_input_sizes.empty()) {
240  return result;
241  }
242  std::vector<size_t> input_size_prefix_sums(left_deep_join_input_sizes.size());
243  std::partial_sum(left_deep_join_input_sizes.begin(),
244  left_deep_join_input_sizes.end(),
245  input_size_prefix_sums.begin());
246  std::vector<int> to_original_rte_idx(ra_exe_unit.input_descs.size(),
247  ra_exe_unit.input_descs.size());
248  if (!input_permutation.empty()) {
249  CHECK_EQ(to_original_rte_idx.size(), input_permutation.size());
250  for (size_t i = 0; i < input_permutation.size(); ++i) {
251  CHECK_LT(input_permutation[i], to_original_rte_idx.size());
252  CHECK_EQ(static_cast<size_t>(to_original_rte_idx[input_permutation[i]]),
253  to_original_rte_idx.size());
254  to_original_rte_idx[input_permutation[i]] = i;
255  }
256  } else {
257  std::iota(to_original_rte_idx.begin(), to_original_rte_idx.end(), 0);
258  }
259  std::unordered_map<int, std::vector<std::shared_ptr<Analyzer::Expr>>>
260  filters_per_nesting_level;
261  for (const auto& level_conditions : ra_exe_unit.join_quals) {
263  for (const auto& cond : level_conditions.quals) {
264  const auto rte_indices = visitor.visit(cond.get());
265  if (rte_indices.size() > 1) {
266  continue;
267  }
268  const int rte_idx = (!rte_indices.empty()) ? *rte_indices.cbegin() : 0;
269  if (!rte_idx) {
270  continue;
271  }
272  CHECK_GE(rte_idx, 0);
273  CHECK_LT(static_cast<size_t>(rte_idx), to_original_rte_idx.size());
274  filters_per_nesting_level[to_original_rte_idx[rte_idx]].push_back(cond);
275  }
276  }
277  for (const auto& kv : filters_per_nesting_level) {
278  CHECK_GE(kv.first, 0);
279  CHECK_LT(static_cast<size_t>(kv.first), input_size_prefix_sums.size());
280  size_t input_prev = (kv.first > 1) ? input_size_prefix_sums[kv.first - 2] : 0;
281  size_t input_start = kv.first ? input_size_prefix_sums[kv.first - 1] : 0;
282  size_t input_next = input_size_prefix_sums[kv.first];
283  result.emplace_back(
284  PushedDownFilterInfo{kv.second, input_prev, input_start, input_next});
285  }
286  return result;
287 }
#define CHECK_EQ(x, y)
Definition: Logger.h:211
std::vector< InputDescriptor > input_descs
#define CHECK_GE(x, y)
Definition: Logger.h:216
T visit(const Analyzer::Expr *expr) const
const JoinQualsPerNestingLevel join_quals
DEVICE void partial_sum(ARGS &&...args)
Definition: gpu_enabled.h:87
#define CHECK_LT(x, y)
Definition: Logger.h:213
DEVICE void iota(ARGS &&...args)
Definition: gpu_enabled.h:69

+ Here is the call graph for this function:

+ Here is the caller graph for this function:

bool to_gather_info_for_filter_selectivity ( const std::vector< InputTableInfo > &  table_infos)

The main purpose of this function is to prevent going through extra overhead of computing required statistics for finding the right candidates and then the actual push-down, unless the problem is large enough that such effort is potentially helpful.

Definition at line 207 of file JoinFilterPushDown.cpp.

Referenced by RelAlgExecutor::selectFiltersToBePushedDown().

208  {
209  if (table_infos.size() < 2) {
210  return false;
211  }
212  // we currently do not support filter push down when there is a self-join involved:
213  // TODO(Saman): prevent Calcite from optimizing self-joins to remove this exclusion
214  std::unordered_set<int> table_ids;
215  for (auto ti : table_infos) {
216  if (table_ids.find(ti.table_id) == table_ids.end()) {
217  table_ids.insert(ti.table_id);
218  } else {
219  // a self-join is involved
220  return false;
221  }
222  }
223  // TODO(Saman): add some extra heuristics to avoid preflight count and push down if it
224  // is not going to be helpful.
225  return true;
226 }

+ Here is the caller graph for this function:

Variable Documentation

bool g_enable_filter_push_down

Definition at line 89 of file Execute.cpp.

float g_filter_push_down_high_frac

Definition at line 91 of file Execute.cpp.

float g_filter_push_down_low_frac

Definition at line 90 of file Execute.cpp.

size_t g_filter_push_down_passing_row_ubound

Definition at line 92 of file Execute.cpp.