OmniSciDB  f17484ade4
generate_TableFunctionsFactory_init Namespace Reference

Functions

def line_is_incomplete
 
def find_signatures
 
def format_function_args
 
def build_template_function_call
 
def build_preflight_function
 
def must_emit_preflight_function
 
def format_annotations
 
def is_template_function
 
def uses_manager
 
def is_cpu_function
 
def is_gpu_function
 
def parse_annotations
 
def call_methods
 

Variables

string separator = '$=>$'
 
list input_files = [os.path.join(os.path.dirname(__file__), 'test_udtf_signatures.hpp')]
 
tuple cpu_output_header = os.path.splitext(output_filename)
 
tuple gpu_output_header = os.path.splitext(output_filename)
 
list add_stmts = []
 
list cpu_template_functions = []
 
list gpu_template_functions = []
 
list cpu_address_expressions = []
 
list gpu_address_expressions = []
 
list cond_fns = []
 
list canonical_input_files = [input_file[input_file.find("/QueryEngine/") + 1:] for input_file in input_files]
 
list header_file = ['#include "' + canonical_input_file + '"' for canonical_input_file in canonical_input_files]
 
tuple dirname = os.path.dirname(output_filename)
 
tuple add_tf_generated_files
 
tuple cpu_generated_files
 
tuple gpu_generated_files
 
string content
 

Detailed Description

Given a list of input files, scan for lines containing UDTF
specification statements in the following form:

  UDTF: function_name(<arguments>) -> <output column types> (, <template type specifications>)?

where <arguments> is a comma-separated list of argument types. The
argument type specifications are:

- scalar types:
    Int8, Int16, Int32, Int64, Float, Double, Bool, TextEncodingDict, etc
- column types:
    ColumnInt8, ColumnInt16, ColumnInt32, ColumnInt64, ColumnFloat, ColumnDouble, ColumnBool, etc
- column list types:
    ColumnListInt8, ColumnListInt16, ColumnListInt32, ColumnListInt64, ColumnListFloat, ColumnListDouble, ColumnListBool, etc
- cursor type:
    Cursor<t0, t1, ...>
  where t0, t1 are column or column list types
- output buffer size parameter type:
    RowMultiplier<i>, ConstantParameter<i>, Constant<i>, TableFunctionSpecifiedParameter<i>
  where i is a literal integer.

The output column types are given as a comma-separated list of column types (see above).
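
The specification form above can be illustrated with a minimal sketch that splits one UDTF line into its name, argument specifications, and output column types. The helpers `split_top_level` and `split_udtf_spec` are hypothetical names introduced here, not part of the script (which uses its own parser), and the sketch assumes a line without template type specifications or annotations:

```python
def split_top_level(text, sep=","):
    """Split text on sep, ignoring separators nested inside <...>."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def split_udtf_spec(line):
    """Return (name, argument specs, output specs) for one UDTF line.

    Assumption: no template type specifications and no annotations.
    """
    if not line.startswith("UDTF:"):
        raise ValueError("not a UDTF specification: %r" % line)
    head, arrow, outputs = line[len("UDTF:"):].partition("->")
    if not arrow:
        raise ValueError("missing `->` in %r" % line)
    name, _, rest = head.partition("(")
    arguments = rest.rsplit(")", 1)[0]
    return name.strip(), split_top_level(arguments), split_top_level(outputs)


spec = "UDTF: my_func(Cursor<ColumnInt32, ColumnDouble>, RowMultiplier<2>) -> ColumnInt32, ColumnDouble"
name, arguments, outputs = split_udtf_spec(spec)
print(name, arguments, outputs)
```

The nesting counter keeps the comma inside `Cursor<...>` from being treated as an argument separator.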

In addition, the following equivalents are supported:

  Column<T> == ColumnT
  ColumnList<T> == ColumnListT
  Cursor<T, V, ...> == Cursor<ColumnT, ColumnV, ...>
  int8 == int8_t == Int8, etc
  float == Float, double == Double, bool == Bool
  T == ColumnT for output column types
  RowMultiplier == RowMultiplier<i> where i is the one-based position of the sizer argument
  when no sizer argument is provided, Constant<1> is assumed
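
A few of these equivalences can be sketched as a small normalization function. This is an illustration only: `SCALAR_ALIASES` and `normalize_type` are hypothetical names, the alias table covers just the spellings listed above (extended to the other integer widths on the assumption that "etc" means them), and the script itself performs this mapping inside its parser and transformers:

```python
import re

# Alias table for scalar spellings; the 16/32/64-bit entries are an assumption
# based on the "etc" in the list above.
SCALAR_ALIASES = {"float": "Float", "double": "Double", "bool": "Bool"}
for bits in (8, 16, 32, 64):
    SCALAR_ALIASES["int%d" % bits] = "Int%d" % bits
    SCALAR_ALIASES["int%d_t" % bits] = "Int%d" % bits


def normalize_type(spec):
    """Map Column<T> to ColumnT, ColumnList<T> to ColumnListT, and scalar aliases."""
    spec = spec.strip()
    match = re.fullmatch(r"(Column|ColumnList)<\s*(\w+)\s*>", spec)
    if match:
        return match.group(1) + normalize_type(match.group(2))
    return SCALAR_ALIASES.get(spec, spec)


print(normalize_type("Column<int32_t>"))
print(normalize_type("ColumnList<double>"))
```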

Argument types can be annotated using the `|` (bar) symbol after an
argument type specification. An annotation is specified by a label and
a value separated by the `=` (equals) symbol. Multiple annotations can be
specified by using the `|` (bar) symbol as the annotation separator.
Supported annotation labels are:

- name: to specify argument name
- input_id: to specify the dict id mapping for output TextEncodingDict columns.
- default: to specify a default value for an argument (scalar only)

If an argument type is followed by an identifier, the identifier is
mapped to a name annotation. For example, the following argument type
specifications are equivalent:

  Int8 a
  Int8 | name=a
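
The equivalence of the two spellings can be checked with a minimal sketch. `parse_argument` is a hypothetical helper, not the script's parser, and it assumes a simple argument type containing no spaces or `|` characters of its own:

```python
def parse_argument(spec):
    """Return (type, [(label, value), ...]) for one annotated argument."""
    type_part, *annotation_parts = [part.strip() for part in spec.split("|")]
    annotations = []
    tokens = type_part.split()
    if len(tokens) == 2:
        # "Int8 a": the trailing identifier becomes a name annotation.
        type_part, identifier = tokens
        annotations.append(("name", identifier))
    for part in annotation_parts:
        label, _, value = part.partition("=")
        annotations.append((label.strip(), value.strip()))
    return type_part, annotations


print(parse_argument("Int8 a"))
print(parse_argument("Int8 | name=a"))
print(parse_argument("Int32 | name=n | default=10"))
```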

Template type specifications are a comma-separated list of template
type assignments whose values are lists of argument type names. For
instance:

  T = [Int8, Int16, Int32, Float], V = [Float, Double]
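
As a rough illustration of what such an assignment implies, the sketch below expands a templated signature into concrete ones. It assumes that every combination of template values yields one concrete signature; `expand_templates` is a hypothetical helper and not the script's TemplateTransformer:

```python
import itertools
import re


def expand_templates(signature, template_spec):
    """Substitute every combination of template values into the signature text."""
    names = list(template_spec)
    pattern = re.compile(r"\b(%s)\b" % "|".join(map(re.escape, names)))
    expanded = []
    for combo in itertools.product(*(template_spec[name] for name in names)):
        binding = dict(zip(names, combo))
        # Replace all template names in one pass to avoid chained substitutions.
        expanded.append(pattern.sub(lambda m: binding[m.group(1)], signature))
    return expanded


templates = {"T": ["Int8", "Int16", "Int32", "Float"], "V": ["Float", "Double"]}
concrete = expand_templates("foo(Column<T>, V) -> Column<T>", templates)
for line in concrete:
    print(line)
```

With four values for T and two for V, the sketch produces eight concrete signatures.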

Function Documentation

def generate_TableFunctionsFactory_init.build_preflight_function (   fn_name,
  sizer,
  input_types,
  output_types,
  uses_manager 
)

Definition at line 203 of file generate_TableFunctionsFactory_init.py.

References format_function_args().

Referenced by parse_annotations().

def build_preflight_function(fn_name, sizer, input_types, output_types, uses_manager):

    def format_error_msg(err_msg, uses_manager):
        if uses_manager:
            return " return mgr.error_message(%s);\n" % (err_msg,)
        else:
            return " return table_function_error(%s);\n" % (err_msg,)

    cpp_args, _ = format_function_args(input_types,
                                       output_types,
                                       uses_manager,
                                       use_generic_arg_name=False,
                                       emit_output_args=False)

    if uses_manager:
        fn = "EXTENSION_NOINLINE int32_t\n"
        fn += "%s(%s) {\n" % (fn_name.lower() + "__preflight", cpp_args)
    else:
        fn = "EXTENSION_NOINLINE int32_t\n"
        fn += "%s(%s) {\n" % (fn_name.lower() + "__preflight", cpp_args)

    for typ in input_types:
        if isinstance(typ, declbracket.Declaration):
            ann = typ.annotations
            for key, value in ann:
                if key == 'require':
                    err_msg = '"Constraint `%s` is not satisfied."' % (value[1:-1])

                    fn += " if (!(%s)) {\n" % (value[1:-1].replace('\\', ''),)
                    fn += format_error_msg(err_msg, uses_manager)
                    fn += " }\n"

    if sizer.is_arg_sizer():
        precomputed_nrows = str(sizer.args[0])
        if '"' in precomputed_nrows:
            precomputed_nrows = precomputed_nrows[1:-1]
        # check to see if the precomputed number of rows > 0
        err_msg = '"Output size expression `%s` evaluated in a negative value."' % (precomputed_nrows)
        fn += " auto _output_size = %s;\n" % (precomputed_nrows)
        fn += " if (_output_size < 0) {\n"
        fn += format_error_msg(err_msg, uses_manager)
        fn += " }\n"
        fn += " return _output_size;\n"
    else:
        fn += " return 0;\n"
    fn += "}\n\n"

    return fn
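
To show the kind of C++ guard that a `require` annotation turns into, here is a reduced sketch of the loop body above. `require_guard` is a hypothetical helper that isolates that one step; the whitespace inside the emitted strings is chosen for readability and is an assumption, not necessarily what the script emits:

```python
def require_guard(value, uses_manager):
    """Build the C++ check for one `require` annotation value (a quoted expression)."""
    expression = value[1:-1]  # drop the surrounding double quotes
    err_msg = '"Constraint `%s` is not satisfied."' % expression
    if uses_manager:
        error_return = "    return mgr.error_message(%s);\n" % (err_msg,)
    else:
        error_return = "    return table_function_error(%s);\n" % (err_msg,)
    guard = "  if (!(%s)) {\n" % (expression.replace('\\', ''),)
    guard += error_return
    guard += "  }\n"
    return guard


guard_with_manager = require_guard('"x > 0"', True)
guard_without_manager = require_guard('"x > 0"', False)
print(guard_with_manager)
```

The error is reported through the manager when the table function takes a TableFunctionManager argument, and through `table_function_error` otherwise.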

def generate_TableFunctionsFactory_init.build_template_function_call (   caller,
  called,
  input_types,
  output_types,
  uses_manager 
)

Definition at line 189 of file generate_TableFunctionsFactory_init.py.

References format_function_args().

Referenced by parse_annotations().

def build_template_function_call(caller, called, input_types, output_types, uses_manager):
    cpp_args, name_args = format_function_args(input_types,
                                               output_types,
                                               uses_manager,
                                               use_generic_arg_name=True,
                                               emit_output_args=True)

    template = ("EXTENSION_NOINLINE int32_t\n"
                "%s(%s) {\n"
                " return %s(%s);\n"
                "}\n") % (caller, cpp_args, called, name_args)
    return template
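
The wrapper text this function produces can be previewed with a self-contained sketch. `StubType` is a hypothetical stand-in for the script's real type objects, and its `format_cpp_type` output (argument names and qualifiers) is an assumption made for illustration; the two functions are simplified copies of the listings in this section:

```python
class StubType:
    """Hypothetical stand-in for the type objects passed to the generator."""

    def __init__(self, cpp_type):
        self.cpp_type = cpp_type

    def format_cpp_type(self, idx, use_generic_arg_name, is_input):
        name = ("input%d" if is_input else "output%d") % idx
        qualifier = "const " if is_input else ""
        return "%s%s& %s" % (qualifier, self.cpp_type, name), name


def format_function_args(input_types, output_types, uses_manager,
                         use_generic_arg_name, emit_output_args):
    cpp_args, name_args = [], []
    if uses_manager:
        cpp_args.append('TableFunctionManager& mgr')
        name_args.append('mgr')
    for idx, typ in enumerate(input_types):
        cpp_arg, name = typ.format_cpp_type(
            idx, use_generic_arg_name=use_generic_arg_name, is_input=True)
        cpp_args.append(cpp_arg)
        name_args.append(name)
    if emit_output_args:
        for idx, typ in enumerate(output_types):
            cpp_arg, name = typ.format_cpp_type(
                idx, use_generic_arg_name=use_generic_arg_name, is_input=False)
            cpp_args.append(cpp_arg)
            name_args.append(name)
    return ', '.join(cpp_args), ', '.join(name_args)


def build_template_function_call(caller, called, input_types, output_types, uses_manager):
    cpp_args, name_args = format_function_args(
        input_types, output_types, uses_manager,
        use_generic_arg_name=True, emit_output_args=True)
    return ("EXTENSION_NOINLINE int32_t\n"
            "%s(%s) {\n"
            "    return %s(%s);\n"
            "}\n") % (caller, cpp_args, called, name_args)


column = StubType("Column<int32_t>")
wrapper = build_template_function_call("foo_0", "foo", [column], [column], True)
print(wrapper)
```

The generated wrapper simply forwards its arguments to the templated implementation, which gives each template instantiation a distinct, non-templated symbol.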

def generate_TableFunctionsFactory_init.call_methods (   add_stmts)

Definition at line 481 of file generate_TableFunctionsFactory_init.py.

def call_methods(add_stmts):
    n_add_funcs = linker.GenerateAddTableFunctionsFiles.get_num_generated_files()
    return [ 'table_functions::add_table_functions_%d();' % (i) for i in range(n_add_funcs+1) ]
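
A minimal sketch of the statements this function returns, with the file count passed in directly instead of being read from the linker module. `call_statements` is a hypothetical name used only for this illustration:

```python
def call_statements(n_add_funcs):
    """Mirror the list comprehension above for a given number of generated files."""
    return ['table_functions::add_table_functions_%d();' % (i)
            for i in range(n_add_funcs + 1)]


statements = call_statements(2)
for statement in statements:
    print(statement)
```

Because the range runs to `n_add_funcs + 1`, a count of 2 yields three calls, numbered 0 through 2. Note that the `add_stmts` parameter of `call_methods` is not used in its body.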
def generate_TableFunctionsFactory_init.find_signatures (   input_file)
Returns a list of parsed UDTF signatures.

Definition at line 84 of file generate_TableFunctionsFactory_init.py.

References join(), line_is_incomplete(), heavyai.open(), split(), and run_benchmark_import.type.

Referenced by parse_annotations().

def find_signatures(input_file):
    """Returns a list of parsed UDTF signatures."""
    signatures = []

    last_line = None
    for line in open(input_file).readlines():
        line = line.strip()
        if last_line is not None:
            line = last_line + ' ' + line
            last_line = None
        if not line.startswith('UDTF:'):
            continue
        if line_is_incomplete(line):
            last_line = line
            continue
        last_line = None
        line = line[5:].lstrip()
        i = line.find('(')
        j = line.find(')')
        if i == -1 or j == -1:
            sys.stderr.write('Invalid UDTF specification: `%s`. Skipping.\n' % (line))
            continue

        expected_result = None
        if separator in line:
            line, expected_result = line.split(separator, 1)
            expected_result = expected_result.strip().split(separator)
            expected_result = list(map(lambda s: s.strip(), expected_result))

        ast = parser.Parser(line).parse()

        if expected_result is not None:
            # Treat warnings as errors so that one can test TransformeWarnings
            warnings.filterwarnings("error")

            # Template transformer expands templates into multiple lines
            try:
                result = transformers.Pipeline(
                    transformers.TemplateTransformer,
                    transformers.AmbiguousSignatureCheckTransformer,
                    transformers.FieldAnnotationTransformer,
                    transformers.TextEncodingDictTransformer,
                    transformers.DefaultValueAnnotationTransformer,
                    transformers.SupportedAnnotationsTransformer,
                    transformers.RangeAnnotationTransformer,
                    transformers.FixRowMultiplierPosArgTransformer,
                    transformers.RenameNodesTransformer,
                    transformers.AstPrinter)(ast)
            except (transformers.TransformerException, transformers.TransformerWarning) as msg:
                result = ['%s: %s' % (type(msg).__name__, msg)]
            assert len(result) == len(expected_result), "\n\tresult: %s \n!= \n\texpected: %s" % (
                '\n\t\t '.join(result),
                '\n\t\t '.join(expected_result)
            )
            assert set(result) == set(expected_result), "\n\tresult: %s != \n\texpected: %s" % (
                '\n\t\t '.join(result),
                '\n\t\t '.join(expected_result),
            )

        else:
            signature = transformers.Pipeline(
                transformers.TemplateTransformer,
                transformers.AmbiguousSignatureCheckTransformer,
                transformers.FieldAnnotationTransformer,
                transformers.TextEncodingDictTransformer,
                transformers.DefaultValueAnnotationTransformer,
                transformers.SupportedAnnotationsTransformer,
                transformers.RangeAnnotationTransformer,
                transformers.FixRowMultiplierPosArgTransformer,
                transformers.RenameNodesTransformer,
                transformers.DeclBracketTransformer)(ast)

            signatures.extend(signature)

    return signatures
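
The handling of the `$=>$` separator in test inputs can be isolated as follows. `split_expected` is a hypothetical helper that copies the separator logic from the listing; the sample line and the format of its expected results are made up for illustration:

```python
separator = '$=>$'


def split_expected(line):
    """Split a test line into the signature text and its list of expected results."""
    expected_result = None
    if separator in line:
        line, expected_result = line.split(separator, 1)
        expected_result = expected_result.strip().split(separator)
        expected_result = list(map(lambda s: s.strip(), expected_result))
    return line, expected_result


sample = ("foo(Column<T>) -> Column<T>, T=[Int32, Float] "
          "$=>$ foo(ColumnInt32) -> ColumnInt32 "
          "$=>$ foo(ColumnFloat) -> ColumnFloat")
signature_text, expected = split_expected(sample)
print(signature_text.strip())
print(expected)
```

A line without the separator is returned unchanged with `None` as the expected result, which is the case where the function collects signatures instead of checking them.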

def generate_TableFunctionsFactory_init.format_annotations (   annotations_)

Definition at line 263 of file generate_TableFunctionsFactory_init.py.

References join().

Referenced by parse_annotations().

def format_annotations(annotations_):
    def fmt(k, v):
        # type(v) is not always 'str'
        if k == 'require' or k == 'default' and v[0] == "\"":
            return v[1:-1]
        return v

    s = "std::vector<std::map<std::string, std::string>>{"
    s += ', '.join(('{' + ', '.join('{"%s", "%s"}' % (k, fmt(k, v)) for k, v in a) + '}') for a in annotations_)
    s += "}"
    return s
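
The following sketch runs a copy of the function above on two sample annotation lists to show the C++ initializer text it builds. The sample annotations are invented for illustration:

```python
def format_annotations(annotations_):
    def fmt(k, v):
        # Quoted `require` values, and quoted `default` values, lose their quotes.
        if k == 'require' or k == 'default' and v[0] == "\"":
            return v[1:-1]
        return v

    s = "std::vector<std::map<std::string, std::string>>{"
    s += ', '.join(('{' + ', '.join('{"%s", "%s"}' % (k, fmt(k, v)) for k, v in a) + '}')
                   for a in annotations_)
    s += "}"
    return s


formatted = format_annotations([[('name', 'a')], [('require', '"a > 0"')]])
print(formatted)
```

Each inner list becomes one `std::map` initializer, so the result has one map per argument in the order given.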

def generate_TableFunctionsFactory_init.format_function_args (   input_types,
  output_types,
  uses_manager,
  use_generic_arg_name,
  emit_output_args 
)

Definition at line 161 of file generate_TableFunctionsFactory_init.py.

References join().

Referenced by build_preflight_function(), and build_template_function_call().

def format_function_args(input_types, output_types, uses_manager, use_generic_arg_name, emit_output_args):
    cpp_args = []
    name_args = []

    if uses_manager:
        cpp_args.append('TableFunctionManager& mgr')
        name_args.append('mgr')

    for idx, typ in enumerate(input_types):
        cpp_arg, name = typ.format_cpp_type(idx,
                                            use_generic_arg_name=use_generic_arg_name,
                                            is_input=True)
        cpp_args.append(cpp_arg)
        name_args.append(name)

    if emit_output_args:
        for idx, typ in enumerate(output_types):
            cpp_arg, name = typ.format_cpp_type(idx,
                                                use_generic_arg_name=use_generic_arg_name,
                                                is_input=False)
            cpp_args.append(cpp_arg)
            name_args.append(name)

    cpp_args = ', '.join(cpp_args)
    name_args = ', '.join(name_args)
    return cpp_args, name_args

def generate_TableFunctionsFactory_init.is_cpu_function (   sig)

Definition at line 285 of file generate_TableFunctionsFactory_init.py.

References uses_manager().

Referenced by parse_annotations().

def is_cpu_function(sig):
    # Any function that does not have _gpu_ suffix is a cpu function.
    i = sig.name.rfind('_gpu_')
    if i >= 0 and '__' in sig.name[:i + 1]:
        if uses_manager(sig):
            raise ValueError('Table function {} with gpu execution target cannot have TableFunctionManager argument'.format(sig.name))
        return False
    return True

def generate_TableFunctionsFactory_init.is_gpu_function (   sig)

Definition at line 295 of file generate_TableFunctionsFactory_init.py.

References uses_manager().

Referenced by parse_annotations().

def is_gpu_function(sig):
    # A function with TableFunctionManager argument is a cpu-only function
    if uses_manager(sig):
        return False
    # Any function that does not have _cpu_ suffix is a gpu function.
    i = sig.name.rfind('_cpu_')
    return not (i >= 0 and '__' in sig.name[:i + 1])
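
The naming rules used by is_cpu_function, is_gpu_function, and is_template_function can be tried out with stub signatures. The sketch copies the three predicates and uses_manager from this page; the `make_sig` helper and the sample function names are hypothetical:

```python
from types import SimpleNamespace


def uses_manager(sig):
    return sig.inputs and sig.inputs[0].name == 'TableFunctionManager'


def is_template_function(sig):
    i = sig.name.rfind('_template')
    return i >= 0 and '__' in sig.name[:i + 1]


def is_cpu_function(sig):
    i = sig.name.rfind('_gpu_')
    if i >= 0 and '__' in sig.name[:i + 1]:
        if uses_manager(sig):
            raise ValueError('Table function {} with gpu execution target cannot '
                             'have TableFunctionManager argument'.format(sig.name))
        return False
    return True


def is_gpu_function(sig):
    if uses_manager(sig):
        return False
    i = sig.name.rfind('_cpu_')
    return not (i >= 0 and '__' in sig.name[:i + 1])


def make_sig(name, with_manager=False):
    """Hypothetical stub: only the attributes read by the predicates."""
    inputs = [SimpleNamespace(name='TableFunctionManager')] if with_manager else []
    return SimpleNamespace(name=name, inputs=inputs)


for name in ('my_func', 'my_func__cpu_1', 'my_func__gpu_1', 'my_func__cpu_template'):
    sig = make_sig(name)
    print(name, is_cpu_function(sig), is_gpu_function(sig), is_template_function(sig))
```

A name with neither suffix counts as both a CPU and a GPU function, while a signature whose first input is a TableFunctionManager is never treated as a GPU function.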

def generate_TableFunctionsFactory_init.is_template_function (   sig)

Definition at line 276 of file generate_TableFunctionsFactory_init.py.

Referenced by parse_annotations().

def is_template_function(sig):
    i = sig.name.rfind('_template')
    return i >= 0 and '__' in sig.name[:i + 1]

def generate_TableFunctionsFactory_init.line_is_incomplete (   line)

Definition at line 77 of file generate_TableFunctionsFactory_init.py.

Referenced by find_signatures().

def line_is_incomplete(line):
    # TODO: try to parse the line to be certain about completeness.
    # `$=>$' is used to separate the UDTF signature and the expected result
    return line.endswith(',') or line.endswith('->') or line.endswith(separator) or line.endswith('|')

# fmt: off
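
The sketch below shows how this check lets a specification span several source lines. It combines a copy of line_is_incomplete with the joining loop from find_signatures; `join_udtf_lines` is a hypothetical name and the input lines are invented:

```python
separator = '$=>$'


def line_is_incomplete(line):
    return (line.endswith(',') or line.endswith('->')
            or line.endswith(separator) or line.endswith('|'))


def join_udtf_lines(lines):
    """Collect complete UDTF lines, gluing continuation lines together."""
    complete = []
    last_line = None
    for line in lines:
        line = line.strip()
        if last_line is not None:
            line = last_line + ' ' + line
            last_line = None
        if not line.startswith('UDTF:'):
            continue
        if line_is_incomplete(line):
            last_line = line
            continue
        complete.append(line)
    return complete


source_lines = [
    "/*",
    "  UDTF: foo(ColumnInt32,",
    "            RowMultiplier) ->",
    "        ColumnInt32",
    "*/",
]
joined = join_udtf_lines(source_lines)
print(joined)
```

A line that ends with a comma, an arrow, a bar, or the separator is held back and prefixed to the next line, so the three physical lines above become one specification.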

def generate_TableFunctionsFactory_init.must_emit_preflight_function (   sig,
  sizer 
)

Definition at line 253 of file generate_TableFunctionsFactory_init.py.

Referenced by parse_annotations().

def must_emit_preflight_function(sig, sizer):
    if sizer.is_arg_sizer():
        return True
    for arg_annotations in sig.input_annotations:
        d = dict(arg_annotations)
        if 'require' in d.keys():
            return True
    return False
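
A short sketch of the two conditions that trigger a preflight function, using stub objects. The stubs are hypothetical and expose only the attributes the function reads:

```python
from types import SimpleNamespace


def must_emit_preflight_function(sig, sizer):
    if sizer.is_arg_sizer():
        return True
    for arg_annotations in sig.input_annotations:
        d = dict(arg_annotations)
        if 'require' in d.keys():
            return True
    return False


# Hypothetical stubs standing in for the real sizer and signature objects.
arg_sizer = SimpleNamespace(is_arg_sizer=lambda: True)
plain_sizer = SimpleNamespace(is_arg_sizer=lambda: False)
sig_with_require = SimpleNamespace(
    input_annotations=[[('name', 'x'), ('require', '"x > 0"')]])
sig_plain = SimpleNamespace(input_annotations=[[('name', 'x')]])

print(must_emit_preflight_function(sig_plain, arg_sizer))
print(must_emit_preflight_function(sig_with_require, plain_sizer))
print(must_emit_preflight_function(sig_plain, plain_sizer))
```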

def generate_TableFunctionsFactory_init.parse_annotations (   input_files)

Definition at line 304 of file generate_TableFunctionsFactory_init.py.

References build_preflight_function(), build_template_function_call(), find_signatures(), format_annotations(), is_cpu_function(), is_gpu_function(), is_template_function(), join(), and must_emit_preflight_function().

def parse_annotations(input_files):

    counter = 0

    add_stmts = []
    cpu_template_functions = []
    gpu_template_functions = []
    cpu_function_address_expressions = []
    gpu_function_address_expressions = []
    cond_fns = []

    for input_file in input_files:
        for sig in find_signatures(input_file):

            # Compute sql_types, input_types, and sizer
            sql_types_ = []
            input_types_ = []
            input_annotations = []

            sizer = None
            if sig.sizer is not None:
                expr = sig.sizer.value
                sizer = declbracket.Bracket('kPreFlightParameter', (expr,))

            uses_manager = False
            for i, (t, annot) in enumerate(zip(sig.inputs, sig.input_annotations)):
                if t.is_output_buffer_sizer():
                    if t.is_user_specified():
                        sql_types_.append(declbracket.Bracket.parse('int32').normalize(kind='input'))
                        input_types_.append(sql_types_[-1])
                        input_annotations.append(annot)
                    assert sizer is None  # exactly one sizer argument is allowed
                    assert len(t.args) == 1, t
                    sizer = t
                elif t.name == 'Cursor':
                    for t_ in t.args:
                        input_types_.append(t_)
                    input_annotations.append(annot)
                    sql_types_.append(declbracket.Bracket('Cursor', args=()))
                elif t.name == 'TableFunctionManager':
                    if i != 0:
                        raise ValueError('{} must appear as a first argument of {}, but found it at position {}.'.format(t, sig.name, i))
                    uses_manager = True
                else:
                    input_types_.append(t)
                    input_annotations.append(annot)
                    if t.is_column_any():
                        # XXX: let Bracket handle mapping of column to cursor(column)
                        sql_types_.append(declbracket.Bracket('Cursor', args=()))
                    else:
                        sql_types_.append(t)

            if sizer is None:
                name = 'kTableFunctionSpecifiedParameter'
                idx = 1  # this sizer is not actually materialized in the UDTF
                sizer = declbracket.Bracket(name, (idx,))

            assert sizer is not None
            ns_output_types = tuple([a.apply_namespace(ns='ExtArgumentType') for a in sig.outputs])
            ns_input_types = tuple([t.apply_namespace(ns='ExtArgumentType') for t in input_types_])
            ns_sql_types = tuple([t.apply_namespace(ns='ExtArgumentType') for t in sql_types_])

            sig.function_annotations.append(('uses_manager', str(uses_manager).lower()))

            input_types = 'std::vector<ExtArgumentType>{%s}' % (', '.join(map(util.tostring, ns_input_types)))
            output_types = 'std::vector<ExtArgumentType>{%s}' % (', '.join(map(util.tostring, ns_output_types)))
            sql_types = 'std::vector<ExtArgumentType>{%s}' % (', '.join(map(util.tostring, ns_sql_types)))
            annotations = format_annotations(input_annotations + sig.output_annotations + [sig.function_annotations])

            # Notice that input_types and sig.input_types, (and
            # similarly, input_annotations and sig.input_annotations)
            # have different lengths when the sizer argument is
            # Constant or TableFunctionSpecifiedParameter. That is,
            # input_types contains all the user-specified arguments
            # while sig.input_types contains all arguments of the
            # implementation of an UDTF.

            if must_emit_preflight_function(sig, sizer):
                fn_name = '%s_%s' % (sig.name, str(counter)) if is_template_function(sig) else sig.name
                check_fn = build_preflight_function(fn_name, sizer, input_types_, sig.outputs, uses_manager)
                cond_fns.append(check_fn)

            if is_template_function(sig):
                name = sig.name + '_' + str(counter)
                counter += 1
                t = build_template_function_call(name, sig.name, input_types_, sig.outputs, uses_manager)
                address_expression = ('avoid_opt_address(reinterpret_cast<void*>(%s))' % name)
                if is_cpu_function(sig):
                    cpu_template_functions.append(t)
                    cpu_function_address_expressions.append(address_expression)
                if is_gpu_function(sig):
                    gpu_template_functions.append(t)
                    gpu_function_address_expressions.append(address_expression)
                add = ('TableFunctionsFactory::add("%s", %s, %s, %s, %s, %s, /*is_runtime:*/false);'
                       % (name, sizer.format_sizer(), input_types, output_types, sql_types, annotations))
                add_stmts.append(add)

            else:
                add = ('TableFunctionsFactory::add("%s", %s, %s, %s, %s, %s, /*is_runtime:*/false);'
                       % (sig.name, sizer.format_sizer(), input_types, output_types, sql_types, annotations))
                add_stmts.append(add)
                address_expression = ('avoid_opt_address(reinterpret_cast<void*>(%s))' % sig.name)

                if is_cpu_function(sig):
                    cpu_function_address_expressions.append(address_expression)
                if is_gpu_function(sig):
                    gpu_function_address_expressions.append(address_expression)

    return add_stmts, cpu_template_functions, gpu_template_functions, cpu_function_address_expressions, gpu_function_address_expressions, cond_fns

def generate_TableFunctionsFactory_init.uses_manager (   sig)

Definition at line 281 of file generate_TableFunctionsFactory_init.py.

Referenced by table_functions::TableFunctionsFactory.add(), is_cpu_function(), is_gpu_function(), and com.mapd.parser.server.ExtensionFunctionSignatureParser.toSignature().

def uses_manager(sig):
    return sig.inputs and sig.inputs[0].name == 'TableFunctionManager'

Variable Documentation

list generate_TableFunctionsFactory_init.add_stmts = []

Definition at line 432 of file generate_TableFunctionsFactory_init.py.

tuple generate_TableFunctionsFactory_init.add_tf_generated_files
Initial value:
= linker.GenerateAddTableFunctionsFiles(dirname, stmts, header_file)

Definition at line 465 of file generate_TableFunctionsFactory_init.py.

list generate_TableFunctionsFactory_init.canonical_input_files = [input_file[input_file.find("/QueryEngine/") + 1:] for input_file in input_files]

Definition at line 439 of file generate_TableFunctionsFactory_init.py.

list generate_TableFunctionsFactory_init.cond_fns = []

Definition at line 437 of file generate_TableFunctionsFactory_init.py.

tuple generate_TableFunctionsFactory_init.content

Definition at line 486 of file generate_TableFunctionsFactory_init.py.

list generate_TableFunctionsFactory_init.cpu_address_expressions = []

Definition at line 435 of file generate_TableFunctionsFactory_init.py.

tuple generate_TableFunctionsFactory_init.cpu_generated_files
Initial value:
= linker.GenerateTemplateFiles(dirname, cpu_fns, header_file, 'cpu')

Definition at line 471 of file generate_TableFunctionsFactory_init.py.

tuple generate_TableFunctionsFactory_init.cpu_output_header = os.path.splitext(output_filename)

Definition at line 428 of file generate_TableFunctionsFactory_init.py.

list generate_TableFunctionsFactory_init.cpu_template_functions = []

Definition at line 433 of file generate_TableFunctionsFactory_init.py.

tuple generate_TableFunctionsFactory_init.dirname = os.path.dirname(output_filename)

Definition at line 442 of file generate_TableFunctionsFactory_init.py.

list generate_TableFunctionsFactory_init.gpu_address_expressions = []

Definition at line 436 of file generate_TableFunctionsFactory_init.py.

tuple generate_TableFunctionsFactory_init.gpu_generated_files
Initial value:
= linker.GenerateTemplateFiles(dirname, gpu_fns, header_file, 'gpu')

Definition at line 476 of file generate_TableFunctionsFactory_init.py.

tuple generate_TableFunctionsFactory_init.gpu_output_header = os.path.splitext(output_filename)

Definition at line 429 of file generate_TableFunctionsFactory_init.py.

list generate_TableFunctionsFactory_init.gpu_template_functions = []

Definition at line 434 of file generate_TableFunctionsFactory_init.py.

list generate_TableFunctionsFactory_init.header_file = ['#include "' + canonical_input_file + '"' for canonical_input_file in canonical_input_files]

Definition at line 440 of file generate_TableFunctionsFactory_init.py.
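
The effect of the canonical_input_files and header_file expressions can be seen on sample paths. Both paths below are hypothetical; the second shows what happens when "/QueryEngine/" is absent:

```python
input_files = [
    "/home/user/heavydb/QueryEngine/TableFunctions/Example.h",  # hypothetical path
    "Example.h",                                                # no /QueryEngine/ component
]

# find() returns the index of the leading slash, so "+ 1" drops that slash.
# When the marker is missing, find() returns -1 and the whole path is kept.
canonical_input_files = [input_file[input_file.find("/QueryEngine/") + 1:]
                         for input_file in input_files]
header_file = ['#include "' + canonical_input_file + '"'
               for canonical_input_file in canonical_input_files]

for line in header_file:
    print(line)
```

Despite its singular name, header_file is a list with one include directive per input file.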

list generate_TableFunctionsFactory_init.input_files = [os.path.join(os.path.dirname(__file__), 'test_udtf_signatures.hpp')]

Definition at line 419 of file generate_TableFunctionsFactory_init.py.

string generate_TableFunctionsFactory_init.separator = '$=>$'

Definition at line 75 of file generate_TableFunctionsFactory_init.py.

Referenced by foreign_storage::anonymous_namespace{AbstractFileStorageDataWrapper.cpp}.append_file_path(), ai.heavy.jdbc.HeavyAIEscapeFunctions.appendCall(), foreign_storage::AbstractFileStorageDataWrapper.getFullFilePath(), import_export.import_thread_shapefile(), and pop_n_rows_from_merged_heaps_gpu().