Presto: Incorporate WorkProcessor in operators

Created on 18 Dec 2018 · 3 Comments · Source: prestodb/presto

Issue for effort to support:

  • cross-operator lazy pages (starting from source operators)
  • clean up/simplify the contract between operators via WorkProcessor pipelines
  • provide a base for further improvements (e.g. on-stack rows without Page materialization, Graal)

The advantage of cross-operator lazy pages is that we can avoid IO when queries are highly selective: columns that are never touched after filtering never need to be read. This requires that significant processing happens in the source stage, which is increasingly the case with improvements like CBO ("broadcast joins") or grouped execution.
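To make the IO saving concrete, here is a minimal sketch of the idea. `LazyColumn`, `LazyPageSketch` and the supplier-based loader are hypothetical names made up for this illustration; they are not Presto's actual lazy block classes. The payload column is loaded only when a row passes the filter, so a filter that matches nothing never triggers the (simulated) read.

```java
import java.util.function.Supplier;

// Hypothetical sketch: a column whose values are read only on first access.
final class LazyColumn {
    private final Supplier<long[]> loader;
    private long[] values; // stays null until the column is actually needed

    LazyColumn(Supplier<long[]> loader) {
        this.loader = loader;
    }

    long get(int position) {
        if (values == null) {
            values = loader.get(); // the (simulated) IO happens here, at most once
        }
        return values[position];
    }

    boolean isLoaded() {
        return values != null;
    }
}

final class LazyPageSketch {
    // Sums the payload column for rows whose key equals `wanted`.
    // The payload column is touched only if at least one row matches.
    static long sumMatching(long[] keys, LazyColumn payload, long wanted) {
        long sum = 0;
        for (int i = 0; i < keys.length; i++) {
            if (keys[i] == wanted) {
                sum += payload.get(i);
            }
        }
        return sum;
    }
}
```

With a highly selective predicate, most pages would take the path where `payload.get` is never called, which is where the avoided IO would come from.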

Stages are:

  • [x] Stage 1
    • base PageProcessor on WorkProcessor

  • [ ] Stage 2
    • internally base ScanFilterAndProject on WorkProcessor. The pipeline would look as follows:

      split singleton -> [flatMap] -> pages source
                      -> [transform] -> page processor
                      -> [transform] -> merge pages

      or, if the split is cursor based:

      split singleton -> [flatMap] -> cursor source -> [transform] -> merge pages

    • internally base FilterAndProject on WorkProcessor. The pipeline would look as follows:

      page buffer -> [transform] -> page processor -> [transform] -> [merge pages]

  • [ ] Stage 3
    • create an interface for operators that are based on WorkProcessor pipelines
    • create a standardized abstract operator class for operators that are internally based on WorkProcessor pipelines
    • combine operators that are based on WorkProcessors via a dedicated "gluing" operator
    • base TopNOperator on WorkProcessor pipelines (fast data exploration!)
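The Stage 2 scan pipeline can be sketched roughly as follows. This is only an analogy: `java.util.stream.Stream` stands in for WorkProcessor, a "page" is a plain list of integers, and `ScanPipelineSketch`, `pagesSource`, `processPage` and `mergePages` are hypothetical names chosen for the example. Unlike a real WorkProcessor, the merge step here runs eagerly and there is no yielding or blocking.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class ScanPipelineSketch {
    // Pages source: one split yields several small pages (simulated data).
    static Stream<List<Integer>> pagesSource(String split) {
        return Stream.of(List.of(1, 2), List.of(3, 4), List.of(5, 6));
    }

    // Page processor: filter (keep odd values) and project (multiply by 10).
    static List<Integer> processPage(List<Integer> page) {
        return page.stream()
                .filter(v -> v % 2 == 1)
                .map(v -> v * 10)
                .collect(Collectors.toList());
    }

    // Merge pages: coalesce small pages until at least minRows rows are buffered.
    static List<List<Integer>> mergePages(Iterator<List<Integer>> pages, int minRows) {
        List<List<Integer>> out = new ArrayList<>();
        List<Integer> buffer = new ArrayList<>();
        while (pages.hasNext()) {
            buffer.addAll(pages.next());
            if (buffer.size() >= minRows) {
                out.add(buffer);
                buffer = new ArrayList<>();
            }
        }
        if (!buffer.isEmpty()) {
            out.add(buffer); // flush the remainder
        }
        return out;
    }

    // split singleton -> [flatMap] -> pages source -> [transform] -> page processor -> merge pages
    static List<List<Integer>> run(String split, int minRows) {
        Iterator<List<Integer>> processed = Stream.of(split)
                .flatMap(ScanPipelineSketch::pagesSource)
                .map(ScanPipelineSketch::processPage)
                .iterator();
        return mergePages(processed, minRows);
    }
}
```

The point of the shape is that each step only knows about its upstream, so the same page processor and merge step could be reused for the FilterAndProject pipeline by swapping the source for a page buffer.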

Most helpful comment

Can you give details about the "Graal" plans?

WorkProcessor provides a transformation method, WorkProcessor#transform.
Suppose that you have a chain of Page transformations, e.g.:

WorkProcessor<Page> processor1 = ...;
WorkProcessor<Page> processor2 = processor1.transform(transformation1);
WorkProcessor<Page> processor3 = processor2.transform(transformation2);
...

One can observe that such a chain of Page transformations can be compiled into a single tight loop that doesn't materialize intermediate results. Please check out the paper http://www.vldb.org/pvldb/vol4/p539-neumann.pdf and the project https://hyper-db.de/.
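As a rough illustration of what "not materializing intermediate results" means, the sketch below contrasts two hand-written versions of the same filter-then-project chain over a `long[]`. The class and method names (`FusionSketch`, `materialized`, `fused`) and the concrete filter and projection are assumptions made for the example; in the proposal the fused loop would be generated, not written by hand.

```java
import java.util.Arrays;

final class FusionSketch {
    // Step-by-step version: the filter produces a full intermediate array
    // (the analog of an intermediate Page) before the projection runs.
    static long[] materialized(long[] input) {
        long[] filtered = Arrays.stream(input).filter(v -> v > 10).toArray();
        return Arrays.stream(filtered).map(v -> v * 2).toArray();
    }

    // Fused version: a single loop in which each row flows through both
    // transformations; no intermediate array is created.
    static long[] fused(long[] input) {
        long[] out = new long[input.length];
        int n = 0;
        for (long v : input) {
            if (v > 10) {          // transformation1: filter
                out[n++] = v * 2;  // transformation2: projection
            }
        }
        return Arrays.copyOf(out, n); // trim to the number of rows produced
    }
}
```

Both methods return the same rows; the difference is only in how much intermediate data exists between the two transformations.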

In order to generate such a tight loop, one can extend WorkProcessor#transform so that it generates optimized bytecode (using the existing airlift bytecode framework), e.g.:

static WorkProcessor<Page> transform(
  WorkProcessor<Page> processor,
  Transformation<Page, Page> transformation)
{
  ...
  if (transformation instanceof BytecodeRowTransformation) {
   // generate tight loop
  } else {
   // proceed with intermediate pages materialization
  }
}

interface BytecodeRowTransformation extends Transformation<Page, Page> {
  BytecodeExpression generateTransformation(BytecodeTransformationContext context);
}

interface BytecodeTransformationContext {
  ..
  // transformation result bytecode 
  BytecodeExpression needsMoreData();
  BytecodeExpression producedResult();
  ..
  // input row channels getter bytecode
  BytecodeExpression getChannel(int channel);
  BytecodeExpression isNull(int channel);
  ..
  // output row channel bytecode setters
  void defineChannel(int channel, Supplier<BytecodeExpression> definition);
  void defineIsNull(int channel, Supplier<BytecodeExpression> definition);
  ..
}

BytecodeRowTransformation#generateTransformation would generate the bytecode of the transformation, using BytecodeTransformationContext to consume input and produce output within the generated code.
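The contract can be illustrated without any bytecode by swapping BytecodeExpression for plain `LongSupplier` closures. Everything in the sketch below (`RowContextSketch`, `Context`, `RowTransformation`, `run`) is a hypothetical, interpreted analog written for this example; null handling and the needsMoreData/producedResult signalling are left out. The transformation is wired up once against the context, and the driver loop then evaluates the defined output channels row by row.

```java
import java.util.function.LongSupplier;

final class RowContextSketch {
    // Interpreted analog of the transformation context:
    // LongSupplier plays the role of BytecodeExpression.
    static final class Context {
        private final long[][] input;          // input[channel][position]
        private final LongSupplier[] outputs;  // one definition per output channel
        private int position;                  // current row, advanced by the driver loop

        Context(long[][] input, int outputChannels) {
            this.input = input;
            this.outputs = new LongSupplier[outputChannels];
        }

        // Input channel getter: reads the current row when evaluated.
        LongSupplier getChannel(int channel) {
            return () -> input[channel][position];
        }

        // Output channel setter: records how to compute the channel.
        void defineChannel(int channel, LongSupplier definition) {
            outputs[channel] = definition;
        }
    }

    interface RowTransformation {
        void generateTransformation(Context context);
    }

    static long[][] run(long[][] input, int rows, int outputChannels, RowTransformation transformation) {
        Context context = new Context(input, outputChannels);
        transformation.generateTransformation(context); // wire expressions once
        long[][] out = new long[outputChannels][rows];
        for (context.position = 0; context.position < rows; context.position++) {
            for (int c = 0; c < outputChannels; c++) {
                out[c][context.position] = context.outputs[c].getAsLong();
            }
        }
        return out;
    }
}
```

A transformation that adds two input channels would then be `ctx -> { LongSupplier a = ctx.getChannel(0); LongSupplier b = ctx.getChannel(1); ctx.defineChannel(0, () -> a.getAsLong() + b.getAsLong()); }`. A bytecode or Truffle based version would keep this shape but emit code instead of closures.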

However, generating bytecode is cumbersome and error-prone. Truffle/Graal provides a nice abstraction for creating highly performant interpreters, which we could also use to write maintainable and readable WorkProcessor transformations (tutorial on using Truffle: http://cesquivias.github.io/blog/2014/12/02/writing-a-language-in-truffle-part-2-using-truffle-and-graal/). In that case we would not use BytecodeExpression but much friendlier classes and annotations mixed with normal type-safe Java code, e.g.:

interface TruffleRowTransformation extends Transformation<Page, Page> {
  TruffleNode generateTransformation(TruffleTransformationContext context);
}

interface TruffleTransformationContext {
  ..
  // similar methods as in BytecodeTransformationContext, but using truffle node classes
}

Some notes:

  1. WorkProcessor transformations are functional, so one could actually create a language interpreter for them, e.g.:

     transform(
       transform(
         processor,
         context -> python transformation),
       context -> java transformation)

  2. The Truffle/Graal and WorkProcessor abstractions enable us to use other languages for transformations (e.g. Python). For instance, we could implement table functions that are written in non-Java languages but are JITed into a tight loop together with Java code.

This is just a draft and I still need to play more with Truffle/Graal in order to obtain more details.

All 3 comments

provide base for further improvements (e.g: on stack rows without Page materialization, Graal)

Can you give details about the "Graal" plans?

This issue has been automatically marked as stale because it has not had any activity in the last 2 years. If you feel that this issue is important, just comment and the stale tag will be removed; otherwise it will be closed in 7 days. This is an attempt to ensure that our open issues remain valuable and relevant so that we can keep track of what needs to be done and prioritize the right things.
