Relational queries 101

Themes: Dataflow and Dataflow Operators

A quick primer on the relational algebra

Dataflow Architecture: Iterators

Basic question: how do you turn the algebraic operator abstraction into a programming construct?  Dataflow.

An OO view of iterators: An iterator is a superclass for all dataflow operators.  It has three methods:

The output type of next() is well typed.  In a SQL system, this is a SQL record in some schema, e.g. (integer, numeric(10,2), varchar, float).

Iterators are strung together into a dataflow by parameterizing them to have other iterators as inputs, and then having their methods invoke the input iterators' methods.  Usually parent.open() invokes child.open() and sometimes child.next().  Usually parent.next() invokes child.next() or nothing.  Usually parent.close() invokes child.close().

An iterator also has a number of private data members, including all the "input" iterators it invokes (usually one or two of these), and any state it maintains.  The types of the records from input iterators need to match the operator semantics in generating the output type.  In a DBMS, this is all set up automatically at query parsing/optimization time.  Extensible "optimizer generators" have been developed that let you declare these rules in a high-level way (not as optimizer code).

Dataflow Design Space

Typically, iterators in a single-site query processor make synchronous calls to their children.  This is a "pull" model, like sucking data through a straw.  For now, it's fine to think of iterators that way.

As we've seen with P2 and we'll see again with Click, in some contexts pull is not the best choice -- particularly if there's asynchrony in your system (e.g. unpredictable networks).  Some kind of buffering and/or rate-matching has to happen to keep operators in sync.

Typical DB Dataflow Operators

Unary Operations

Binary Operations: Union

Without duplicate elimination, a very simple iterator (SQL's "UNION ALL").  Only issue is that both inputs have to have identical schemas.

With duplicate elimination (SQL's plain "UNION"), you need to eliminate duplicates too.  Simple implementation: do UNION ALL as one iterator, and dup-elim as another.

Binary Operations: Join

Join algorithms apply to almost any form of combining multiple collections, except for UNION.

Some commonly used join variants (alternative logical algebra operators):

These logical algebra operators can be implemented as minor variations on the same join algorithms!

The "Guy Lohman Test" for join operators (a.k.a. pipelining of intermediate results):

The "Joe Hellerstein Test" for join operators (a.k.a. full pipelining):

Nested Loops Join

for each tuple r of R
    for each tuple s of S
       if rs satisfies join predicate
          output rs
R is the outer relation (left)

S is the inner relation (right)

Refinement 1: Block Nested Loops Join

for each block BR of R
    for each tuple s of S
        for each tuple r of BR such that rs satisfies join predicate
               output rs
Further refinements to nested loops:

Refinement 2: Index Nested Loops Join

for each tuple r of R
    probe index over S;
    output all s s.t. rs satisfies join predicate;
Notes:
        SELECT cities.name
          FROM cities, forests
         WHERE cities.boundary overlaps forests.boundary;

(Sort-)Merge Join

Works for equijoin, "band" joins

we will assume here you know how to do a 2-pass sort [see Knuth or Shapiro]

idea: if R & S are sorted on the join column(s), we can "simply" merge them

But duplicates complicate things: little embedded nested loop on dups.

Refinement: do merging of R & S while merging sort runs from each.

Note: Sort-merge is symmetric, so "outer", "inner", "left", "right" are arbitrary

Grace Hash Join

Phase 1 is repeated with S in place of R

   THEN 

Advantages:

Disadvantages:

Hybrid Hash

Original paper: DeWitt, Katz, Olken, Shapiro, Stonebraker, Wood, SIGMOD '84.

Phase 2 as in Grace Join

Hybrid Hash Advantages:

Disadvantages: Handling Partition Overflow: Additional Tricks: Filters

Idea: build a filter based on R so you stage less of S to disk

Parallel DB 101

Performance metrics:

2 kinds of parallelism

3 barriers to linearity:

3 basic architectures

data placement

note the history

The Exchange Operator: Parallelizing Iterators

Annoyance: "parallelizing" a query executor is a hassle.
Solution: encapsulate the parallelism in a query operator, not in the QP infrastructure.
    - another classic level-of-indirection scheme, this time for dataflows.

We want to enable partition and pipelined parallel dataflows over iterators -- including setup, teardown, and runtime logic -- in a clean encapsulated way.

The exchange operator: an operator you pop into any single-site dataflow graph as desired -- anonymous to the other operators.

Implementation:


Benefits of exchange:

  1. opaquely handles setup and teardown of clones (in an SMP...for shared-nothing, would need to have daemons at each site, and a protocol to request clone spawning)
  2. at the top of a local subplan, allows pipeline parallelism: turns iterator-based, unithreaded "pull" into network-based, cross-thread "push".
  3. at the top of a local subplan, allows decoupling of children's scheduling.
  4. inside a subplan, can mix pull and push to get the best of both


"Extensibility" features of Volcano and exchange: