Advanced Topics in Computer Systems
Joe Hellerstein & Eric Brewer

Degrees of Consistency (a/k/a Isolation Levels)

Despite all the discussion of ACID, sometimes it's nice to sacrifice semantic guarantees for the sake of performance.  The goal is to let individual transactions choose this WITHOUT messing up the database or the other transactions that do care.

Gray, et al.: Degrees of Consistency

First, a definition: A write is committed when its transaction has finished; otherwise, the write is dirty.

A Locking-Based Description of Degrees of Consistency:

This is not actually a description of the degrees, but rather of how to achieve them via locking. But it’s better defined.

A Dirty-Data Description of Degrees of Consistency

Transaction T sees degree X consistency if...

Examples of Inconsistencies Prevented by Various Degrees

NOTE: if everybody is at least degree 1, then different transactions can CHOOSE what degree they wish to "see" without worry.  I.e., a single system can run a mixture of levels of consistency.

Oracle's Snapshot Isolation

A.K.A. "SERIALIZABLE" in Oracle. Orwellian!

Idea: Give each transaction a timestamp, and a "snapshot" of the DBMS at transaction begin. Then install their writes at commit time. Read-only transactions never block or get rolled back!
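A minimal sketch of the snapshot idea, assuming a toy in-memory store in which every committed write is kept together with its commit timestamp. The class and method names (VersionedStore, begin, read, commit) are hypothetical and chosen for illustration; this is not Oracle's implementation.

```python
class VersionedStore:
    """Toy multi-version store (hypothetical): old versions are kept, never overwritten."""

    def __init__(self, initial):
        self.clock = 0
        # key -> list of (commit_timestamp, value), oldest first
        self.versions = {key: [(0, value)] for key, value in initial.items()}

    def begin(self):
        # A transaction's snapshot is identified by the clock value at begin.
        return self.clock

    def read(self, key, snapshot_ts):
        # Return the newest version committed at or before the snapshot.
        for commit_ts, value in reversed(self.versions[key]):
            if commit_ts <= snapshot_ts:
                return value
        raise KeyError(key)

    def commit(self, writes):
        # Install all of a transaction's writes at commit time, under one timestamp.
        self.clock += 1
        for key, value in writes.items():
            self.versions.setdefault(key, []).append((self.clock, value))
        return self.clock


store = VersionedStore({"A": 10, "B": 20})
reader_snapshot = store.begin()            # a read-only transaction starts
store.commit({"A": 11})                    # a writer commits afterwards
old_a = store.read("A", reader_snapshot)   # reader still sees its snapshot
new_a = store.read("A", store.begin())     # a later transaction sees the new value
```

The reader never waits for the writer and never has to be rolled back, because the version it needs is still available.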

Technique: "archive on write". I.e. move old versions of tuples out of the way, but don't throw them out.

A snapshot isolation schedule:
T1: R(A0), R(B0),                          W(A1), C
T2:                R(A0), R(B0), W(B2), C
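Now, B2 = f(A0, B0); A1 = g(A0, B0). Is this schedule serializable? No: each transaction reads the version that the other one overwrites, so the schedule is equivalent to neither serial order. This is an example of "Write Skew"! There are subtler problems as well, see O'Neil paper.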
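To make the anomaly concrete, here is a small sketch with made-up choices for f and g: two accounts A and B, and each transaction withdraws 100 from its own account only if the combined balance covers it. The functions and starting balances are assumptions for illustration. The sketch also assumes the usual snapshot isolation rule that only transactions with overlapping write sets conflict at commit.

```python
def g(a, b):
    """T1's new value for A (illustrative): withdraw 100 if A + B covers it."""
    return a - 100 if a + b >= 100 else a


def f(a, b):
    """T2's new value for B (illustrative): withdraw 100 if A + B covers it."""
    return b - 100 if a + b >= 100 else b


def run_snapshot(a0, b0):
    # Both transactions read the same snapshot (A0, B0).
    new_a = g(a0, b0)
    new_b = f(a0, b0)
    # Write sets {A} and {B} are disjoint, so both commits succeed.
    return new_a, new_b


def run_serial_t1_then_t2(a0, b0):
    new_a = g(a0, b0)
    new_b = f(new_a, b0)   # T2 sees T1's write
    return new_a, new_b


def run_serial_t2_then_t1(a0, b0):
    new_b = f(a0, b0)
    new_a = g(a0, new_b)   # T1 sees T2's write
    return new_a, new_b


snapshot_result = run_snapshot(50, 50)
serial_results = {run_serial_t1_then_t2(50, 50), run_serial_t2_then_t1(50, 50)}
matches_some_serial_order = snapshot_result in serial_results
```

With these choices the snapshot execution overdraws both accounts, while either serial order lets only one withdrawal through.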

Still, despite IBM complaining that these anomalies mean the technique is "broken", Snapshot Isolation is popular and works "most of the time" (and certainly on leading benchmarks.)

Question: How do you extend Snapshot Isolation to give true serializable schedules? Cost?

Adya, et al. : Generalized Isolation Levels

Gray et al.'s definitions (and the resulting ANSI standards) are not implementation-independent, and their semantics are ill-defined.

Want implementation-independent isolation-level semantics that are as permissive as possible (i.e., that allow as many schedules as possible).

Key insight: many dependencies are multi-object (recall Snapshot Isolation examples!) Capture those, and you'll get the right semantics.

Conflicts in Adya's Serialization Graphs:

Direct Serialization Graph: Now we can talk about isolation in terms of serialization graphs and "histories" ("schedules"), NOT implementation.
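A rough sketch of checking a history through its graph, applied to the snapshot isolation schedule from earlier. It assumes the three usual kinds of conflict edge (write-write, write-read, read-write) and treats the history as a single-version list of operations in schedule order, which simplifies Adya's version-based definitions. The helper names build_graph and has_cycle are hypothetical.

```python
from collections import defaultdict


def build_graph(history):
    """Map each transaction to a set of (successor, edge_kind) pairs.

    edge_kind is "ww", "wr", or "rw": the earlier op followed by the later op.
    """
    edges = defaultdict(set)
    for i, (t1, op1, obj1) in enumerate(history):
        for t2, op2, obj2 in history[i + 1:]:
            # Two ops conflict if they come from different transactions,
            # touch the same object, and at least one of them is a write.
            if t1 != t2 and obj1 == obj2 and "w" in (op1, op2):
                edges[t1].add((t2, op1 + op2))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the transaction graph."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = defaultdict(int)

    def visit(node):
        color[node] = GREY
        for succ, _kind in edges.get(node, ()):
            if color[succ] == GREY:
                return True
            if color[succ] == WHITE and visit(succ):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in list(edges))


# The write-skew schedule: (transaction, operation, object) in schedule order.
skew_history = [
    ("T1", "r", "A"), ("T1", "r", "B"),
    ("T2", "r", "A"), ("T2", "r", "B"),
    ("T2", "w", "B"), ("T1", "w", "A"),
]
# The same transactions run one after the other.
serial_history = [
    ("T1", "r", "A"), ("T1", "r", "B"), ("T1", "w", "A"),
    ("T2", "r", "A"), ("T2", "r", "B"), ("T2", "w", "B"),
]

skew_graph = build_graph(skew_history)
skew_has_cycle = has_cycle(skew_graph)
serial_has_cycle = has_cycle(build_graph(serial_history))
```

The write-skew history yields two read-write edges, T1 to T2 on B and T2 to T1 on A, which form a cycle; the serial history yields edges in one direction only.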

Adya's Isolation Levels
Try to generalize Gray's degrees.  PL-x = "Portable Level x".

Modeling Mixed-Mode Systems