My research focuses broadly on data-oriented systems and the way they drive computing. Current projects include:
- BOOM and Bloom: Orders Of Magnitude simpler code for the Cloud.
- d^p ("deep"): Data to the People
More information on current and past research here.
- Dancing Calmly with the Devil, Keynote, ACM SoCC 2014. [pdf]
- Of Rocket Ships and Washing Machines: Data Technology for People, Keynote, Strata 2012. [video, 10:46]
- Keep CALM and Query On, RICON 2012, UCSD 2013, UCR 2013. [pdf] [video, 49:24].
- Consistency Analysis in Bloom: A CALM and Collected Approach, CIDR 2011. [.pptx], [.pdf]
- The Declarative Imperative: Experiences and Conjectures in Distributed Logic. Keynote, ACM PODS, 2010. [.key.zip], [pdf], [video]
- MAD Skills: New Practices for Big Data. VLDB, 2009. [pptx], [pdf]
- Quantitative Data Cleaning for Large Databases. Keynote, QDB, 2009. [.key.zip], [pdf]
- Bricolage: Data at Play. Keynote, ICDM 2007. [.key.zip] [.mov] [pdf]
- The Marvelous Structure of Reality. Keynote, WebDB 2003. [PDF], [.mov]
- Edelweiss: Automatic Storage Reclamation for Distributed Programming. With N. Conway, P. Alvaro and E. Andrews. VLDB 2014. [pdf]
- Blazes: Coordination Analysis for Distributed Programs. With P. Alvaro, N. Conway, and D. Maier. ICDE, 2014. [pdf]
- Highly Available Transactions: Virtues and Limitations. With P. Bailis, A. Davidson, A. Fekete, A. Ghodsi and I. Stoica. VLDB, 2014. [pdf]
- Quantifying Eventual Consistency with PBS. With P. Bailis, S. Venkataraman, M.J. Franklin and I. Stoica. VLDB
- Consistency Without Borders. With P. Alvaro, P. Bailis, and N. Conway. ACM SoCC, 2013. [pdf]
- Bolt-on Causal Consistency. With P. Bailis, A. Ghodsi and I. Stoica. SIGMOD, 2013. [pdf]
- Learning and Verifying Quantified Boolean Queries by Example. With A. Abouzied, D. Angluin, C. Papadimitriou and A. Silberschatz. PODS, 2013. [pdf]
- HAT, not CAP: Towards Highly Available Transactions. With P. Bailis, A. Fekete, A. Ghodsi and I. Stoica. HotOS, 2013. [pdf]
- Logic and Lattices for Distributed Programming. With W. R. Marczak, P. Alvaro, N. R. Conway, and D. Maier. SoCC, 2012. [pdf]
- The Potential Dangers of Causal Consistency and an Explicit Solution. With P. Bailis, A. Fekete, A. Ghodsi and I. Stoica. SoCC, 2012. [pdf]
- Enterprise Data Analysis and Visualization: An Interview Study. With S. Kandel, A. Paepcke and J. Heer. IEEE VAST, 2012. [pdf]
- DataPlay: Interactive Tweaking and Example-Driven Correction of Graphical Database Queries. With A. Abouzied and A. Silberschatz. UIST, 2012. [pdf]
- Confluence Analysis for Distributed Programs: A Model-Theoretic Approach. With W. R. Marczak, P. Alvaro, and N. R. Conway. Datalog 2.0, 2012.
- BloomUnit: Declarative Testing for Distributed Programs. With P. Alvaro, N. R. Conway, A. Hutchinson, and W. R. Marczak. DBTest, 2012.
- The MADlib Analytics Library, or MAD Skills, the SQL (with C. Re, F. Schoppmann, D. Z. Wang and many others). VLDB 2012.
- Distributed GraphLab: A Framework for Machine Learning in the Cloud (with Y. Low, J. Gonzalez, A. Kyrola, D. Bickson, and C. Guestrin). VLDB 2012. [pdf]
- Probabilistically Bounded Staleness for Practical Partial Quorums (with P. Bailis, S. Venkataraman, M. J. Franklin and I. Stoica). VLDB 2012. [pdf]
- Profiler: Integrated Statistical Analysis and Visualization for Data Profiling (with S. Kandel, R. Parikh, A. Paepcke and J. Heer). AVI 2012.
- Searching for Jim Gray: a technical overview. (with D. L. Tennenhouse on behalf of a large team of volunteers). Commun. ACM 54(7), 2011. [pdf]
- Wrangler: Interactive Visual Specification of Data Transformation Scripts (with S. Kandel, A. Paepcke, and J. Heer). CHI 2011. [PDF]
- Data in the First Mile (with K. Chen and T. Parikh). CIDR 2011 [PDF].
- Consistency Analysis in Bloom: a CALM and Collected Approach (with P. Alvaro, N. Conway, and W.R. Marczak). CIDR 2011. [PDF]
- The Declarative Imperative: Experiences and Conjectures in Distributed Logic. SIGMOD Record 39:1, Sep. 2010. [pdf]
- Declarative Networking (with B. T. Loo, T. Condie, M. Garofalakis, D. E. Gay, P. Maniatis, R. Ramakrishnan, T. Roscoe and I. Stoica). Research Highlights, CACM 52(11), 2009. [Intro by Peter Druschel] [pdf].
- Quantitative Data Cleaning for Large Databases. White paper, United Nations Economic Commission for Europe, February, 2008. [PDF]
- Architecture of a Database System. (with M. Stonebraker and J. Hamilton). Foundations and Trends in Databases 1(2). [PDF]
- Implementing Declarative Overlays. (with B. T. Loo, T. Condie, P. Maniatis, T. Roscoe, and I. Stoica). In 20th SOSP, 2005. [PDF]
- TinyDB: An Acquisitional Query Processing System for Sensor Networks. (with S. Madden, M. Franklin, and Wei Hong). ACM TODS. [PDF]
- Model-Driven Data Acquisition in Sensor Networks (with A. Deshpande, C. Guestrin, S. Madden and W. Hong). VLDB 2004
- TelegraphCQ: Continuous Dataflow Processing for an Uncertain World (with the Telegraph team). CIDR 03 [pdf]
- Commencement Address. Computer Science, College of Letters and Science, UC Berkeley, May 26, 2002. [pdf]
- On a Model of Indexability and its Bounds for Range
E. Koutsoupias, D. Miranker, C. Papadimitriou, and V. Samoladas). JACM 49(1) (2002). [pdf]
- Potter's Wheel: An Interactive Data Cleaning System
Raman). VLDB 2001. [PDF]
- Eddies: Continuously Adaptive Query Processing (with
SIGMOD 2000. [PDF]
- Interactive Data Analysis with CONTROL (with many
Computer, August 1999. [PDF]
- Generalized Search Trees for Database Systems (with J.
and A. Pfeffer.) VLDB 1995. [PS]
in Database Systems, Fourth Edition.
J. M. Hellerstein and M. Stonebraker, eds.
MIT Press, 2005.