Analyses on large volumes of data are being used to improve the effectiveness of advertising, fight cancer, and solve crimes. Many organizations are scrambling to take better advantage of an emerging set of tools for large-scale data processing, and there is a huge emerging market around providing both applications and infrastructure for working with data at scale.
In this course, we will study the systems challenges that surround large-scale data collection, analysis, transformation, and access.
The premise of our discussion will be that, while there is a growing set of off-the-shelf tools available to build large-scale information systems, these tools often achieve scale and (some amount of) ease of use at the expense of efficiency. As an example, Yahoo!'s record in this year's Gray Sort Benchmark sorts an astonishing 1.42TB of data per minute. It achieves this using roughly 2,100 servers, each with 12 spinning disks. Spread across those roughly 25,000 disks, that is only about 1 MB/s of sorted data per disk: the underlying system is achieving less than 1% of the disk bandwidth that the hardware should be capable of.
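The arithmetic behind that efficiency claim can be checked with a back-of-the-envelope sketch. The sort rate, server count, and disks per server come from the paragraph above. The decimal-terabyte convention and the nominal 100 MB/s of sequential bandwidth per spinning disk are assumptions made here for illustration, not figures from the benchmark report, and the function names are hypothetical.

```python
# Back-of-the-envelope check of the per-disk bandwidth in the sort example.
# ASSUMPTIONS (not from the benchmark report): 1 TB = 10**12 bytes, and a
# nominal 100 MB/s of sequential bandwidth per spinning disk.

SORT_RATE_TB_PER_MIN = 1.42     # from the text
SERVERS = 2100                  # from the text
DISKS_PER_SERVER = 12           # from the text
ASSUMED_DISK_MB_PER_S = 100.0   # assumption, for illustration only


def per_disk_mb_per_s(tb_per_min, servers, disks_per_server):
    """Sorted data per second, divided evenly over every disk, in MB/s."""
    total_bytes_per_s = tb_per_min * 1e12 / 60
    return total_bytes_per_s / (servers * disks_per_server) / 1e6


def utilization(per_disk_mb, nominal_mb):
    """Fraction of the assumed nominal disk bandwidth actually used."""
    return per_disk_mb / nominal_mb


if __name__ == "__main__":
    rate = per_disk_mb_per_s(SORT_RATE_TB_PER_MIN, SERVERS, DISKS_PER_SERVER)
    share = utilization(rate, ASSUMED_DISK_MB_PER_S)
    print(f"{rate:.2f} MB/s of sorted data per disk")
    print(f"{share:.2%} of the assumed {ASSUMED_DISK_MB_PER_S:.0f} MB/s")
```

This counts only sorted output. A sort has to read and write every byte at least once, so the real disk traffic is some small multiple of this figure, and the result also shifts with the assumed per-disk bandwidth. Either way, the estimate stays within a few percent of what the disks could deliver.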
These software systems are deeply layered, distributed, and complex. Further, the hardware that is deployed in the datacenter has undergone very significant changes over the past ten years, and does not necessarily match the assumptions with which many existing systems are designed. We will start and end the course by examining some example large-scale analytics platforms and we will spend the body of the course exploring aspects of low-level system design in order to gain a better understanding of the challenges involved in computing at very large scales.
The aim of this course is to make you think about large complex software systems from top to bottom, and to take a whole-system approach to understanding performance and efficiency at scale.
The course will be heavily discussion-focused. You should expect to participate in discussion during every single lecture.
Mon, Sept 9 | Bootstrap. How to Read a Paper. Srinivasan Keshav. Writing reviews for systems conferences. Timothy Roscoe. |
Wed, Sept 11 | Distributed Compute. MapReduce: Simplified Data Processing on Large Clusters. Jeffrey Dean and Sanjay Ghemawat. (OSDI'04) Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing. Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauley, Michael J. Franklin, Scott Shenker, Ion Stoica. (NSDI'12) |
Mon, Sept 16 | Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks. Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell, and Dennis Fetterly. (EuroSys'07) TritonSort: A Balanced Large-Scale Sorting System. Alexander Rasmussen, George Porter, Michael Conley, Harsha V. Madhyastha, Radhika Niranjan Mysore, Alexander Pucher, Amin Vahdat. (NSDI'11) Mihir Nanavati covering for Andy. |
Wed, Sept 18 | All university classes cancelled. |
Mon, Sept 23 | Storage at Scale. FAWN: A Fast Array of Wimpy Nodes. David Andersen, Jason Franklin, Michael Kaminsky, Amar Phanishayee, Lawrence Tan, Vijay Vasudevan. (SOSP'09) Fast crash recovery in RAMCloud. Diego Ongaro, Stephen M. Rumble, Ryan Stutsman, John Ousterhout, and Mendel Rosenblum. (SOSP'11) |
Wed, Sept 25 | Windows Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency. Brad Calder et al. (SOSP'11) Finding a needle in Haystack: Facebook’s photo storage. Doug Beaver, Sanjeev Kumar, Harry C. Li, Jason Sobel, Peter Vajgel. (OSDI'10) |
Mon, Sept 30 | The Google File System. Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. (SOSP'03) Spanner: Google's Globally-Distributed Database. James C. Corbett et al. (OSDI'12) |
Wed, Oct 2 | Ghosts of Storage Past. Petal: Distributed virtual disks. Edward K. Lee and Chandramohan A. Thekkath. (ASPLOS'96) Frangipani: a scalable distributed file system. Chandramohan A. Thekkath, Timothy Mann, Edward K. Lee. (SOSP'97) |
Mon, Oct 7 | The design and implementation of a log-structured file system. Mendel Rosenblum and John K. Ousterhout. Overview of the Spiralog file system. James E. Johnson and William A. Laing. |
Wed, Oct 9 | Solid State. CORFU: A Shared Log Design for Flash Clusters. Mahesh Balakrishnan, Dahlia Malkhi, Vijayan Prabhakaran, Ted Wobber, Michael Wei, and John D. Davis. (NSDI'12) Tango: Distributed Data Structures over a Shared Log. Mahesh Balakrishnan et al. (SOSP'13) |
Mon, Oct 14 | Thanksgiving. |
Wed, Oct 16 | Gecko: Contention-Oblivious Disk Arrays for Cloud Storage. Ji Yong Shin, Mahesh Balakrishnan, Tudor Marian, Hakim Weatherspoon. Linux Block IO: Introducing Multi-queue SSD Access on Multi-core Systems. Matias Bjørling, Jens Axboe, David Nellans, Philippe Bonnet. (SYSTOR'13) |
Mon, Oct 21 | Datacenter Networks. PortLand: A Scalable Fault-Tolerant Layer 2 Data Center Network Fabric. Radhika Niranjan Mysore, Andreas Pamboris, Nathan Farrington, Nelson Huang, Pardis Miri, Sivasankar Radhakrishnan, Vikram Subramanya, and Amin Vahdat. (SIGCOMM'09) Integrating Microsecond Circuit Switching into the Data Center. George Porter, Richard Strong, Nathan Farrington, Alex Forencich, Pang Chen-Sun, Tajana Rosing, Yeshaiahu Fainman, George Papen, Amin Vahdat. (SIGCOMM'13) Jake Wires covering for Andy. |
Wed, Oct 23 | ServerSwitch: A Programmable and High Performance Platform for Data Center Networks. Guohan Lu, Chuanxiong Guo, Yulong Li, Zhiqiang Zhou, Tong Yuan, Haitao Wu, Yongqiang Xiong, Rui Gao, and Yongguang Zhang. (NSDI'11) Flat Datacenter Storage. Edmund B. Nightingale, Jeremy Elson, Jinliang Fan, Owen Hofmann, Jon Howell, and Yutaka Suzue. (OSDI'12) |
Mon, Oct 28 | IOFlow: A Software-Defined Storage Architecture. Eno Thereska, Hitesh Ballani, Greg O'Shea, Thomas Karagiannis, Antony Rowstron, Tom Talpey, Richard Black, Timothy Zhu. Measurement and Analysis of TCP Throughput Collapse in Cluster-based Storage Systems. Amar Phanishayee, Elie Krevat, Vijay Vasudevan, David G. Andersen, Gregory R. Ganger, Garth A. Gibson, Srinivasan Seshan. (FAST'08) |
Wed, Oct 30 | Consistency in the Small. Soft Updates: A Technique for Eliminating Most Synchronous Writes in the Fast Filesystem. Marshall Kirk McKusick, Gregory R. Ganger. (USENIX ATC'99) Generalized File System Dependencies. Christopher Frost, Mike Mammarella, Eddie Kohler, Andrew de los Reyes, Shant Hovsepian, Andrew Matsuoka, and Lei Zhang. (SOSP'07) |
Mon, Nov 4 | Recon: Verifying File System Consistency at Runtime. Daniel Fryer, Kuei Sun, Rahat Mahmood, TingHao Cheng, Shaun Benjamin, Ashvin Goel, and Angela Demke Brown. (FAST'12) Optimistic Crash Consistency. Vijay Chidambaram, Thanumalayan Sankaranarayana Pillai, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau. (SOSP'13) |
Wed, Nov 6 | Consistency in the Large (or not). When reading these papers, you should also consider the use of chain replication in FAWN and the Google Spanner paper that we read earlier in the course. Dynamo: Amazon’s Highly Available Key-value Store. Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels. (SOSP'07) Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS. Wyatt Lloyd, Michael J. Freedman, Michael Kaminsky, David G. Andersen. (SOSP'11) |
Mon, Nov 11 | Remembrance Day. |
Wed, Nov 13 | Slack space for the moment, to be filled depending on which area people seem to have the most interest in. Failing that, we'll probably do Ceph and Ursa Minor on this day. |
Mon, Nov 18 | Distributed Compute Revisited. Naiad: A Timely Dataflow System. Derek G. Murray, Frank McSherry, Rebecca Isaacs, Michael Isard, Paul Barham, Martin Abadi. (SOSP'13) Discretized Streams: Fault-Tolerant Streaming Computation at Scale. Matei Zaharia, Tathagata Das, Haoyuan Li, Timothy Hunter, Scott Shenker, Ion Stoica. (SOSP'13) |
Wed, Nov 20 | Rhea: Automatic Filtering for Unstructured Cloud Storage. (NSDI'13) Robustness in the Salus Scalable Block Store. Yang Wang, Manos Kapritsos, Zuocheng Ren, Prince Mahajan, Jeevitha Kirubanandam, Lorenzo Alvisi, and Mike Dahlin. (NSDI'13) |
Mon, Nov 25 | Demos |
Wed, Nov 27 | Demos |
538W is a seminar-style course that also happens to qualify as a systems breadth requirement for UBC graduate students. Accordingly, you will need to do two things to do well in the course: (1) participate in seminar discussions, and (2) build an interesting system. If you are unable to contribute constructively to a classroom discussion with 10-20 of your peers, or you are uncomfortable spending quite a bit of time over the next four months building and presenting a nontrivial software project, this probably isn't the course for you! If you wish to hedge against the commitment of a large systems implementation project, you may write a brief (200 words or less) but incredibly good and readable article that articulates the problems of dealing with large amounts of data within some specific domain. This may involve a combination of summarizing relevant publications and interviewing domain experts.
Marks are allocated as follows:
Reviews: You will be responsible for submitting conference program committee-style reviews to a local instance of hotcrp by 8pm on the evening before the papers are discussed in class. Once you have submitted your review, you will be able to read any reviews that have been submitted by your peers.
Project write-ups are to be brief, five-page descriptions of the problem that your system is trying to solve, the approach to implementation, a set of results in applying or evaluating the system, and a summary of relevant related work.
HOTCRP: Please create an account for yourself on the hotcrp server, which is at http://hotcrp.nss.cs.ubc.ca/538W/. This is the system where you are responsible for entering reviews of the papers that we discuss in class.