Stop me if you think you’ve heard this one before: fixed-point iteration and beyond in Naiad - Talk by Derek Murray

Date

Speaker: Derek Murray, Microsoft Research Silicon Valley lab

Host: Andrew Warfield

Title: Stop me if you think you’ve heard this one before: fixed-point iteration and beyond in Naiad

Abstract:
We are developing a new system for large-scale data analysis -- called "Naiad" -- which has the goal of supporting complex iterative queries over dynamic inputs at interactive timescales. Like many existing systems, Naiad supports high-level declarative queries, data-parallel execution, and transparent distribution. Unlike these systems, Naiad can efficiently execute queries with multiple (possibly nested) iterative loops, while simultaneously supporting low-latency incremental changes to the query inputs. To achieve this, Naiad generalizes traditional incremental dataflow to admit collections that vary in multiple independent dimensions, each corresponding to a distinct "reason" for which the collection may have changed. This flexibility allows far greater re-use of previous work when collections may change for multiple reasons, such as external stimuli and internal feedback.

This is a talk in three parts. First, I will introduce "differential dataflow", which is the new computational framework that enables Naiad to compute iterations and incremental updates efficiently. I will go on to discuss how we have implemented Naiad as a decentralized distributed system, and how this lets the system scale even when the amount of work per increment is small. Finally, I will give a demonstration of how Naiad can be used to perform complex analytics interactively on a real-world social networking dataset.

Naiad is joint work with Frank McSherry, Rebecca Isaacs, Michael Isard and Martín Abadi. For more details, see the project web page: https://www.microsoft.com/en-us/research/project/naiad/ and our blog on big data research: https://bigdataatsvc.wordpress.com/

Bio:

Derek Murray is a postdoc at the MSR Silicon Valley lab, where he pursues research interests in large-scale distributed and parallel computing. Prior to joining Microsoft in fall 2011, Derek was a PhD student in the Networks and Operating Systems group at the University of Cambridge, where he developed the CIEL execution engine and worked on various projects related to OS virtualization.