Stop me if you think you’ve heard this one before: fixed-point iteration and beyond in Naiad - Talk by Derek Murray

Date

Speaker: Derek Murray, Microsoft Research Silicon Valley lab

Host: Andrew Warfield

Title: Stop me if you think you’ve heard this one before: fixed-point iteration and beyond in Naiad

Abstract:
We are developing a new system for large-scale data analysis -- called "Naiad" -- which has the goal of supporting complex iterative queries over dynamic inputs at interactive timescales. Like many existing systems, Naiad supports high-level declarative queries, data-parallel execution, and transparent distribution. Unlike these systems, Naiad can efficiently execute queries with multiple (possibly nested) iterative loops, while simultaneously supporting low-latency incremental changes to the query inputs. To achieve this, Naiad generalizes traditional incremental dataflow to admit collections that vary in multiple independent dimensions, each corresponding to a distinct "reason" for which the collection may have changed. This flexibility allows far greater re-use of previous work when collections may change for multiple reasons, such as external stimuli and internal feedback.

This is a talk in three parts. First, I will introduce "differential dataflow", which is the new computational framework that enables Naiad to compute iterations and incremental updates efficiently. I will go on to discuss how we have implemented Naiad as a decentralized distributed system, and how this lets the system scale even when the amount of work per increment is small. Finally, I will give a demonstration of how Naiad can be used to perform complex analytics interactively on a real-world social networking dataset.

Naiad is joint work with Frank McSherry, Rebecca Isaacs, Michael Isard and Martín Abadi. For more details, see the project web page: https://www.microsoft.com/en-us/research/project/naiad/ and our blog on big data research: https://bigdataatsvc.wordpress.com/

Bio:
Derek Murray is a postdoc at the MSR Silicon Valley lab, where he pursues research interests in large-scale distributed and parallel computing. Prior to joining Microsoft in fall 2011, Derek was a PhD student in the Networks and Operating Systems group at the University of Cambridge, where he developed the CIEL execution engine and worked on various projects related to OS virtualization.
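
Illustrative sketch (not from the talk): the abstract describes collections that vary in multiple independent dimensions, one per "reason" for change. The Python below is a simplified illustration of that idea, not Naiad's implementation. It assumes two dimensions, an input epoch and a loop iteration, and stores only the differences at each (epoch, iteration) time. The names `leq`, `collection_at`, and `diffs`, and the sample records, are hypothetical.

```python
from collections import Counter

def leq(s, t):
    """Product partial order on (epoch, iteration) pairs."""
    return s[0] <= t[0] and s[1] <= t[1]

def collection_at(diffs, t):
    """Rebuild the collection at time t by summing every difference at a time s <= t."""
    acc = Counter()
    for s, delta in diffs.items():
        if leq(s, t):
            for record, weight in delta.items():
                acc[record] += weight
    # Records whose weights cancel out are not part of the collection.
    return {r: w for r, w in acc.items() if w != 0}

# Hypothetical differences: keys are (epoch, iteration) times, values are signed record counts.
diffs = {
    (0, 0): {"a": 1, "b": 1},   # initial input
    (0, 1): {"c": 1},           # change caused by one loop iteration
    (1, 0): {"b": -1, "d": 1},  # change caused by a new input epoch
}

# No difference is stored at (1, 1); the collection there is recovered from earlier work.
print(collection_at(diffs, (1, 1)))
```

In this toy example the collection at time (1, 1) combines the iteration change and the input change without any difference being recorded at (1, 1) itself, which is the kind of re-use of previous work the abstract refers to.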
