Distributed Systems
CPSC 416, Winter 2018
Mon/Wed/Fri 3-4:00PM, HDP 310, UBC course page
Course piazza
Office hours:
Matthew: Mon 16:00-17:00 (X151)
Renato: Tue 15:00-16:00 (X151)
Gleb: Wed 13:00-14:00 (X151)
Anna: Thu 13:00-14:00 (X239)
Ivan: Fri 10:00-11:00 (ICICS 327)
This course has completed. You may be looking for CPSC 416 2018 W1.
Course description
Leslie Lamport, a computer scientist who won the 2013 ACM Turing Award,
gave the following definition of a distributed system:
"A distributed system is one in which the failure of a computer you
didn't even know existed can render your own computer unusable."
Yet, distribution provides numerous benefits. A system becomes more
fault tolerant if it has fewer single points of failure and no
centralized components. Extending the system with more physical nodes
improves performance and makes the system more scalable, capable of
handling more load. Distribution can also reduce latency: geographic
diversity places resources closer to the clients who use the system.
Achieving these benefits is not easy. As the quote above
illustrates, distributed systems can fail in complex ways and these
systems are more difficult to build, test, and understand than
centralized systems.
This course will introduce you to a broad range of topics in
distributed systems. The tentative topics are listed in the schedule
below. For the most part this will be a lecture-style course. However,
distributed system concepts are notoriously challenging to internalize
without first-hand experience. The emphasis of this course, therefore,
will be on building distributed system prototypes, small and
large.
Course pre-requisites: CPSC 317 (networks) and CPSC 313
(computer hardware and operating systems).
Course staff: Ivan Beschastnikh (Instructor), Renato Costa (TA), Matthew Do (TA), Gleb Naumenko (TA), Anna Zheltukhina (TA).
Go programming language
In this course we will exclusively use
the Go programming language for all
assignments. Learning a new programming language is an important
skill. You will practice it in this course. For the most part I will
expect that you learn this language on your own.
Amanda and Stewart led an in-class Go tutorial in the Winter 2017
version of the course. Here is the recorded version: part 1 and part 2.
Textbooks
There are three optional books for this course:
- Go Programming Language
- Programming in Go
- Distributed Systems: Principles and Paradigms (2nd Edition)
Although there are many tutorials introducing Go and the online Go
documentation is well developed, some of you may find the first two
books on the list helpful for a step-by-step introduction to Go.
Communication
Use the course Piazza for all course-related communication. Piazza
also supports private posts that you can use to communicate with the
instructor and the TAs.
Course-level learning goals
The course will provide an opportunity for participants to
- understand key principles in designing and implementing distributed systems
- reason about problems that involve distributed components
- become familiar with important techniques for solving problems that arise in distributed contexts
- build distributed system prototypes using the Go programming language
Schedule (a work in progress)
Jan 3 Wed | Introduction and course overview [slides]. Read through Go resources prior to class, and practice as much Go as you can.
Jan 5 Fri | Assignment 1 overview and networking 1/2 [slides]
Jan 8 Mon | Networking 1/2 continued: network stack, routing [slides]
Jan 10 Wed | Networking 2/2 continued: fate sharing, e2e arguments, start of RPC [slides]
Jan 12 Fri | RPC [slides]
Jan 15 Mon | Assignment 1 due; Assignment 2 overview
Jan 17 Wed | Assignment 2 solution sketch and edge-cases review
Jan 19 Fri | Distributed file systems overview: NFS and AFS [slides]
Jan 22 Mon | Client-side caching, caching in NFS [slides]
Jan 24 Wed | Caching in AFS, dist. FS semantics (e.g., session semantics) [slides]
Jan 26 Fri | Distributed P2P ledger: Bitcoin [See last year's notes]
Jan 29 Mon | Project 1 posted [in-class notes]; Assignment 2 due
Jan 31 Wed | Peer to peer systems [slides]
Feb 2 Fri | Time synchronization [slides]
Feb 5 Mon | Logical time [Lamport and vector clocks] [slides]
Feb 7 Wed | Distributed mutual exclusion [slides]
Feb 9 Fri | Fault Tolerance, local faults [slides]
Feb 12 Mon | No class (Family Day); no office hours
Feb 14 Wed | Fault Tolerance, local faults (continued) [slides]
Feb 16 Fri | RAID [slides]; Project 1 due
Feb 19 Mon | No class (UBC reading break); no office hours; Project 1 demos/marking Feb 19-23; Project 2 released
Feb 21 Wed | No class (UBC reading break); no office hours; Project 1 demos/marking Feb 19-23
Feb 23 Fri | No class (UBC reading break); no office hours; Project 1 demos/marking Feb 19-23
Feb 26 Mon | RAID, continued [slides]
Feb 28 Wed | Primary backup replication [slides]
Mar 2 Fri | Transactions, part 1: ACID semantics and 2-phase locking [slides]; Project 2 proposal drafts due
Mar 5 Mon | Transactions continued, part 2: logging [slides]
Mar 7 Wed | Transactions continued, part 3: more logging [slides]
Mar 9 Fri | Two phase commit (2PC) [slides]; Project 2 final proposals due
Mar 12 Mon | 2PC in other topologies [slides]
Mar 14 Wed | Three phase commit (3PC) [slides1, slides2]
Mar 16 Fri | Quorum replication; Paxos protocol 1/3 [slides]
Mar 19 Mon | Quorum replication; Paxos protocol 2/3
Mar 21 Wed | Quorum replication; Paxos protocol 3/3
Mar 23 Fri | Content Distribution Networks (CDNs) [slides]; Project 2 group meetings with designated TA
Mar 26 Mon | CAP theorem [slides]
Mar 28 Wed | Studying distributed systems with Dinv. Talk by Stewart Grant. [slides]
Mar 30 Fri | No class (Good Friday); no office hours
Apr 2 Mon | No class (Easter); no office hours
Apr 4 Wed | Cross-Cloud: What worked, what failed and lessons learned. Talk by Diego Casati. [slides, details]
Apr 6 Fri | Distributed systems design considerations [slides]; Project 2 code and final reports due; Last day of class
Apr 9-20 | Project 2 demos/marking
Apr 16 | Final exam at 8:30 AM. Room TBD.
Go resources
Go is a systems language designed at
Google. It is especially well suited to building distributed
systems. As with any language, the fastest way to become proficient
at Go is to put in the time writing programs in Go. Here are some
resources to get you started:
We will be using Go version 1.9.2 (the most recent version).
Assignments
There are two assignments. All assignments must be completed in Go and
you must work on them individually.
Solutions must be submitted using the stash server by 11:59PM on the
day of the deadline. Special instructions for compiling/running the
code should be included as a README.txt file.
Assignments will be primarily marked based on functionality. Some
partial marks will be given to assignments that partially fulfill the
specifications, but this is done at our discretion. It is therefore in
your best interest to submit a complete solution. We also encourage
you to properly document and gofmt your code.
To access the hand-in git repository for assignment X as a
student with undergrad userid UID, run the following command:
git clone https://stash.ugrad.cs.ubc.ca:8443/git/CS416_2017W2_/asX_UID.git
Add your solution (and don't forget to push!) to the repository by the
deadline.
Assignment deadlines are listed in the schedule above and
below. Assignment descriptions will be linked to from this page once
they are available.
Project 1
Project 1 is a larger assignment that must be done in a group of 4
students and must be deployed on Azure.
Deliverables
All project 1 deliverables are due at 11:59PM on their respective dates.
- Implementation. We expect your repository to include a
detailed README file that explains the design of your
implementation.
- Project demo: a 20-minute private demo of your project to
the instructor/group TA, including a technical Q/A regarding the
project design and implementation.
- The stash project repositories will be frozen after you submit
your code. You must use this frozen code to demo your project.
Project 2
Project 2 is an open-ended project
that must be done in a team of 3-5 people and must be (at least
partially) deployed on Azure.
Deliverables
All project 2 deliverables are due at 11:59PM on their respective dates.
- Project proposal: a paper detailing the problem, your
proposed approach/solution, a realistic timeline for your team,
and a SWOT analysis for your team.
- Prototype implementation: must involve substantial
development effort. The prototype git repository must be shared
with the course staff.
- Project report: a paper detailing the problem, your
approach/solution, design of your prototype, and an evaluation of
the prototype.
- Project demo: a TBD-minute private demo of your project to
the instructor/group TA, including a technical Q/A regarding the
project design and implementation.
Exam
To practice for the exam we will go over 1-3 questions at the start of
each class. You can also download the complete set of
practice questions we have covered thus far (updated
continuously).
Final exam: Monday, Apr 16 at 8:30 AM. Room TBD.
Grading
The final course mark will be based on:
- Assignment 1: 5% (+2% extra credit)
- Assignment 2: 20%
- Project 1: 20%
  - Code: 10%
  - Demo: 10%
  - Peer review multiplier
- Project 2: 35%
  - Proposal: 10%
  - Report and code: 15%
  - Demo: 10%
  - Peer review multiplier
- Final exam: 20%
Note that the assignments are individual efforts, while the two
projects must be team efforts.
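As a rough illustration of how the weights above combine, here is a minimal Go sketch of a weighted-mark calculation. It is not the official marking procedure: the component names are hypothetical keys, marks are assumed to be on a 0-100 scale, and the extra credit and the peer review multiplier are left out because this page does not define how they are applied.

```go
package main

import "fmt"

// component pairs a graded item with its weight, in percent of the final mark.
type component struct {
	name   string  // hypothetical key, used only in this sketch
	weight float64 // percent of the final course mark
}

// weights mirrors the breakdown listed above and sums to 100.
var weights = []component{
	{"assignment1", 5},
	{"assignment2", 20},
	{"project1-code", 10},
	{"project1-demo", 10},
	{"project2-proposal", 10},
	{"project2-report-code", 15},
	{"project2-demo", 10},
	{"final-exam", 20},
}

// finalMark returns the weighted course mark (0-100) from per-component
// marks given on a 0-100 scale. Components missing from the map count as 0.
// Assumption: extra credit and the peer review multiplier are ignored.
func finalMark(marks map[string]float64) float64 {
	total := 0.0
	for _, c := range weights {
		total += c.weight * marks[c.name]
	}
	return total / 100
}

func main() {
	// Made-up marks, for illustration only.
	marks := map[string]float64{
		"assignment1":          100,
		"assignment2":          80,
		"project1-code":        90,
		"project1-demo":        70,
		"project2-proposal":    80,
		"project2-report-code": 60,
		"project2-demo":        90,
		"final-exam":           75,
	}
	fmt.Printf("final mark: %.1f\n", finalMark(marks))
}
```

Keeping the weights in a single table makes it easy to check that they sum to 100 and to see how much each deliverable contributes.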
Late policy
The deadline for any assignment can be extended by one day with a 20%
penalty to the mark. Assignments will not be accepted more than 24
hours past the original deadline.
The deadline for project 1 can be extended under the same terms as the
assignments.
Deadlines for project 2 cannot be extended.
If you have an emergency (e.g., health) that prevents you from meeting
a deadline, you must notify the instructor before the deadline.
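The late policy for assignments and project 1 can be read as a small decision rule. The Go sketch below is one possible interpretation, not the official rule: it assumes the 20% penalty scales the earned mark (rather than subtracting 20 points), and that a submission exactly 24 hours late is still accepted. The names `lateMark` and `day` are hypothetical. Project 2 is not covered, since its deadlines cannot be extended.

```go
package main

import (
	"fmt"
	"time"
)

// day is the length of the one-day extension window.
const day = 24 * time.Hour

// lateMark applies the late policy to an earned mark (0-100).
// late is how long after the original deadline the work was submitted;
// zero or negative means on time. The boolean reports whether the
// submission is accepted at all.
// Assumption: the 20% penalty scales the earned mark to 80% of its value.
// Assumption: a submission exactly 24 hours late is still accepted.
func lateMark(mark float64, late time.Duration) (float64, bool) {
	switch {
	case late <= 0:
		return mark, true
	case late <= day:
		return mark * 80 / 100, true
	default:
		return 0, false
	}
}

func main() {
	for _, late := range []time.Duration{0, 3 * time.Hour, 30 * time.Hour} {
		mark, ok := lateMark(90, late)
		fmt.Printf("late by %v: mark=%.1f accepted=%t\n", late, mark, ok)
	}
}
```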
How to do well in this course
Learn Go early and practice it regularly. Learning a new
language while being time constrained is stressful and not fun. Since
the assignments rapidly increase in their difficulty, it will be to
your advantage to learn Go as quickly as possible and to learn it
well. The posted Go resources are a great starting
point, but reading is no substitute for practice, bug, debug,
practice, practice, bug, coffee, debug, practice, ...
Do not skimp on software engineering. Distributed systems are
hard. They are hard to understand, to build, to debug, to run, to
trace, to document, etc. Do not make your life any more difficult. Use
best practices from software engineering to help you in this
course. Write unit and integration tests, use version control,
document your code with comments, write small prototypes, refactor
your code, make your code readable and easy to run and debug. If you
fail to follow best practices, they will come back to bite you later
on. Unfortunately, this course will not explicitly teach you these
best practices, but you probably took a course that introduced you to
these concepts. If you have any questions, just ask us on Piazza.
Choose your teammates wisely. The projects will depend
critically on your ability to work effectively with your
teammates. You are responsible for resolving personal and technical
differences among teammates on your own. Let us know as early as
possible if you have team concerns, before they turn into crises.
Reach out for success. This is intended to be a challenging
fourth-year course, but that does not mean that you have to work
through it on your own! The course Piazza should be your first stop
for all technical questions. The course has specific office hours (see
top of page), but I and the TAs are flexible. Send any of us an email
to schedule a time to discuss the course, the assignments,
etc. University students often encounter setbacks from time to time
that can impact academic performance. Discuss your situation with us
or an academic advisor as early as possible. For help in addressing
mental or physical health concerns, including seeing a UBC counselor
or doctor, visit
this link.
Academic honesty and collaboration guidelines
The department has a detailed policy
regarding collaboration
and plagiarism. You must familiarize yourself with this policy.
Acknowledgments
Many of the materials used in this course are derived from CMU's
15-440: Distributed
Systems course from Spring 2014, and are used with permission from
the content authors.