Mammoth: A Peer-to-Peer File System
ID
TR-2003-11
Publishing date
June 14, 2002
Length
15 pages
Abstract
Peer-to-peer storage systems organize a collection of symmetric nodes to store data without the use of centralized data structures or operations. As a result, they can scale to many thousands of nodes and can be formed directly by clients. This paper describes the design and implementation Mammoth, which implements a traditional UNIX-like hierarchical file system in a peer-to-peer fashion. Each Mammoth node stores a potentially arbitrary collection of directories and files that may be replicated on multiple nodes. The only thing that links these nodes together is that each metadata object encodes the network addresses of the nodes that store it. Data is replicated by a background process whose operation is simplified by the fact that files are stored as journals of immutable versions. An optimistic replication technique is used to allow nodes to read and write whatever version of data they can access, while also ensuring consistency when nodes are connected. In the event of temporary failure, eventual consistency is achieved by ensuring that every replica of a directory or file metadata object receives all updates to the object, irrespective of delivery order. While an update is being propagated every node that receives it cooperates to ensure that the update is delivered, even if the original sender fails. Our prototype is implemented as a user-level NFS server. Its performance is comparable to a standard NFS server and it will be publicly available soon.