Advances in Distributed Systems
ECE 1746, Fall 2004
University of Toronto
Course home page
Project Suggestions
- Implement a fault-tolerant distributed computation using ideas
from the Batch-Aware Distributed File System paper in the reading list.
Implement a large-scale distributed, perhaps scientific,
algorithm of
your choice. Test the fault-tolerant behavior
of your application, e.g., does the algorithm degrade gracefully if a
node crashes. Modify your algorithm so that it is fault-tolerant in the
face of node failures. One method of evaluation could be in terms of
progress made by the application after node crashes, e.g., does the
application make progress proportional to the number of surviving nodes
in the system.
- Build a replication service that uses gossiping
to
maintain eventually consistent replicated data.
In the eventual consistency model, replicas of data
eventually reach consistency but do allow intermediate, inconsistent
accesses. You can use gossiping between replicas to ensure
eventual consistency. Think about evaluation: how long is data
inconsistent in your system? Does it depend on the choice of gossiping
partners?
- Build a simple replication service on top of a distributed hash
table.
You can use pastry or chord as a distributed hash table. Implement a
simple replication service and show its benefits in terms of node
failures.
- Build an application-specific, highly available replication
service.
This replication service should be tailored to an
application's needs.
For example, you could implement a replication
service for configuration files for an application such as the Mozilla
web browser. Suppose that you access your email using Mozilla via the
IMAP protocol. This protocol keeps email folders consistent even though
you can access these folders from multiple clients and the folders are
cached on each client for performance. However, Mozilla does not
replicate its email address book so every client has its own address
book. Thus every time you access email from a new client, you have an
empty address book. For this project, you could implement a Mozilla
replicated address book service. Each client would have a replica of
the address book. These replicas can be inconsistent with each other.
An important benefit of
this service should be automatic reconciliation of divergent replicas.
For example, you could make a consistent address book by taking the
union of two inconsistent replicas. For evaluation, you could show the
benefits of this service in terms of high availability.
- Implement a service analogous to the Internet Indirection
Infrastructure.
The Internet Indirection Infrastructure is a powerful
mechanism for
separating the sender from the receiver. This approach allows building
mechanisms for mobility, multicasting and anycast. The question to
explore in this project is how scalable and fault tolerant is the
infrastructure. For
evaluation, build a dummy application to show scalability.
- Implement a simple, distributed file system.
The file system should have the equivalent of the open, close,
read and
write calls. The file data should be cooperatively cached and kept in
more than one node. Think about and implement your consistency policy
for reads and writes. Discuss how keeping data in more than one node
helps to provide fault tolerance. Your evaluation could be in terms of
performance improvement due to caching as well as the improved fault
tolerance due to multi-node caching.
- Implement a network-sensitive service discovery protocol.
The Network-Sensitive Service discovery protocol combines service
discovery with service selection. Currently, it implements service
selection based on latency and server load. In this project, you could
show how service selection can be based on available bandwidth. One
method of implementing bandwidth discovery is packet pairs. For
evaluation show how bandwidth based service selection improves service
performance.
- Implement a security protocol.
Implement a simplified secure http protocol called "httpgp" that uses
pgp for authentication and encryption. For evaluation, show how it
performs versus standard the http protocol.
- Implement a variant of Bit Torrent.
Bit torrent is a file sharing protocol that deals with flash crowds by
dynamically serving data from participants that are currently
downloading data, thus effectively forming a download tree of nodes
dynamically. Play with the protocol and quantify its performance.
Suggest improvements.
- Design and evaluate a transfer protocol for distributing large
objects.
Ideas you can incorporate are: swarming (transferring pieces of the
file from many sources in parallel), erasure or tornado coding (making
it so you don't care which pieces you grab, only that you grab
"enough"), and possibly scheduling (making people who share interests
wait a little while so they can organize into an overlay multicast
tree). Evaluation would include quantifying performance versus direct
downloading and quantifying the failure properties of the protocol.