Instructor: Ashvin Goel
Course Number: ECE1781H
Course Time: Fri, 1-3 pm
Course Room: BA4164
Start Date: Jan 11, 2019

Dependable Software Systems

ECE1781, Winter 2019
University of Toronto


Project Ideas

Some suggested projects are described below. Talk to the instructor for more details about any of them, and get the instructor's confirmation before starting a project.

Before starting any design or implementation, think through the following questions for your project: 1) what problem are you addressing, 2) what is interesting or novel about your approach, 3) what metrics and testing method will you use for evaluation, and 4) what results do you expect from the evaluation. In addition, each project below lists questions that you should consider before pursuing it.

  1. Bug Detection Using Symbolic Execution

    Several papers in the reading list use symbolic execution to detect bugs. In this project, you will use any available symbolic execution tool (e.g., KLEE, S2E) to detect bugs in some simple programs. On what basis will you choose an application? How will you know that you have detected a bug?
  2. Cross-Checking Semantic Correctness of Device Drivers

    The "Cross-checking Semantic Correctness: The Case of Finding File System Bugs" paper in the reading list (Week 2) cross-checks semantic correctness by taking advantage of the implicit VFS specification in the Linux kernel. You could consider applying their tool, Juxta, to other kinds of code, such as specific types of Linux device drivers (e.g., network drivers).
  3. Race Detection

    The goal of this project is to detect races in an existing application. You can use any available race detector (e.g., ThreadSanitizer). It is best to use an application that provides an extensive testing framework, which you can use to trigger races. If the application has known fixes for race bugs, you can reintroduce a race and see whether the detector finds it. Is it easy to replicate the bugs that are found? How can you use techniques we have discussed in class to increase the likelihood of catching races?
  4. Application-Level Undo and Recovery

    The "Undo for Operators" paper implemented an undoable email service. In general, their application-level undo and recovery service requires applications whose operations have well-defined semantics and can be serialized. Another example that satisfies these criteria is a calendar service. Can you think of other such applications? Choose an application and implement an undoable service for that application. Describe the properties of this undoable application. How does application-specific recovery improve on generic recovery as described in the "Exploring Failure Transparency" paper?
  5. Recovery via Restarting Applications

    The "Microreboot" paper described a method in which parts of an application are rebooted to allow the application to recover. This approach gets rid of faulty state in the application. In this project, you will choose an application and implement a recovery-via-"reboot" method for it. You need to make sure that the persistent data in the application is not lost. For example, for a content download application (e.g., bittorrent), the music repository must not be lost. Similarly, for an instant messaging application (e.g., gaim), the received messages should not be lost. How fine is your reboot granularity? Can you tune it? How often is reboot possible? What types of faults or bugs can the reboot handle? How does the reboot affect user perception? Would you change the application design based on your experience with microreboot-based restarting?
  6. Database Replication and Failover

    Most websites today store their critical data in databases, so database deployments often use backups and various failover mechanisms to improve availability and guard against catastrophic failures. The instructor's group has been working on a high-performance database that provides support for replication. In this project, you will implement and test a replication-based failover scheme. Talk to the instructor for more details.
  7. File-System Reliability

    File system management tools, such as defragmentation tools, resizing tools, and partition editors, are essential for maintaining, optimizing, and administering file systems. Today, custom tools are written for each file system, because these tools require fine-grained control, such as the ability to migrate a data block to another specific physical location. The instructor's group has been working on a richer file system interface called EVFS that allows these tools to be written once and be usable for all file systems that support the interface. The EVFS approach improves the reliability of these tools and eases their development. In this project, you will implement the EVFS interface for a Linux file system and test it with some generic tools. Talk to the instructor for more details.
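
For the Application-Level Undo and Recovery project (item 4), the requirement that operations "have well-defined semantics and can be serialized" may be easier to picture with a small example. The sketch below is a toy, in-memory calendar in Python that logs each operation together with its inverse as a JSON record. The class name, operation names, and record layout are all hypothetical, chosen only for illustration. It is a plain inverse-operation log, not the mechanism from the "Undo for Operators" paper, and it ignores persistence, concurrency, and externally visible effects, which a real project would have to address.

```python
import json


class UndoableCalendar:
    """Toy calendar (hypothetical) whose operations are logged as serializable records."""

    def __init__(self):
        self.events = {}  # event id -> {"title": str, "start": str}
        self.log = []     # JSON strings, one per executed operation

    def _apply(self, op):
        """Apply one operation and return the operation that reverses it."""
        kind, eid = op["kind"], op["id"]
        if kind == "add":
            if eid in self.events:
                raise KeyError(f"event {eid!r} already exists")
            self.events[eid] = dict(op["event"])
            return {"kind": "remove", "id": eid}
        if kind == "remove":
            old = self.events.pop(eid)  # raises KeyError if the event is missing
            return {"kind": "add", "id": eid, "event": old}
        if kind == "move":
            old_start = self.events[eid]["start"]
            self.events[eid]["start"] = op["start"]
            return {"kind": "move", "id": eid, "start": old_start}
        raise ValueError(f"unknown operation {kind!r}")

    def execute(self, op):
        """Apply an operation and log it with its inverse."""
        inverse = self._apply(op)
        # Serializing the record means the log could be written to disk;
        # this sketch keeps it in memory only.
        self.log.append(json.dumps({"op": op, "inverse": inverse}))

    def undo(self):
        """Reverse the most recent operation; return False if the log is empty."""
        if not self.log:
            return False
        record = json.loads(self.log.pop())
        self._apply(record["inverse"])
        return True
```

With this sketch, executing an "add" followed by a "move" and then calling undo() twice first restores the original start time and then removes the event. Because every operation has an explicit inverse, the undo semantics can be stated per operation, which is one property a project report could describe and contrast with generic, application-agnostic recovery.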