Special Topics in Software Engineering: Dependable Software

ECE 1724, Fall 2006
University of Toronto

Instructor: Ashvin Goel
Course Number: ECE 1724
Course Time: Thursday, 4-6 pm (note the change in room and time)
Course Room: GB 220
Start Date: Sep 14, 2006

Project Suggestions

Some suggested projects are described below.

Recovery via Restarting Applications

The "Microreboot" paper describes a method in which individual parts of an application are rebooted so that the application as a whole can recover. Rebooting discards faulty state in the application. In this project, you will choose either a content download application (e.g., bittorrent) or an instant messaging application (e.g., gaim) and implement a recovery-via-"reboot" method for it. You need to make sure that the application's persistent data (e.g., the music repository or the instant messages received) is not lost. How fine is your reboot granularity? Can you tune it? How often is reboot possible? What types of faults or bugs can the reboot handle? How does the reboot affect user perception?
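
To make the split between volatile and persistent state concrete, here is a minimal sketch of the idea, not the paper's implementation. All names (`MessageStore`, `ChatSession`, `Supervisor`) are hypothetical, and the "fault" is simulated by a flag in an in-memory cache. The supervisor discards and recreates only the faulty component, while received messages stay on disk.

```python
import json
import os
import tempfile


class MessageStore:
    """Persistent state: lives on disk, so it survives a component reboot."""

    def __init__(self, path):
        self.path = path

    def append(self, message):
        messages = self.load()
        messages.append(message)
        # Write to a temp file and rename, so a crash cannot leave a half-written store.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(messages, f)
        os.replace(tmp, self.path)

    def load(self):
        if not os.path.exists(self.path):
            return []
        with open(self.path) as f:
            return json.load(f)


class ChatSession:
    """Restartable component: holds only volatile state that may become faulty."""

    def __init__(self, store):
        self.store = store
        self.cache = {}  # volatile; discarded on reboot

    def receive(self, sender, text):
        if self.cache.get("corrupt"):
            raise RuntimeError("faulty in-memory state")
        self.store.append({"from": sender, "text": text})


class Supervisor:
    """Reboots a single component on failure and retries the request once."""

    def __init__(self, store):
        self.store = store
        self.session = ChatSession(store)
        self.reboots = 0

    def receive(self, sender, text):
        try:
            self.session.receive(sender, text)
        except RuntimeError:
            # Reboot only the faulty component; the persistent store is untouched.
            self.session = ChatSession(self.store)
            self.reboots += 1
            self.session.receive(sender, text)


def demo():
    with tempfile.TemporaryDirectory() as d:
        store = MessageStore(os.path.join(d, "messages.json"))
        sup = Supervisor(store)
        sup.receive("alice", "hi")
        sup.session.cache["corrupt"] = True  # inject a fault into volatile state
        sup.receive("bob", "hello")
        return sup.reboots, [m["text"] for m in store.load()]
```

In a real application the reboot granularity would depend on how cleanly the components separate, which is one of the questions the project asks you to explore.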

Application-Level Undo and Recovery

The "Undo for Operators" paper implemented an undoable email service. In general, their application-level undo and recovery service requires applications whose operations have well-defined semantics and can be serialized. Another example that satisfies these criteria is a calendar service. Can you think of other such applications? Choose a calendar service or any one such application and implement an undoable service for that application. Describe the properties of this undoable application. How does application-specific recovery improve on generic recovery as described in the "Exploring Failure Transparency" paper?
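
As a starting point, the sketch below shows why well-defined operation semantics matter: each calendar operation records its own inverse, so it can be undone later. This is only a linear undo log under assumed names (`Calendar`, `add`, `move`, `remove`, `undo`); it does not attempt to reproduce the paper's design.

```python
from dataclasses import dataclass, field


@dataclass
class Calendar:
    events: dict = field(default_factory=dict)    # event id -> (title, time slot)
    undo_log: list = field(default_factory=list)  # inverse operations, newest last

    def add(self, event_id, title, slot):
        if event_id in self.events:
            raise KeyError(f"event {event_id!r} already exists")
        self.events[event_id] = (title, slot)
        # The inverse of an add is a removal of the same event.
        self.undo_log.append(("remove", event_id, None))

    def remove(self, event_id):
        old = self.events.pop(event_id)
        # The inverse of a removal restores the saved event.
        self.undo_log.append(("restore", event_id, old))

    def move(self, event_id, new_slot):
        title, old_slot = self.events[event_id]
        self.events[event_id] = (title, new_slot)
        # The inverse of a move restores the previous slot.
        self.undo_log.append(("restore", event_id, (title, old_slot)))

    def undo(self):
        """Apply the most recent inverse operation; return False if nothing is left."""
        if not self.undo_log:
            return False
        action, event_id, saved = self.undo_log.pop()
        if action == "remove":
            del self.events[event_id]
        else:
            self.events[event_id] = saved
        return True
```

For example, after `add`, `move`, and `remove` on the same event, three calls to `undo()` walk the calendar back through each earlier state in reverse order.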

Analysis of Failures in Real Applications

In this project, you will study the bug reports of some open-source applications and determine the types of bugs that are reported for these applications. You can use the "Whither Generic Recovery" paper to guide your analysis. How does your analysis compare with the analysis in the paper? Next, similar to the "Rx" paper in the reading list, analyze these bugs in terms of the recovery methods you would use to survive these failures. Time permitting, implement one of these recovery mechanisms. For your study, choose a diverse set of applications where some are known to be relatively stable and others are buggy.

Misconfiguration Detection

The "PeerPressure" paper automatically detected misconfiguration in the Windows registry by comparing the registry entries across multiple machines. This comparison was done using a simple heuristic that determined whether a registry entry was very similar or dissimilar across machines. In this project, choose other heuristics, such as clustering, to detect misconfiguration. Compare your approach with the original PeerPressure approach. You can use any registry-like application.
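
A baseline for your comparison could look like the sketch below. It is a simple agreement heuristic written for illustration, not the PeerPressure algorithm itself: for each key on the suspect machine, it checks how strongly the peer machines agree on a value that differs from the suspect's. The function name `rank_suspects` and the dictionary-based configuration format are assumptions.

```python
from collections import Counter


def rank_suspects(suspect, peers):
    """Rank the suspect's keys by how strongly peers agree on a different value.

    suspect: dict mapping configuration key -> value on the sick machine.
    peers:   list of dicts with the same shape, one per healthy machine.
    Returns a list of (score, key, suspect_value, common_peer_value),
    highest score first.
    """
    ranked = []
    for key, value in suspect.items():
        peer_values = [p[key] for p in peers if key in p]
        if not peer_values:
            continue  # no peer evidence for this key
        common_value, common_count = Counter(peer_values).most_common(1)[0]
        if value == common_value:
            continue  # the suspect agrees with the majority
        # Score: fraction of peers that share the most common value.
        score = common_count / len(peer_values)
        ranked.append((score, key, value, common_value))
    # Highest agreement first; break ties by key name for stable output.
    ranked.sort(key=lambda r: (-r[0], r[1]))
    return ranked
```

Keys on which peers agree with each other but not with the suspect rise to the top, while keys that naturally vary per machine (a home page, for instance) get low scores. A clustering heuristic would replace the scoring step.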

Using Source-Code Control for One-Way Isolation

One-way isolation is a method for realizing a safe execution environment. One-way isolation separates the file-system operations of a process or a group of processes and allows either aborting or atomically committing these file-system operations to the file system at some point in the future. This approach allows testing a program in isolation. See the "One-Way Isolation" paper in the reading list. In this project, you will implement a safe execution environment using a source code control system (e.g., svn). How does your environment compare with the one-way isolation method, i.e., what are the pros and cons of your approach? How would you use your environment for testing system configuration? What other applications can you implement using your environment? Implement one such application.
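
The abort-or-commit behaviour described above can be illustrated with the sketch below. It uses a plain directory copy as a stand-in for a version-controlled working copy, so it is an assumption-laden toy rather than the method from the paper or an svn-based design. The class name `IsolatedRun` is hypothetical, and the commit step here is not atomic.

```python
import os
import shutil
import tempfile


class IsolatedRun:
    """Run file-system changes against a private copy of base_dir."""

    def __init__(self, base_dir):
        self.base_dir = base_dir
        self.parent = tempfile.mkdtemp(prefix="isolated-")
        self.work_dir = os.path.join(self.parent, "work")
        # The isolated process can read the base state, but writes only to the copy.
        shutil.copytree(base_dir, self.work_dir)

    def abort(self):
        """Discard all changes made in isolation."""
        shutil.rmtree(self.parent)

    def commit(self):
        """Replace base_dir with the isolated copy.

        Not atomic: a crash between the two steps loses data. A real design
        would rely on the version control system's commit instead.
        """
        shutil.rmtree(self.base_dir)
        shutil.copytree(self.work_dir, self.base_dir)
        shutil.rmtree(self.parent)


def demo(commit):
    """Change a config file in isolation, then commit or abort; return the base content."""
    base = tempfile.mkdtemp(prefix="base-")
    conf = os.path.join(base, "app.conf")
    try:
        with open(conf, "w") as f:
            f.write("port=80\n")
        run = IsolatedRun(base)
        with open(os.path.join(run.work_dir, "app.conf"), "w") as f:
            f.write("port=8080\n")
        if commit:
            run.commit()
        else:
            run.abort()
        with open(conf) as f:
            return f.read()
    finally:
        shutil.rmtree(base, ignore_errors=True)
```

Copying the whole tree is expensive, and changes made to the base directory after the copy are not visible inside the sandbox. Both points are worth addressing when you compare your environment with the one-way isolation method.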

Taint Analysis for Kernel-Level Intrusion Detection

Several intrusion response papers in the reading list (e.g., "Dynamic Taint Analysis", "Fast and Automated Generation", "Vigilante") use a taint analysis method for detecting intrusions. This method essentially determines whether data in a network packet, or data that depends on a network packet, is ever executed. The dependence is established at the machine instruction level. Most of this work has been done for applications. Use an available taint analysis tool to instrument an operating system. Run this system in a virtual machine as a honeypot to test for operating-system-level intrusions.
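
The propagation rule can be seen on a toy register machine, sketched below. The instruction set (`const`, `recv`, `add`, `jump_reg`) is invented for illustration and is far simpler than what a real taint analysis tool instruments. Data read from the packet is marked tainted, taint flows through arithmetic, and an alert is raised when a tainted value is used as a jump target.

```python
def run(program, packet):
    """Interpret a toy program and return a list of (pc, register) taint alerts.

    program: list of tuples, one per instruction (hypothetical instruction set).
    packet:  bytes standing in for untrusted network input.
    """
    regs = {}    # register name -> value
    taint = {}   # register name -> True if the value depends on the packet
    alerts = []
    for pc, instr in enumerate(program):
        op = instr[0]
        if op == "const":
            _, dst, value = instr
            regs[dst], taint[dst] = value, False
        elif op == "recv":
            _, dst, offset = instr
            # Data taken from the network is the taint source.
            regs[dst], taint[dst] = packet[offset], True
        elif op == "add":
            _, dst, a, b = instr
            regs[dst] = regs[a] + regs[b]
            # The result is tainted if either operand is tainted.
            taint[dst] = taint[a] or taint[b]
        elif op == "jump_reg":
            _, src = instr
            # The toy does not actually jump; it only checks the target's taint.
            if taint[src]:
                alerts.append((pc, src))
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return alerts


program = [
    ("const", "r0", 0x1000),
    ("recv", "r1", 0),
    ("add", "r2", "r0", "r1"),
    ("jump_reg", "r0"),   # target is a constant: no alert
    ("jump_reg", "r2"),   # target depends on packet data: alert
]
alerts = run(program, b"\x10")
```

Here `alerts` contains a single entry for the second jump, because `r2` was computed from packet data.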

Intrusion Analysis via Replay

Replay frameworks have several applications, such as debugging applications and analyzing intrusions. In this project, you will use an existing replay framework to analyze intrusions. Choose a common application, such as a web browser, that has known vulnerabilities and use the replay framework to replay the application. Similar to Rx, replay the application in a different environment (e.g., a different address space layout) and test whether the attack still succeeds (e.g., whether the buffer overflow fails) by comparing the outputs of the application during the original execution and the replay.
