Special Topics in Software Engineering: Dependable Software

ECE 1724, Fall 2006
University of Toronto


Instructor: Ashvin Goel
Course Number: ECE 1724
Course Time: Thursday, 4-6 pm (note the change in room and time)
Course Room: GB 220
Start Date: Sep 14, 2006


Course Description

Modern computer systems have become tightly intertwined with our daily lives. However, they are complex, failure-prone and insecure, and thus hardly dependable. In addition, configuring and managing these systems is difficult and often results in decreased dependability and increased vulnerability. These problems have become even more severe with increased networking and with the easy availability of inexpensive, powerful and embedded devices. Although these dependability problems dominate the cost of ownership of computer systems, they unfortunately have no simple solutions. There is a realization that these problems cannot be decisively solved but are ongoing facts of life that must be dealt with regularly. To do so, systems should be designed to detect, isolate and recover from these problems.

This advanced graduate-level course focuses on dependability in software systems and examines current research that aims to address challenges caused by software defects, intrusions and software misconfiguration. Students are expected to read and critique recent research papers in operating systems and networking that cover these areas. They are also expected to work on a research project and make class presentations. While there are no specific prerequisites for this course, students who have taken undergraduate or graduate courses in operating systems, networks and distributed systems will have an edge.


Textbooks

There are no required textbooks for this course. The optional textbooks are


Mailing List

Please subscribe to the class mailing list by joining this group. You will need a Yahoo account, although Yahoo will forward the group messages to any email address of your choice. The instructor will use this group to send instructions and reminders. All students who subscribe to the group can send email to it by sending mail to this list. The group is not moderated. If you have a specific question for the instructor, please email the instructor directly. During the first week of classes, you can join the group directly. After that, the Yahoo Groups website will require the instructor's approval to subscribe you.


Grading Policy

Grades will be based on class presentations, a class project, and class participation. There will be no final exam in this course. The grading breakdown is as follows:

Note: If a student is unable to attend a class, he or she will lose 2% for non-participation.


Class Presentation

Each week this class will cover a group of papers that focuses on a specific aspect of the course. Students are expected to read all the papers in the group that will be presented. At the beginning of the term, each paper will be assigned to a student who will be presenting the paper. Presentations will be limited to 20 minutes.

See the presentation format for more details. Please read it very carefully.


Assignments

The instructor will provide you with details about assignments in class.


Class Project

A major component of this course is devoted to a term-long project. The topic of the project is largely up to you, but to help you choose a project, a sample list of projects is provided below. This list should help students determine whether their own projects are of reasonable size and scope.

See the project format for more details. Please read it very carefully.


Project Ideas

Here is a list of project ideas.


Readings

This is a tentative list. If a link to a paper is missing, please use a search engine to find the paper.

Week 1: Introduction (Sep 14)

  1. Why Do Computers Stop and What Can Be Done About It? SRDS 1986.
  2. Broad New OS Research: Challenges and Opportunities. HotOS 2005.
  3. Introduction to Dependable Software Systems by Instructor.
  4. Efficient Reading of Papers in Science and Technology.
  5. How (and How Not) to Write a Good Systems Paper. Operating Systems Review 1983.

Week 2: Fault Isolation (Sep 21)

  1. Hypervisor-based Fault-tolerance. SOSP 1995.
  2. Hive: Fault Containment for Shared-Memory Multiprocessors. SOSP 1995.
  3. Dealing With Disaster: Surviving Misbehaved Kernel Extensions. OSDI 1996. Vladan
  4. Improving the Reliability of Commodity Operating Systems. SOSP 2003.
  5. Unmodified Device Driver Reuse and Improved System Dependability via Virtual Machines. OSDI 2004.
  6. XFI: Software Guards for System Address Spaces. OSDI 2006. Ramy

Week 3: Failure Recovery (Sep 28)

  1. Exploring Failure Transparency and the Limits of Generic Recovery. OSDI 2000.
  2. Undo for Operators: Building an Undoable E-mail Store. USENIX 2003.
  3. Recovering Device Drivers. OSDI 2004. Wei
  4. Enhancing Server Availability and Security Through Failure-Oblivious Computing. OSDI 2004. Josh

Week 4: Failure Recovery (Oct 5)

  1. Microreboot - A Technique for Cheap Recovery. OSDI 2004.
  2. Rx: Treating Bugs As Allergies - A Safe Method to Survive Software Failures. SOSP 2005. Mark
  3. The Taser Intrusion Recovery System. SOSP 2005. Shvetank
  4. SafeDrive: Safe and Recoverable Extensions Using Language-Based Techniques. OSDI 2006.

Week 5: Intrusion Analysis (Oct 12)

  1. ReVirt: Enabling Intrusion Analysis through Virtual-Machine Logging and Replay. OSDI 2002. Vladan
  2. Backtracking Intrusions. SOSP 2003. Wei

Week 6: Web Intrusion Analysis (Oct 19)

  1. A Crawler-based Study of Spyware on the Web. NDSS 2006. Shvetank
  2. Automated Web Patrol with Strider HoneyMonkeys: Finding Web Sites That Exploit Browser Vulnerabilities. NDSS 2006. Jimmy
  3. Modeling Botnet Propagation Using Time Zones. NDSS 2006.

Week 7: Intrusion Detection (Oct 26)

  1. On Gray-Box Program Tracking for Anomaly Detection. USENIX Security 2004.
  2. Detecting Past and Present Intrusions through Vulnerability-Specific Predicates. SOSP 2005.
  3. Behavior-based Spyware Detection. USENIX Security 2006. Faisal
  4. Dataflow Anomaly Detection. SSP 2006. Saeed

Week 8: Safe Execution (Nov 2)

  1. Secure Execution via Program Shepherding. USENIX Security 2002. Josh
  2. One-Way Isolation: An Effective Approach for Realizing Safe Execution Environments. NDSS 2005. Jing
  3. BrowserShield: Vulnerability-Driven Filtering of Dynamic HTML. OSDI 2006.

Week 9: Safe Execution (Nov 9)

  1. Privtrans: Automatically Partitioning Programs for Privilege Separation. USENIX Security 2004.
  2. Taint-Enhanced Policy Enforcement: A Practical Approach to Defeat a Wide Range of Attacks. USENIX Security 2006.
  3. Making Information Flow Explicit in HiStar. OSDI 2006. Jing
  4. Splitting Interfaces: Making Trust Between Applications and Operating Systems Configurable. OSDI 2006. Mark

Week 10: Intrusion Response (Nov 16)

  1. Shield: Vulnerability-Driven Network Filters for Preventing Known Vulnerability Exploits. SIGCOMM 2004. Sergio
  2. Active Internet Traffic Filtering: Real-Time Response to Denial-of-Service Attacks. USENIX 2005.
  3. Vigilante: End-to-End Containment of Internet Worms. SOSP 2005. Mark

Week 11: Intrusion Response (Nov 23)

  1. Dynamic Taint Analysis for Automatic Detection, Analysis, and Signature Generation of Exploits on Commodity Software. NDSS 2005. Shvetank
  2. Automatic Diagnosis and Response to Memory Corruption Vulnerabilities. CCS 2005.
  3. Fast and Automated Generation of Attack Signatures: A Basis for Building Self-Protecting Servers. CCS 2005. Jimmy
  4. Argos: an Emulator for Fingerprinting Zero-Day Attacks. EuroSys 2006.

Week 12: System Misconfiguration (Nov 30)

  1. Understanding and Dealing with Operator Mistakes in Internet Services. OSDI 2004.
  2. Automatic Misconfiguration Troubleshooting with PeerPressure. OSDI 2004. Faisal
  3. Configuration Debugging as Search: Finding the Needle in the Haystack. OSDI 2004. Sergio
  4. Automated Known Problem Diagnosis with Event Traces. EuroSys 2006.

Week 13: Performance Misconfiguration (Dec 7)

  1. Performance Debugging for Distributed Systems of Black Boxes. SOSP 2003.
  2. Correlating Instrumentation Data to System States: A Building Block for Automated Diagnosis and Control. OSDI 2004. Saeed
  3. Using Magpie for Request Extraction and Workload Modelling. OSDI 2004.
  4. Capturing, Indexing, Clustering, and Retrieving System History. SOSP 2005. Saeed