Designing Modern Web-Scale Applications

ECE1724, Winter 2022
University of Toronto

Instructor: Ashvin Goel
Course Time: Wed, 4-6:30 pm
Start Date: Jan 12, 2022
Online: Zoom (until end of January 2022)
Classroom: WB219 (from Feb 9, 2022)

Project Ideas

Some suggested projects are described below. You are also welcome to pick a project of your own choice. However, it is important for you to think about the following questions regarding your project before starting any design and implementation:

  1. What problem are you addressing? You should think about the main goals of the design and what you plan to achieve in this project.
  2. What is interesting/novel about your approach? One way to answer this question is to ask yourself the following: what question will this project answer that you do not know the answer to already, i.e., why do you need to spend time on this project?
  3. How would you know that you have achieved your goals? You need to think about the metrics and testing method that you will use for evaluation. You should also think about the expected results from the evaluation.

Your project reports and final presentation will be evaluated based on the criteria described above.

Several of the projects described below are based on work being done in the instructor's group. Please talk to the instructor for more details about these projects.

Please make sure to get a confirmation about your project from the instructor before starting the project.

  1. Caracal Distributed Database

    Many web applications store their large datasets in distributed databases today. The instructor's group has been working on a high-performance, in-memory distributed deterministic database called Caracal that is designed to scale well. In this project, you will be working on one of the projects shown below for improving the functionality and performance of Caracal. Caracal is implemented in C++, so you will need significant experience with C++ development.

    1. Intra-Transaction Dependencies in Distributed Transactions

      Currently, Caracal does not efficiently support distributed transactions that have intra-transaction dependencies across nodes. In this project, you will implement scheduling policies and transaction reordering methods that improve the performance of such transactions.

    2. Supporting Aborts in Distributed Transactions

      Currently, Caracal does not support aborts in distributed transactions. In this project, you will implement a deterministic abort mechanism for distributed transactions.

    3. Supporting Transactions with Unknown Write Sets

      Caracal is a deterministic database that requires each transaction's write set to be known before the transaction executes. In this project, you will implement a speculative execution mechanism that determines the write set by speculatively executing parts of the transaction, and then reruns the transaction with the write set determined this way.

    4. Supporting Distributed Priority Transactions

      Normally, Caracal batches transactions before executing them. Batching enables parallelizing the transactions and handling contention more efficiently. However, batching increases transaction latency. We have implemented a mechanism called priority transactions that enables running certain transactions in Caracal with low latency. However, priority transactions currently need to be single-node transactions. In this project, you will implement distributed priority transactions.

    5. Data Migration in Caracal

      Databases often manage long-term load imbalance across nodes by migrating data from hot nodes to other nodes. In this project, you will implement data migration in Caracal and evaluate the migration scheme using uniform and skewed workloads.
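
For the data migration project, it may help to see what "uniform and skewed workloads" could look like in an evaluation. The following is only a minimal sketch in Python, not Caracal code: the function names, the Zipf-like weighting, the range partitioning of keys across nodes, and the imbalance metric (hottest node's load divided by the mean load) are all assumptions made for illustration.

```python
import random
from collections import Counter

def make_workload(num_keys, num_accesses, skew, seed=0):
    """Return a list of accessed key ids.

    skew = 0 gives a uniform workload; larger values give a Zipf-like
    workload in which low-numbered keys are accessed far more often.
    (Hypothetical generator, chosen only for illustration.)
    """
    rng = random.Random(seed)
    weights = [1.0 / (rank ** skew) for rank in range(1, num_keys + 1)]
    return rng.choices(range(num_keys), weights=weights, k=num_accesses)

def node_loads(accesses, num_nodes, num_keys):
    """Count accesses per node, assuming simple range partitioning of keys."""
    keys_per_node = (num_keys + num_nodes - 1) // num_nodes
    counts = Counter(key // keys_per_node for key in accesses)
    return [counts.get(node, 0) for node in range(num_nodes)]

def imbalance(loads):
    """Hottest node's load divided by the mean load; 1.0 is perfectly balanced."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean

if __name__ == "__main__":
    for skew in (0.0, 1.2):
        accesses = make_workload(num_keys=1000, num_accesses=20000, skew=skew)
        loads = node_loads(accesses, num_nodes=4, num_keys=1000)
        print(f"skew={skew}: loads={loads}, imbalance={imbalance(loads):.2f}")
```

Under these assumptions, a migration scheme could be judged by how far it lowers the imbalance value for the skewed workload without hurting throughput on the uniform one.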

  2. Efficient Graph Mining

    Graph mining algorithms help discover complex structural patterns, such as cliques and motifs in graphs, enabling applications such as analyzing communities in social networks.

    In this project, you will be evaluating and comparing the performance of two state-of-the-art single-node graph mining systems, Peregrine and GraphPi. Based on this comparison, you will suggest methods for improving graph mining performance.
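
To make the idea of pattern discovery concrete, here is a minimal Python sketch that counts triangles (3-cliques), the simplest clique pattern. It does not use the API of Peregrine or GraphPi and says nothing about how they are implemented. The function names are hypothetical, and the vertex-ordering trick used to count each triangle exactly once is just one common way to avoid duplicate matches.

```python
from itertools import combinations

def build_adjacency(edges):
    """Build an undirected adjacency map from an edge list, ignoring self-loops."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def count_triangles(adj):
    """Count triangles, matching each one only at its smallest vertex.

    For every vertex u, only neighbours larger than u are considered, so a
    triangle {a, b, c} with a < b < c is found once, when u == a.
    """
    count = 0
    for u in adj:
        higher = [v for v in adj[u] if v > u]
        for v, w in combinations(higher, 2):
            if w in adj[v]:
                count += 1
    return count

if __name__ == "__main__":
    # A 4-clique on vertices 0..3 plus a pendant vertex 4 attached to vertex 3.
    edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]
    print(count_triangles(build_adjacency(edges)))
```

A brute-force baseline like this can also serve as a correctness check on small graphs when you compare the match counts reported by the two systems.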
