Mandate: Faculty, students, and occasional special guest stars meet monthly to present and discuss ongoing research, relevant conference proceedings, and current events in the dual areas of compilers and computer architecture.
Readings: Readings to consider for future meetings, in architecture and compilers
| Date | Time/Place | Presenter | Topic/Reading |
| May09 | 1:30-4:30pm, at IBM | Amy Wang, IBM; Ivan Matosevic; Cedomir Segulja | Evaluation of Blue Gene/Q Hardware Support for Transactional Memories; Efficient bottom-up heap analysis for symbolic path-based data access summaries; Architectural Support for Synchronization-Free Deterministic Parallel Programming |
| Date | Time/Place | Presenter | Topic/Reading |
| Feb 29 | 1:30-4:30pm, at IBM | Alon Housfater, IBM; Islam Atta; Eric LaForest | Workload Optimized Multi-Stream Compression Accelerator on an FPGA; Parallel Tree Manipulation; Octavo: a 550MHz FPGA-based CPU, Towards an App-Specific Multicore |
| Feb 13 | 11am-1pm SFB560 | Sheng Ma | Supporting Efficient Collective Communication in NoCs |
| Feb 15 | 12pm-2pm BA4287 | Ioan Stefanovici | Cosmic Rays Don't Strike Twice: Understanding the Nature of DRAM Errors and the Implications for System Design |
| Date | Time/Place | Presenter and Topic/Reading |
| Oct 20 | 1:30pm in SFB540 | Roch Archambault, IBM on Infosphere Streams; David Han, UofT on Reducing Branch Divergence in GPU Programs; Sheng Ma, UofT on DBAR: An Efficient Routing Algorithm to Support Multiple Concurrent Applications in Networks-on-Chip |
| Nov07-10 | CASCON at Hilton Suites, Markham |
| Date | Presenter | Topic/Reading |
| Jan 10 | Cedomir | Tolerating Concurrency Bugs Using Transactions as Lifeguards Paper |
| Jan 17 | (cancelled) | |
| Jan 24 | Chuck | Twin peaks: a software platform for heterogeneous computing on general-purpose and graphics processors ACM page |
| Jan 31 | (cancelled) | |
| Feb 7 | Mihai | An Efficient Software Transactional Memory Using Commit-Time Invalidation Slides Paper ACM Page |
| Feb 14 | Rob | Aergia: Exploiting Packet Latency Slack in On-Chip Networks Paper |
| Feb 21 | (cancelled) | Family Day |
| Feb 28 | Davor | "JIT Compilation to Commodity FPGAs: Challenges and Opportunities" |
| Mar 7 | Diego | Semi-automatic extraction and exploitation of hierarchical pipeline parallelism using profiling information ACM page |
| Mar 14 | Jason | RCDC: A Relaxed-Consistency Deterministic Computer Author Page |
| Mar 21 | (cancelled) | |
| Mar 28 | (cancelled) | |
| Apr 4 | David | Single-chip Heterogeneous Computing: Does the future include Custom Logic, FPGAs, and GPUs? ACM Page |
| Apr 11 | Ivan | OoOJava: Software Out-of-Order Execution Paper |
| Apr 18 | Mark | Understanding Bloom Filter Intersection for Lazy Address-Set Disambiguation (Practice talk for SPAA) |
| Apr 25 | Haojun | |
| TBD | Eric | |
| TBD | Sam |
| Date | Presenter | Topic/Reading |
| Sep 13 | (cancelled) | Moved to Thursday Sep. 16 |
| Sep 16 | SFB560, 1:30pm | IBM Visiting Talk + talks by Cedomir and Jason |
| Sep 20 | (cancelled) | |
| Sep 27 | cedomir | WiDGET: Wisconsin Decoupled Grid Execution Tiles Paper |
| Oct 04 | ivan | Accelerating Critical Section Execution with Asymmetric Multi-Core Architectures Paper |
| Oct 11 | (cancelled) | Thanksgiving |
| Oct 18 | mihai | Speculative Optimizations for Parallel Programs on Multicores Paper |
| Oct 25 | diego | The Paralax infrastructure: automatic parallelization with a helping hand ACM Page |
| Nov 01 | (cancelled) | CASCON 2010 |
| Nov 08 | (cancelled) | |
| Nov 15 | haojun | Flexible Architectural Support for Fine-Grain Scheduling ACM Page |
| Nov 22 | rob | Forwardflow: A Scalable Core for Power-Constrained CMPs ACM Page |
| Nov 29 | utku | Practice Presentation for MICRO: Hardware Support For Relaxed Concurrency in Transactional Memory |
| Dec 06 | sam | Architecture of the Intel Single Chip Cloud Slides |
| Dec 13 | jason | Understanding Sources of Inefficiency in General-Purpose Chips Paper Slides |
| TBD | davor | |
| TBD | chuck |
| Date | Presenter | Topic/Reading |
| May 03 | jason | Rethinking DRAM Design and Organization for Energy-Constrained Multi-Cores Paper |
| May 10 | utku | Boosting Single-Thread Performance in Multi-core Systems through Fine-Grain Multi-Threading Paper |
| May 17 | (cancelled) | First Local Workshop on Better Programming Models for FPGAs |
| May 24 | (cancelled) | Victoria Day |
| May 31 | ivan | Exploiting Statistical Correlations for Proactive Prediction of Program Behaviors Paper |
| Jun 07 | danyao | DART: Fast and Flexible NoC Simulation using FPGAs (Practice for ISCA) |
| Jun 14 | cedomir | Kendo: Efficient Deterministic Multithreading in Software Paper |
| Jun 21 | mihai | Software Data Spreading: Leveraging Distributed Caches to Improve Single Thread Performance ACM Page |
| Jun 28 | diego | Concurrency Checking with CHESS: Learning from Experience Slides |
| Jul 05 | (cancelled) | |
| Jul 12 | chuck | A Real System Evaluation of Hardware Atomicity for Software Speculation Paper |
| Jul 19 | davor | DMP: deterministic shared memory multiprocessing ACM page |
| Jul 26 | (cancelled) | |
| Aug 02 | (cancelled) | Civic Holiday |
| Aug 09 | haojun | FPGA Partial Reconfiguration |
| Aug 16 | rob | Elastic Cooperative Caching: An Autonomous Dynamically Adaptive Memory Hierarchy for Chip Multiprocessors Paper |
| Aug 23 | davidhan | Debunking the 100x GPU vs. CPU Myth: An Evaluation of Throughput Computing on CPU and GPU ACM page (alt. PDF) |
| Aug 30 | eric | C is for circuits: capturing FPGA circuits as sequential code for portability ACM page |
| Date | Presenter | Topic/Reading |
| Jan 11 | ivan | SigRace: Signature-Based Data Race Detection Paper |
| Jan 18 | davidhan | PetaBricks: A Language and Compiler for Algorithmic Choice Paper |
| Jan 25 | (cancelled) | |
| | cedomir | Helper locks for fork-join parallel programming Paper |
| | mihai | Stretching Transactional Memory Paper |
| Feb 15 | (cancelled: Reading Week) | |
| Feb 22 | davor | Performance and power of cache-based reconfigurable computing Paper |
| Mar 01 | (cancelled: practice talk) | |
| Mar 08 | martinl | Application-Specific Signatures for Transactional Memory in Soft Processors Paper |
| Mar 15 | borys | A Lightweight In-Place Implementation for Software Thread-Level Speculation Paper |
| Mar 22 | diego | Parallelizing sequential applications on commodity hardware using a low-cost software transactional memory Paper |
| Mar 29 | chuck | Does Cache Sharing on Modern CMP Matter to the Performance of Contemporary Multithreaded Programs? Paper |
| Apr 05 | danyao | Polymorphic On-Chip Networks Paper |
| Apr 12 | (cancelled) | |
| Apr 19 | myrto | Inter-Core Cooperative TLB Prefetchers for Chip Multiprocessors Paper |
| Apr 26 | (cancelled: CGO conference) |
| Date | Presenter | Topic/Reading |
| Sep21 | myrto | Thread Criticality Predictors for Dynamic Performance, Power, and Resource Management in Chip Multiprocessors Paper |
| Sep28 | danyao | A memory system design framework: creating smart memories Paper |
| Oct05 | chuck | Binary Analysis for Measurement and Attribution of Program Performance Paper |
| Oct12 | (Thanksgiving: no meeting) | |
| Oct16 | (Friday visit to IBM) | Bob Blainey: Practical experience in building scalable data structures in Amino Abstract; Steven Birk: Parallelizing FPGA CAD Using Transactional Memory Abstract |
| Oct19 | jason | Scaling the Bandwidth Wall: Challenges in and Avenues for CMP Scaling Paper |
| Oct26 | eric | Decoupled DIMM: Building High-Bandwidth Memory System Using Low-Speed DRAM Devices Paper |
| Nov02 | (CASCON: no meeting) | 8th Workshop on Compiler-Driven Performance Program; Workshop: Challenges for Parallel Computing Program |
| Nov09 | (CANCELLED: no meeting) | |
| Nov16 | (CANCELLED: no meeting) | |
| Nov23 | (CANCELLED: no meeting) | |
| Nov30 | CIDER/CARG Joint Meeting | |
| Dec7 | utku | Simultaneous speculative threading: a novel pipeline architecture implemented in Sun's Rock processor Paper |
| Date | Presenter | Topic/Reading |
| May06 | davidtam | Ubiquitous Memory Introspection Abstract Paper Slides |
| May13 | (no meeting) | |
| May20 | eric | SoC-C: efficient programming abstractions for heterogeneous multicore systems on chip ACM page |
| May27 | jason | A Tagless Coherence Directory Paper |
| Jun03 | reza | OpenMP to GPGPU: a compiler framework for automatic translation and optimization ACM page |
| Jun10 | chuck | Tolerating Delinquent Loads with Speculative Execution Paper Slides |
| Jun17 | utku | Dependence-Aware Transactional Memory for Increased Concurrency Paper Slides |
| Jun24 | ivan | Early experience with a commercial hardware transactional memory implementation ACM page |
| Jul01 | (Canada Day: no meeting) | |
| Jul08 | davidhan | Future-Proof Data Parallel Algorithms and Software on Intel Multi-Core Architecture Paper |
| Jul15 | mihai | Is Transactional Programming Actually Easier? Conference Paper Author's Page (same paper) |
| Jul22 | davor | Architectural implications of nanoscale integrated sensing and computing ACM page |
| Jul29 | (no meeting) | |
| Aug05 | (no meeting) | |
| Aug12 | diego | Towards a holistic approach to auto-parallelization: integrating profile-driven parallelism detection and machine-learning based mapping ACM page |
| Aug19 | martinl | Fast Critical Sections via Thread Scheduling for FPGA-based Multithreaded Processors Abstract |
| Aug26 | borys | The Use of Hardware Transactional Memory for the Trace-Based Parallelization of Recursive Java Programs Paper |
| Sep02 | steven | Parallelizing FPGA CAD Using Transactional Memory |
| TBD | myrto | |
| TBD | danyao | |
| TBD | kaveh |
| Date | Presenter | Topic/Reading |
| May05 | borys | Efficient Context-Sensitive Shape Analysis with Graph Based Heap Models |
| May12 | (no meeting) | |
| May19 | (no meeting) | |
| May26 | livio | SoftSig: Software-Exposed Hardware Signatures for Code Analysis and Optimization |
| Jun02 | (no meeting) | |
| Jun09 | utku | Split Hardware Transactions |
| Jun16 | eric | An Adaptive And Scalable Multiprocessor System For Xilinx FPGAs Using Minimal Sized Processor Cores |
| Jun23 | (no meeting) | |
| Jun30 | (no meeting) | |
| Jul07 | jason | Accelerating Two-Dimensional Page Walks for Virtualized Systems |
| Jul14 | ioana | Atom-Aid: Surviving and Detecting Atomicity Violations |
| Jul21 | (no meeting) | |
| Jul28 | Babak Falsafi (2pm SFB560) | ProtoFlex: A Hybrid Full-System Emulator for Large-Scale Multiprocessors |
| Aug04 | (no meeting--holiday) | |
| Aug11 | ivan | Inferring Locks for Atomic Sections |
| Aug18 | mihai | Software Transactional Memory for Large Clusters |
| Aug25 | davidhan | Program Optimization Space Pruning for a Multithreaded GPU |
| Date | Presenter | Topic/Reading |
| Sep5 | (no meeting) | |
| Sep12 | (no meeting) | |
| Sep19 | (no meeting) | |
| Sep26 | (no meeting) | |
| Sep28 | IBM Visit, 1:30pm SFB560 | |
| Oct3 | ivan | BulkSC: bulk enforcement of sequential consistency |
| Oct10 | jeremy | A Study of a Transactional Parallel Routing Algorithm |
| Oct17 | chuck | CAS CDP practice talk |
| Oct22 | CASCON: | CDP workshop |
| Oct24 | (no meeting: CASCON!) | |
| Oct31 | (no meeting) | |
| Nov7 | kaveh | Matrix Scheduler Reloaded |
| Nov14 | ioana | Emulating Optimal Replacement with a Shepherd Cache |
| Nov21 | davidtam | MetaTM/TxLinux: Transactional Memory For An Operating System |
| Nov28 | myrto | Virtual Hierarchies to Support Server Consolidation |
| Dec5 | utku | |
| Dec12 | borys | Exterminator: Automatically Correcting Memory Errors with High Probability |
| Date | Presenter | Topic/Reading |
| Sep20 | davor | Architectural Semantics for Practical Transactional Memory and The Atomos Transactional Programming Language |
| Sep27 | ivan | POSH: a TLS compiler that exploits program structure |
| Oct4 | utku | Constructing Virtual Architectures on a Tiled Processor |
| Oct11 | patrick | Ultra Low-Cost Defect Protection for Microprocessor Pipelines |
| Oct18 | (no meeting---CASCON) | TBA |
| Oct25 | borys | A Decoupled KILO-Instruction Processor |
| Nov1 | chuck | Compiler and runtime support for efficient software transactional memory |
| Nov8 | greg | Tolerating Dependences Between Large Speculative Threads Via Sub-Threads |
| Nov15 | mihai | Hybrid Transactional Memory |
| Nov22 | marek | Software-based instruction caching for embedded processors |
| Nov29 | martin | Custom Code Generation for Soft Processors |
| Dec6 | ioana | A Case for MLP-Aware Cache Replacement |
| Dec13 | (no meeting) |
| Date | Presenter | Topic/Reading |
| May3 | (no meeting) | |
| May10 | ivan | CGO Review |
| May17 | utku | PPoPP Review (meet in SFB560!) |
| May24 | patrick | BranchTap: Improving Performance with Very Few Checkpoints Through Adaptive Speculation Control. |
| May31 | marek | Program Demultiplexing |
| Jun7 | borys | Profiling over Adaptive Ranges |
| Jun14 | (no meeting) | |
| Jun21 | (no meeting) | |
| Jun28 | chuck | WTM and TRANSACT workshop reviews |
| Jul5 | mihai | McRT-STM |
| Jul12 | ioana & jason | ISCA06 review |
| Jul19 | davidtam | Cooperative Caching for Chip Multiprocessors |
| Jul26 | martin | Bulk disambiguation of speculative threads |
| Aug2 | (no meeting) | |
| Aug9 | livio | TMA: A Trap-Based Memory Architecture |
| Aug15 | IBM meeting (at IBM) | David Tam, Reza Azimi, Alan Adamson (IBM) |
| Aug16 | (no meeting) | TBA |
| Aug23 | reza | Predicting Inter-Thread Cache Contention on a CMP |
| Date | Presenter | Topic/Reading |
| Dec13 | Josep Torrellas (at 11am, BA1190) | TBA |
| Dec6 | reza | Garbage Collection Without Paging |
| Nov29 | chuck | Characterization of TCC on Chip-Multiprocessors |
| Nov22 | borys | The V-Way Cache: Demand Based Associativity via Global Replacement |
| Nov15 | (no meeting) | |
| Nov8 | kirk | Compiling for Cell |
| Nov1 | andreas | RegionScout |
| Oct25 | mihai | Threads cannot be implemented as a library |
| Oct18 | no meet (cascon) | |
| Oct11(a) | Ben | Alternative Dispatch Techniques for the Tcl VM |
| Oct11(b) | Matthew | Mixed Mode Execution with Context Threading |
| Oct4 | davidtam | Maximizing CMP Throughput with Mediocre Cores |
| Sep27 | jason | Transparent Instruction Set Customization |
| Sep20 | martin | How to Fake 1000 Registers |
| Sep13 | (IBM, no meeting) |
| Date | Presenter | Topic | Reading |
| Aug30 | greg | Optimistic Intra-transaction Parallelism | 1 |
| Aug23 | ivan | Software watermarking | 1 |
| Aug16 | utku | Compiler Optimization Space | 1 |
| Aug9 | stanley | Temporal Streaming | 1 |
| Aug2 | jeff | Pool Allocation | 1 |
| Jul26 | chuck | Programming with Transactions | 1 |
| Jul19 | ioana | Transition Phase Classification/Prediction | 1 |
| Jul12 | davidtam | Multicore interconnects | 1 |
| Jul5 | borys | TBA | |
| Jun28 | (no meeting) | TBA | |
| Jun21 | (no meeting) | TBA | |
| Jun14 | petery | Soft Processors | 1 |
| Jun7 | reza | AccMon | 1 |
| May31 | kirk | Superword Parallelism | 1 |
| May24 | No meeting | TBA | |
| May19 | IBM (UofT) | TBA | |
| May17 | No meeting | TBA | |
| May10 | No meeting | TBA | |
| May3 | levon | Tracing Garbage Collection | 1 |
| Date | Presenter | Topic | Reading |
| Apr26 | ivan | TBA | Power Optimization for MLCA |
| Apr19 | stanley | TLS cache locality | |
| Apr12 | jeff | Probabilistic Pointer Analysis | 1 |
| Apr5 | chuck | Software TLS | |
| Mar29 | ioana | Region-Based Compilation for Java JIT | 1 |
| Mar22 | No meeting (IBM) | TBA | |
| Mar15 | matt | Context threading | 1 |
| Mar8 | reza | Online analysis with Perf Counters | 1 |
| Mar1 | david tam | SMT and CMP OS Power Management | 1 |
| Feb22 | gokhan memik | Clumsy Packet Processors | abstract |
| Feb15 | (reading week) | TBA | |
| Feb8 | kirk | Automatic Task Formation | 1 |
| Feb1 | martin | 10GB NIC | 1 |
| Jan25 | petery | SPREE | 1 |
| Jan18 | greg | Software TLS | 1 |
| Date | Topic | Presenter | Reading |
| dec6 | Ispike | Utku | 1 |
| nov29 | Data Race Detection | Borys | 1,2 |
| nov22 | Compiler Algorithm for Energy Reduction | Ivan | 1 |
| nov15 | Inlining Java Native Calls | levon | abstract |
| nov8 | Destination-Set Prediction | stanley | 1 |
| nov1 | Transactional Coherence | chuck | 1 |
| oct25 | Jumbo: runtime code gen for Java | mihai | 1,2 |
| oct18 | ASPLOS report | ASPLOS attendees | ASPLOS site |
| Oct 4 | Neural Branch Prediction | Ioana | 1 |
| Sep 20 | Pointer Analysis | Jeff | 1 |
| Date | Topic | Presenter | Reading |
| August 31 | No meeting | ? | |
| August 24 | Power awareness via dynamic optimization | Ioana | 1 |
| August 17 | First-Order Superscalar Processor Model | Reza | |
| August 10 | No meeting | ||
| August 3 | No meeting | TBA | |
| July 27 | CMPs with Execution Migration | David Tam | 1 |
| July 20 | Decomposition for TLS | Kirk | 1 |
| July 13 | Decompositions for Network Processing | Martin | (1,2) |
| July 6 | No Meeting | TBA | |
| June 29 | Compilation for TLS | Utku | TBA |
| June 22 | No meeting | ||
| June 15 | Garbage Collection Performance | Borys | 1 |
| June 8 | Half-Price Architecture | Peter | 1 |
| June 4 | IBM visit | TBA | |
| June 1 | No meeting | ||
| May 25 | CMP L2 Organizations | Greg | 1 |
| May 18 | Vector Threads | Andrew | 1 |
| May 11 | SMT P4 Observations | Ioana | 1 |
| May 4 | Speculating to reduce unnecessary power consumption | Ivan | 1 |
| Apr27 | Lifelong Program Analysis | Levon | 1 |
| Date | Topic | Presenter | Reading |
| Apr20 | Reusing Computation Results | Derek | 1 |
| Apr13 | IBM visits Toronto | ||
| Apr6 | Signature Buffer | Stanley | 1 |
| Mar30 | Global History Buffer | Chuck | 1 |
| Mar23 | Static Identification of Delinquent Loads | Mihai | 1 |
| Mar16 | Interaction Costs | Jeff D. | 1 |
| Mar2 | Optimizing Indirect Branch Prediction Accuracy in Virtual Machine Interpreters | Kirk | 1 |
| Feb24 | Meta Optimization | Utku | 1 |
| Feb3 | Scheduling for Custom Data Paths | Martin | 1 |
| Feb6 | Dynamic Profiling: (visit to IBM); More info on "Hardware Counters" by David Tam | Various | N/A |
| Feb10 | Single-ISA Heterogeneous Multi-Core Architectures: The Potential for Processor Power Reduction | Borys | 1 |
| Date | Topic | Presenter | Abstract | Reading | For Discussion |
| Oct 1: | "Self Optimizing Libraries" | Hamza Karamali | abstract | 1,2 | Hardware vs Software |
| Oct 8: | CASCON Compiler workshop | ||||
| Oct 15: | "Exploiting ILP, TLP, and DLP with the Polymorphous TRIPS Architecture" | Greg Steffan | abstract | 1, (additional info) | Wireless Inter-chip Communication: 1, 2 |
| Oct 29: | "XOM and XOMOS" | David Lie | abstract | 1 2 | Open source |
| Nov 5: | "Hardware Support for Prescient Instruction Prefetch" | Tor Aamodt | N/A | economics of distributed computing | |
| Dec 3: | "Introduction to OpenMP" | Mike Voss | OpenMP | ||
| Dec 16: | Special meeting with IBM to discuss SMT, the Power 5, and current compiler research at UofT |