Mandate: Faculty, students, and occasional special guest stars meet monthly to present and discuss ongoing research, relevant conference proceedings, and current events in the dual areas of compilers and computer architecture.
Readings: Readings to consider for future meetings, in architecture and compilers
Date | Time/Place | Presenter | Topic/Reading |
May09 | 1:30-4:30pm, at IBM | Amy Wang, IBM; Ivan Matosevic; Cedomir Segulja | Evaluation of Blue Gene/Q Hardware Support for Transactional Memories; Efficient bottom-up heap analysis for symbolic path-based data access summaries; Architectural Support for Synchronization-Free Deterministic Parallel Programming |
Date | Time/Place | Presenter | Topic/Reading |
Feb 29 | 1:30-4:30pm, at IBM | Alon Housfater, IBM; Islam Atta; Eric LaForest | Workload Optimized Multi-Stream Compression Accelerator on an FPGA; Parallel Tree Manipulation; Octavo: a 550MHz FPGA-based CPU, Towards an App-Specific Multicore |
Feb 13 | 11am-1pm SFB560 | Sheng Ma | Supporting Efficient Collective Communication in NoCs |
Feb 15 | 12pm-2pm BA4287 | Ioan Stefanovici | Cosmic Rays Don't Strike Twice: Understanding the Nature of DRAM Errors and the Implications for System Design |
Date | Presenter | Topic/Reading |
Oct 20 | 1:30pm in SFB540 | Roch Archambault, IBM on InfoSphere Streams; David Han, UofT on Reducing Branch Divergence in GPU Programs; Sheng Ma, UofT on DBAR: An Efficient Routing Algorithm to Support Multiple Concurrent Applications in Networks-on-Chip |
Nov07-10 | CASCON at Hilton Suites, Markham |
Date | Presenter | Topic/Reading |
Jan 10 | Cedomir | Tolerating Concurrency Bugs Using Transactions as Lifeguards Paper |
Jan 17 | (cancelled) | |
Jan 24 | Chuck | Twin peaks: a software platform for heterogeneous computing on general-purpose and graphics processors ACM page |
Jan 31 | (cancelled) | |
Feb 7 | Mihai | An Efficient Software Transactional Memory Using Commit-Time Invalidation Slides Paper ACM Page |
Feb 14 | Rob | Aergia: Exploiting Packet Latency Slack in On-Chip Networks Paper |
Feb 21 | (cancelled) | Family Day |
Feb 28 | Davor | "JIT Compilation to Commodity FPGAs: Challenges and Opportunities" |
Mar 7 | Diego | Semi-automatic extraction and exploitation of hierarchical pipeline parallelism using profiling information ACM page |
Mar 14 | Jason | RCDC: A Relaxed-Consistency Deterministic Computer Author Page |
Mar 21 | (cancelled) | |
Mar 28 | (cancelled) | |
Apr 4 | David | Single-chip Heterogeneous Computing: Does the future include Custom Logic, FPGAs, and GPUs? ACM Page |
Apr 11 | Ivan | OoOJava: Software Out-of-Order Execution Paper |
Apr 18 | Mark | Understanding Bloom Filter Intersection for Lazy Address-Set Disambiguation (Practice talk for SPAA) |
Apr 25 | Haojun | |
TBD | Eric | |
TBD | Sam |
Date | Presenter | Topic/Reading |
Sep 13 | (cancelled) | Moved to Thursday Sep. 16 |
Sep 16 | SFB560, 1:30pm | IBM Visiting Talk + talks by Cedomir and Jason |
Sep 20 | (cancelled) | |
Sep 27 | cedomir | WiDGET: Wisconsin Decoupled Grid Execution Tiles Paper |
Oct 04 | ivan | Accelerating Critical Section Execution with Asymmetric Multi-Core Architectures Paper |
Oct 11 | (cancelled) | Thanksgiving |
Oct 18 | mihai | Speculative Optimizations for Parallel Programs on Multicores Paper |
Oct 25 | diego | The Paralax infrastructure: automatic parallelization with a helping hand ACM Page |
Nov 01 | (cancelled) | CASCON 2010 |
Nov 08 | (cancelled) | |
Nov 15 | haojun | Flexible Architectural Support for Fine-Grain Scheduling ACM Page |
Nov 22 | rob | Forwardflow: A Scalable Core for Power-Constrained CMPs ACM Page |
Nov 29 | utku | Practice Presentation for MICRO: Hardware Support For Relaxed Concurrency in Transactional Memory |
Dec 06 | sam | Architecture of the Intel Single Chip Cloud Slides |
Dec 13 | jason | Understanding Sources of Inefficiency in General-Purpose Chips Paper Slides |
TBD | davor | |
TBD | chuck |
Date | Presenter | Topic/Reading |
May 03 | jason | Rethinking DRAM Design and Organization for Energy-Constrained Multi-Cores Paper |
May 10 | utku | Boosting Single-Thread Performance in Multi-core Systems through Fine-Grain Multi-Threading Paper |
May 17 | (cancelled) | First Local Workshop on Better Programming Models for FPGAs |
May 24 | (cancelled) | Victoria Day |
May 31 | ivan | Exploiting Statistical Correlations for Proactive Prediction of Program Behaviors Paper |
Jun 07 | danyao | DART: Fast and Flexible NoC Simulation using FPGAs (Practice for ISCA) |
Jun 14 | cedomir | Kendo: Efficient Deterministic Multithreading in Software Paper |
Jun 21 | mihai | Software Data Spreading: Leveraging Distributed Caches to Improve Single Thread Performance ACM Page |
Jun 28 | diego | Concurrency Checking with CHESS: Learning from Experience Slides |
Jul 05 | (cancelled) | |
Jul 12 | chuck | A Real System Evaluation of Hardware Atomicity for Software Speculation Paper |
Jul 19 | davor | DMP: deterministic shared memory multiprocessing ACM page |
Jul 26 | (cancelled) | |
Aug 02 | (cancelled) | Civic Holiday |
Aug 09 | haojun | FPGA Partial Reconfiguration |
Aug 16 | rob | Elastic Cooperative Caching: An Autonomous Dynamically Adaptive Memory Hierarchy for Chip Multiprocessor Paper |
Aug 23 | davidhan | Debunking the 100x GPU vs. CPU Myth: An Evaluation of Throughput Computing on CPU and GPU ACM page (alt. PDF) |
Aug 30 | eric | C is for circuits: capturing FPGA circuits as sequential code for portability ACM page |
Date | Presenter | Topic/Reading |
Jan 11 | ivan | SigRace: Signature-Based Data Race Detection Paper |
Jan 18 | davidhan | PetaBricks: A Language and Compiler for Algorithmic Choice Paper |
Jan 25 | (cancelled) | |
cedomir | Helper locks for fork-join parallel programming Paper | |
mihai | Stretching Transactional Memory Paper | |
Feb 15 | (cancelled: Reading Week) | |
Feb 22 | davor | Performance and power of cache-based reconfigurable computing Paper |
Mar 01 | (cancelled: practice talk) | |
Mar 08 | martinl | Application-Specific Signatures for Transactional Memory in Soft Processors Paper |
Mar 15 | borys | A Lightweight In-Place Implementation for Software Thread-Level Speculation Paper |
Mar 22 | diego | Parallelizing sequential applications on commodity hardware using a low-cost software transactional memory Paper |
Mar 29 | chuck | Does Cache Sharing on Modern CMP Matter to the Performance of Contemporary Multithreaded Programs? Paper |
Apr 05 | danyao | Polymorphic On-Chip Networks Paper |
Apr 12 | (cancelled) | |
Apr 19 | myrto | Inter-Core Cooperative TLB Prefetchers for Chip Multiprocessors Paper |
Apr 26 | (cancelled: CGO conference) |
Date | Presenter | Topic/Reading |
Sep21 | myrto | Thread Criticality Predictors for Dynamic Performance, Power, and Resource Management in Chip Multiprocessors Paper |
Sep28 | danyao | A memory system design framework: creating smart memories Paper |
Oct05 | chuck | Binary Analysis for Measurement and Attribution of Program Performance Paper |
Oct12 | (Thanksgiving: no meeting) | |
Oct16 | (Friday visit to IBM) | Bob Blainey: Practical experience in building scalable data structures in Amino Abstract; Steven Birk: Parallelizing FPGA CAD Using Transactional Memory Abstract |
Oct19 | jason | Scaling the Bandwidth Wall: Challenges in and Avenues for CMP Scaling Paper |
Oct26 | eric | Decoupled DIMM: Building High-Bandwidth Memory System Using Low-Speed DRAM Devices Paper |
Nov02 | (CASCON: no meeting) | 8th Workshop on Compiler-Driven Performance Program; Workshop: Challenges for Parallel Computing Program |
Nov09 | (CANCELLED: no meeting) | |
Nov16 | (CANCELLED: no meeting) | |
Nov23 | (CANCELLED: no meeting) | |
Nov30 | CIDER/CARG Joint Meeting | |
Dec7 | utku | Simultaneous speculative threading: a novel pipeline architecture implemented in Sun's Rock processor Paper |
Date | Presenter | Topic/Reading |
May06 | davidtam | Ubiquitous Memory Introspection Abstract Paper Slides |
May13 | (no meeting) | |
May20 | eric | SoC-C: efficient programming abstractions for heterogeneous multicore systems on chip ACM page |
May27 | jason | A Tagless Coherence Directory Paper |
Jun03 | reza | OpenMP to GPGPU: a compiler framework for automatic translation and optimization ACM page |
Jun10 | chuck | Tolerating Delinquent Loads with Speculative Execution Paper Slides |
Jun17 | utku | Dependence-Aware Transactional Memory for Increased Concurrency Paper Slides |
Jun24 | ivan | Early experience with a commercial hardware transactional memory implementation ACM page |
Jul01 | (Canada Day: no meeting) | |
Jul08 | davidhan | Future-Proof Data Parallel Algorithms and Software on Intel Multi-Core Architecture Paper |
Jul15 | mihai | Is Transactional Programming Actually Easier? Conference Paper Author's Page (same paper) |
Jul22 | davor | Architectural implications of nanoscale integrated sensing and computing ACM page |
Jul29 | (no meeting) | |
Aug05 | (no meeting) | |
Aug12 | diego | Towards a holistic approach to auto-parallelization: integrating profile-driven parallelism detection and machine-learning based mapping ACM page |
Aug19 | martinl | Fast Critical Sections via Thread Scheduling for FPGA-based Multithreaded Processors Abstract |
Aug26 | borys | The Use of Hardware Transactional Memory for the Trace-Based Parallelization of Recursive Java Programs Paper |
Sep02 | steven | Parallelizing FPGA CAD Using Transactional Memory |
TBD | myrto | |
TBD | danyao | |
TBD | kaveh |
Date | Presenter | Topic/Reading |
May05 | borys | Efficient Context-Sensitive Shape Analysis with Graph Based Heap Models |
May12 | (no meeting) | |
May19 | (no meeting) | |
May26 | livio | SoftSig: Software-Exposed Hardware Signatures for Code Analysis and Optimization |
Jun02 | (no meeting) | |
Jun09 | utku | Split Hardware Transactions |
Jun16 | eric | An Adaptive And Scalable Multiprocessor System For Xilinx FPGAs Using Minimal Sized Processor Core |
Jun23 | (no meeting) | |
Jun30 | (no meeting) | |
Jul07 | jason | Accelerating Two-Dimensional Page Walks for Virtualized Systems |
Jul14 | ioana | Atom-Aid: Surviving and Detecting Atomicity Violations |
Jul21 | (no meeting) | |
Jul28 | Babak Falsafi (2pm SFB560) | ProtoFlex: A Hybrid Full-System Emulator for Large-Scale Multiprocessors |
Aug04 | (no meeting--holiday) | |
Aug11 | ivan | Inferring Locks for Atomic Sections |
Aug18 | mihai | Software Transactional Memory for Large Clusters |
Aug25 | davidhan | Program Optimization Space Pruning for a Multithreaded GPU |
Date | Presenter | Topic/Reading |
Sep5 | (no meeting) | |
Sep12 | (no meeting) | |
Sep19 | (no meeting) | |
Sep26 | (no meeting) | |
Sep28 | IBM Visit, 1:30pm SFB560 | |
Oct3 | ivan | BulkSC: bulk enforcement of sequential consistency |
Oct10 | jeremy | A Study of a Transactional Parallel Routing Algorithm |
Oct17 | chuck | CAS CDP practice talk |
Oct22 | CASCON: | CDP workshop |
Oct24 | (no meeting: CASCON!) | |
Oct31 | (no meeting) | |
Nov7 | kaveh | Matrix Scheduler Reloaded |
Nov14 | ioana | Emulating Optimal Replacement with a Shepherd Cache |
Nov21 | davidtam | MetaTM/TxLinux: Transactional Memory For An Operating System |
Nov28 | myrto | Virtual Hierarchies to Support Server Consolidation |
Dec5 | utku | |
Dec12 | borys | Exterminator: Automatically Correcting Memory Errors with High Probability |
Date | Presenter | Topic/Reading |
Sep20 | davor | Architectural Semantics for Practical Transactional Memory and The Atomos Transactional Programming Language |
Sep27 | ivan | POSH: a TLS compiler that exploits program structure |
Oct4 | utku | Constructing Virtual Architectures on a Tiled Processor |
Oct11 | patrick | Ultra Low-Cost Defect Protection for Microprocessor Pipelines |
Oct18 | (no meeting---CASCON) | TBA |
Oct25 | borys | A Decoupled KILO-Instruction Processor |
Nov1 | chuck | Compiler and runtime support for efficient software transactional memory |
Nov8 | greg | Tolerating Dependences Between Large Speculative Threads Via Sub-Threads |
Nov15 | mihai | Hybrid Transactional Memory |
Nov22 | marek | Software-based instruction caching for embedded processors |
Nov29 | martin | Custom Code Generation for Soft Processors |
Dec6 | ioana | A Case for MLP-Aware Cache Replacement |
Dec13 | (no meeting) |
Date | Presenter | Topic/Reading |
May3 | (no meeting) | |
May10 | ivan | CGO Review |
May17 | utku | PPoPP Review (meet in SFB560!) |
May24 | patrick | BranchTap: Improving Performance with Very Few Checkpoints Through Adaptive Speculation Control. |
May31 | marek | Program Demultiplexing |
Jun7 | borys | Profiling over Adaptive Ranges |
Jun14 | (no meeting) | |
Jun21 | (no meeting) | |
Jun28 | chuck | WTM and TRANSACT workshop reviews |
Jul5 | mihai | McRT-STM |
Jul12 | ioana & jason | ISCA06 review |
Jul19 | davidtam | Cooperative Caching for Chip Multiprocessors |
Jul26 | martin | Bulk disambiguation of speculative threads |
Aug2 | (no meeting) | |
Aug9 | livio | TMA: A Trap-Based Memory Architecture |
Aug15 | IBM meeting (at IBM) | David Tam, Reza Azimi, Alan Adamson (IBM) |
Aug16 | (no meeting) | TBA |
Aug23 | reza | Predicting Inter-Thread Cache Contention on a CMP |
Date | Presenter | Topic/Reading |
Dec13 | Josep Torrellas (at 11am, BA1190) | TBA |
Dec6 | reza | Garbage Collection Without Paging |
Nov29 | chuck | Characterization of TCC on Chip-Multiprocessors |
Nov22 | borys | The V-Way Cache: Demand Based Associativity via Global Replacement |
Nov15 | (no meeting) | |
Nov8 | kirk | Compiling for Cell |
Nov1 | andreas | RegionScout |
Oct25 | mihai | Threads cannot be implemented as a library |
Oct18 | (no meeting: CASCON) | |
Oct11(a) | Ben | Alternative Dispatch Techniques for the Tcl VM |
Oct11(b) | Matthew | Mixed Mode Execution with Context Threading |
Oct4 | davidtam | Maximizing CMP Throughput with Mediocre Cores |
Sep27 | jason | Transparent Instruction Set Customization |
Sep20 | martin | How to Fake 1000 Registers |
Sep13 | (IBM, no meeting) |
Date | Presenter | Topic | Reading |
Aug30 | greg | Optimistic Intra-transaction Parallelism | 1 |
Aug23 | ivan | Software watermarking | 1 |
Aug16 | utku | Compiler Optimization Space | 1 |
Aug9 | stanley | Temporal Streaming | 1 |
Aug2 | jeff | Pool Allocation | 1 |
Jul26 | chuck | Programming with Transactions | 1 |
Jul19 | ioana | Transition Phase Classification/Prediction | 1 |
Jul12 | davidtam | Multicore interconnects | 1 |
Jul5 | borys | TBA | |
Jun28 | (no meeting) | TBA | |
Jun21 | (no meeting) | TBA | |
Jun14 | petery | Soft Processors | 1 |
Jun7 | reza | AccMon | 1 |
May31 | kirk | Superword Parallelism | 1 |
May24 | No meeting | TBA | |
May19 | IBM (UofT) | TBA | |
May17 | No meeting | TBA | |
May10 | No meeting | TBA | |
May3 | levon | Tracing Garbage Collection | 1 |
Date | Presenter | Topic | Reading |
Apr26 | ivan | TBA | Power Optimization for MLCA |
Apr19 | stanley | TLS cache locality | |
Apr12 | jeff | Probabilistic Pointer Analysis | 1 |
Apr5 | chuck | Software TLS | |
Mar29 | ioana | Region-Based Compilation for Java JIT | 1 |
Mar22 | No meeting (IBM) | TBA | |
Mar15 | matt | Context threading | 1 |
Mar8 | reza | Online analysis with Perf Counters | 1 |
Mar1 | david tam | SMT and CMP OS Power Management | 1 |
Feb22 | gokhan memik | Clumsy Packet Processors | abstract |
Feb15 | (reading week) | TBA | |
Feb8 | kirk | Automatic Task Formation | 1 |
Feb1 | martin | 10GB NIC | 1 |
Jan25 | petery | SPREE | 1 |
Jan18 | greg | Software TLS | 1 |
Date | Topic | Presenter | Reading |
dec6 | Ispike | Utku | 1 |
nov29 | Data Race Detection | Borys | 1,2 |
nov22 | Compiler Algorithm for Energy Reduction | Ivan | 1 |
nov15 | Inlining Java Native Calls | levon | abstract |
nov8 | Destination-Set Prediction | stanley | 1 |
nov1 | Transactional Coherence | chuck | 1 |
oct25 | Jumbo: runtime code gen for Java | mihai | 1,2 |
oct18 | ASPLOS report | ASPLOS attendees | ASPLOS site |
Oct 4 | Neural Branch Prediction | Ioana | 1 |
Sep 20 | Pointer Analysis | Jeff | 1 |
Date | Topic | Presenter | Reading |
August 31 | No meeting | ? | |
August 24 | Power awareness via dynamic optimization | Ioana | 1 |
August 17 | First-Order Superscalar Processor Model | Reza | |
August 10 | No meeting | ||
August 3 | No meeting | TBA | |
July 27 | CMPs with Execution Migration | David Tam | 1 |
July 20 | Decomposition for TLS | Kirk | 1 |
July 13 | Decompositions for Network Processing | Martin | (1,2) |
July 6 | No Meeting | TBA | |
June 29 | Compilation for TLS | Utku | TBA |
June 22 | No meeting | ||
June 15 | Garbage Collection Performance | Borys | 1 |
June 8 | Half-Price Architecture | Peter | 1 |
June 4 | IBM visit | TBA | |
June 1 | No meeting | ||
May 25 | CMP L2 Organizations | Greg | 1 |
May 18 | Vector Threads | Andrew | 1 |
May 11 | SMT P4 Observations | Ioana | 1 |
May 4 | Speculating to reduce unnecessary power consumption | Ivan | 1 |
Apr27 | Lifelong Program Analysis | Levon | 1 |
Date | Topic | Presenter | Reading |
Apr20 | Reusing Computation Results | Derek | 1 |
Apr13 | IBM visits Toronto | ||
Apr6 | Signature Buffer | Stanley | 1 |
Mar30 | Global History Buffer | Chuck | 1 |
Mar23 | Static Identification of Delinquent Loads | Mihai | 1 |
Mar16 | Interaction Costs | Jeff D. | 1 |
Mar2 | Optimizing Indirect Branch Prediction Accuracy in Virtual Machine Interpreters | Kirk | 1 |
Feb24 | Meta Optimization | Utku | 1 |
Feb3 | Scheduling for Custom Data Paths | Martin | 1 |
Feb6 | Dynamic Profiling: (visit to IBM); More info on "Hardware Counters" by David Tam | Various | N/A |
Feb10 | Single-ISA Heterogeneous Multi-Core Architectures: The Potential for Processor Power Reduction | Borys | 1 |
Date | Topic | Presenter | Abstract | Reading | For Discussion |
Oct 1: | "Self Optimizing Libraries" | Hamza Karamali | abstract | 1,2 | Hardware vs Software |
Oct 8: | CASCON Compiler workshop | ||||
Oct 15: | "Exploiting ILP, TLP, and DLP with the Polymorphous TRIPS Architecture" | Greg Steffan | abstract | 1, (additional info) | Wireless Inter-chip Communication: 1, 2 |
Oct 29: | "XOM and XOMOS" | David Lie | abstract | 1 2 | Open source |
Nov 5: | "Hardware Support for Prescient Instruction Prefetch" | Tor Aamodt | N/A | economics of distributed computing | |
Dec 3: | "Introduction to OpenMP" | Mike Voss | OpenMP | ||
Dec 16: | Special meeting with IBM to discuss SMT, the Power 5, and current compiler research at UofT |