Compiler and Architecture Reading Group (CARG)

Regular Meetings: Mondays 12-1:30pm BA4287

Mandate: Faculty, students, and occasional special guest stars meet weekly to present and discuss ongoing research, relevant conference proceedings, and current events in the dual areas of compilers and computer architecture.

Readings: Readings to consider for future meetings, in architecture and compilers

Feedback: Suggest readings, volunteer to present, and arrange rides to IBM meetings here

You are Invited!

Please send email to Eric LaForest (laforest at eecg.toronto.edu) if you are:

Winter 2010

Date Presenter Topic/Reading
Jan 11ivan SigRace: Signature-Based Data Race Detection Paper
Jan 18davidhan PetaBricks: A Language and Compiler for Algorithmic Choice Paper
Jan 25(cancelled)
Feb 1cedomir Helper locks for fork-join parallel programming Paper
Feb 8mihai Stretching Transactional Memory Paper
Feb 15davor
Feb 22steven
Mar 01borys
Mar 08martinl
Mar 15livio
Mar 22diego
Mar 29myrto
Apr 05danyao
Apr 12chuck Does Cache Sharing on Modern CMP Matter to the Performance of Contemporary Multithreaded Programs? Paper
Apr 19jason
Apr 26eric
May 03utku

Fall 2009

Date Presenter Topic/Reading
Sep21myrto Thread Criticality Predictors for Dynamic Performance, Power, and Resource Management in Chip Multiprocessors Paper
Sep28danyao A memory system design framework: creating smart memories Paper
Oct05chuck Binary Analysis for Measurement and Attribution of Program Performance Paper
Oct12(Thanksgiving: no meeting)
Oct16(Friday visit to IBM) Bob Blainey: Practical experience in building scalable data structures in Amino Abstract
Steven Birk: Parallelizing FPGA CAD Using Transactional Memory Abstract
Oct19jason Scaling the Bandwidth Wall: Challenges in and Avenues for CMP Scaling Paper
Oct26eric Decoupled DIMM: Building High-Bandwidth Memory System Using Low-Speed DRAM Devices Paper
Nov02(CASCON: no meeting) 8th Workshop on Compiler-Driven Performance Program
Workshop: Challenges for Parallel Computing Program
Nov09(CANCELLED: no meeting)
Nov16(CANCELLED: no meeting)
Nov23(CANCELLED: no meeting)
Nov30CIDER/CARG Joint Meeting
Dec7utku Simultaneous speculative threading: a novel pipeline architecture implemented in Sun's Rock processor Paper

Summer 2009

Date Presenter Topic/Reading
May06davidtam Ubiquitous Memory Introspection Abstract Paper Slides
May13(no meeting)
May20eric SoC-C: efficient programming abstractions for heterogeneous multicore systems on chip ACM page
May27jason A Tagless Coherence Directory Paper
Jun03reza OpenMP to GPGPU: a compiler framework for automatic translation and optimization ACM page
Jun10chuck Tolerating Delinquent Loads with Speculative Execution Paper Slides
Jun17utku Dependence-Aware Transactional Memory for Increased Concurrency Paper Slides
Jun24ivan Early experience with a commercial hardware transactional memory implementation ACM page
Jul01(Canada Day: no meeting)
Jul08davidhan Future-Proof Data Parallel Algorithms and Software on Intel Multi-Core Architecture Paper
Jul15mihai Is Transactional Programming Actually Easier? Conference Paper Author's Page (same paper)
Jul22davor Architectural implications of nanoscale integrated sensing and computing ACM page
Jul29(no meeting)
Aug05(no meeting)
Aug12diego Towards a holistic approach to auto-parallelization: integrating profile-driven parallelism detection and machine-learning based mapping ACM page
Aug19martinl Fast Critical Sections via Thread Scheduling for FPGA-based Multithreaded Processors Abstract
Aug26borys The Use of Hardware Transactional Memory for the Trace-Based Parallelization of Recursive Java Programs Paper
Sep02steven Parallelizing FPGA CAD Using Transactional Memory
TBDmyrto
TBDdanyao
TBDkaveh

Spring 2009

Date Presenter Topic/Reading
Jan07(no meeting)
Jan14jason Temporal Instruction Fetch Streaming
Jan21(no meeting)
Jan22 1:30pm SFB560: IBM visits UofT David Tam (UofT): RapidMRC: Approximating L2 Miss Rate Curves on IBM POWER5 Systems for Online Optimizations
Kit Barton (IBM): Improving Access to Shared Data in a Partitioned Global Address Space Programming Model
Jan28reza Merge: a programming model for heterogeneous multi-core systems
Feb04davor Kiwi: Synthesis of FPGA Circuits from Parallel Programs
Feb11utku Hardware Support For Serializable Transactions: A Study of Feasibility and Performance
Feb18(no meeting, reading week)
Feb25ivan Hardbound: architectural support for spatial safety of the C programming language
Mar04davidhan hiCUDA: A High-level Directive-based Language for GPU Programming
Mar11mihai Return of the Read-Write Lock
Mar18chuck CTrigger: exposing atomicity violation bugs from their hiding places
Mar25steven Maximum Benefit from a Minimal HTM
Apr01kaveh "Low-Power, High-Performance Analog Neural Branch Prediction"
Apr08martin Adaptive transaction scheduling for transactional memory systems
Apr15(no meeting)
Apr22myrto Techniques for Bandwidth-Efficient Prefetching of Linked Data Structures in Hybrid Prefetching Systems
Apr29borys The Use of Hardware Transactional Memory for the Trace-Based Parallelization of Recursive Java Programs

Fall 2008

Date Presenter Topic/Reading
Sep03(no meeting)
Sep10martin Pipelined Execution of Critical Sections Using Software-Controlled Caching in Network Processors
Sep17jeremy Function level parallelism driven by data dependencies
Sep24kaveh Counting Dependence Predictors
Oct01(no meeting)
Oct08peter VESPA: Portable, Scalable, and Flexible FPGA-Based Vector Processors
Oct15chuck Thread-Safe Dynamic Binary Translation using Transactional Memory
Oct22myrto Prefetch-Aware DRAM Controllers
Oct29(no meeting: pact/cascon)
Nov05livio Reducing the Harmful Effects of Last-Level Cache Polluters with an OS-Level, Software-Only Pollute Buffer
Nov12borys Latency-tolerant software pipelining in a production compiler
Nov19(no meeting)
Nov26Clark Verbrugge, McGill There's Nothing Wrong with Out-of-Thin-Air: Compiler Optimization and Memory Models
Dec03Farshad Khunjush, U Victoria Architectural Enhancement for Message Passing Interconnects
Dec10eric A Library and Platform for FPGA Bitstream Manipulation
Dec17(no meeting)

Summer 2008

Date Presenter Topic/Reading
May05borys Efficient Context-Sensitive Shape Analysis with Graph Based Heap Models
May12(no meeting)
May19(no meeting)
May26livio SoftSig: Software-Exposed Hardware Signatures for Code Analysis and Optimization
Jun02(no meeting)
Jun09utku Split Hardware Transactions
Jun16eric An Adaptive And Scalable Multiprocessor System For Xilinx FPGAs Using Minimal Sized Processor Core
Jun23(no meeting)
Jun30(no meeting)
Jul07jason Accelerating Two-Dimensional Page Walks for Virtualized Systems
Jul14ioana Atom-Aid: Surviving and Detecting Atomicity Violations
Jul21(no meeting)
Jul28Babak Falsafi (2pm SFB560) ProtoFlex: A Hybrid Full-System Emulator for Large-Scale Multiprocessors
Aug04(no meeting--holiday)
Aug11ivan Inferring Locks for Atomic Sections
Aug18mihai Software Transactional Memory for Large Clusters
Aug25davidhan Program Optimization Space Pruning for a Multithreaded GPU

Spring 2008

Date Presenter Topic/Reading
Jan7 (no meeting)
Jan14livio Impact of Cache Coherence Protocols on the Processing of Network Traffic
Jan21jason Self-calibrating Online Wearout Detection
Jan28ivan Exploiting coarse-grained task, data, and pipeline parallelism in stream programs
Feb4 martin Multitasking Workload Scheduling on Flexible Core Chip Multiprocessors
Feb11chuck Lengthening Traces to Improve Opportunities for Dynamic Optimization (practice talk)
Feb18(no meeting)
Feb25mihai The Potential for Variable-Granularity Access Tracking for Optimistic Parallelism (practice talk)
Mar3 (no meeting)
Mar10jeremy Uncovering Hidden Loop Level Parallelism in Sequential Applications
Mar13Yiannakis Sazeides, U of Cyprus The Significance of Affectors and Affectees Correlations for Branch Prediction
Mar17kaveh Performance-Aware Speculation Control using Wrong Path Usefulness Prediction
Mar20Visit IBM: 1:30pm Bill Hay on Power6; Alan Adamson on minimizing power; Ioana Burcea on virtualizing predictors
Mar24(no meeting)
Mar25IBM special lecture: BA1200 10am-12pm Mark Mendel on "single-source compiler for cell";
Allan Kielstra on "Testarosa JIT in a dynamic compiler for static languages"
Mar31davidtam Gaining Insights into Multi-Core Cache Partitioning: Bridging the Gap between Simulation and Real Systems
Apr7 martin FCCM practice talk: Scaling Soft Processor Systems
Apr14 4pm GB220: Khaled Z. Ibrahim (IRISA/INRIA) Parallel Computing from Specialty to Ubiquity
Apr21myrto Power-Efficient DRAM speculation
Apr28(no meeting)

Fall 2007

Date Presenter Topic/Reading
Sep5(no meeting)
Sep12(no meeting)
Sep19(no meeting)
Sep26(no meeting)
Sep28IBM Visit, 1:30pm SFB560
Oct3 ivan BulkSC: bulk enforcement of sequential consistency
Oct10jeremy A Study of a Transactional Parallel Routing Algorithm
Oct17chuck CAS CDP practice talk
Oct22CASCON: CDP workshop
Oct24(no meeting: CASCON!)
Oct31(no meeting)
Nov7 kaveh Matrix Scheduler Reloaded
Nov14ioana Emulating Optimal Replacement with a Shepherd Cache
Nov21davidtam MetaTM/TxLinux: Transactional Memory For An Operating System
Nov28myrto Virtual Hierarchies to Support Server Consolidation
Dec5 utku
Dec12borys Exterminator: Automatically Correcting Memory Errors with High Probability

Summer 2007

Date Presenter Topic/Reading
May2 jeremy Automatic Thread Extraction with Decoupled Software Pipelining
May9 ioana Fair Queuing Memory Systems
May16 jason (SFB560!) Architectural Implications of Brick and Mortar Silicon Manufacturing
May23 dario Line Distillation: Increasing Cache Capacity by Filtering Unused Words in Cache lines
May30 marek Masters thesis work
Jun6 davidtam Managing Shared L2 Caches on Multicore Systems in Software
Jun13 (no meeting) TBD
Jun20 (no meeting) TBD
Jun27 myrto Adaptive Insertion Policies for High-Performance Caching
Jul4 utku Understanding Tradeoffs in Software Transactional Memory
Jul11 borys Sensitivity Analysis for Automatic Parallelization on Multi-Cores
Jul18 jason Mechanisms for Store-wait-free Multiprocessors
Jul25 davor Raksha: A Flexible Information Flow Architecture for Software Security
Aug1 mihai An Effective Hybrid Transactional Memory System with Strong Isolation Guarantees
Aug8 martin Improving Pipelined Soft Processors with Multithreading
Aug15 (no meeting) TBD
Aug22 (no meeting) TBD
Aug29 (no meeting) TBD

Spring 2007

Date Presenter Topic/Reading
Jan17jason Improving Multiple-CMP Systems Using Token Coherence
Jan24Chris Pickett (McGill) libspmt: A Library for Speculative Multithreading
Jan31reza Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches
Feb7 livio Rapidly Selecting Good Compiler Optimizations using Performance Counters
Feb14davidtam Managing Distributed, Shared L2 Caches through OS-Level Page Allocation
Feb21(no meeting, reading week)
Feb28Amir Roth, U Penn. (CANCELLED) SQIP to my (LS)Q: Rethinking Loads and Stores in Out-of-Order Microarchitectures
Mar7 myrto LogTM-SE: Decoupling Hardware Transactional Memory from Caches
Mar9 IBM visit (at IBM) Robert Enenkel (IBM) on optimizing math libraries
Marek (UofT): "Dynamic x86 rewriting and some applications"
Mar14borys Exploiting Postdominance for Speculative Parallelization
Mar20Allan Kielstra and Marcel Mitran (IBM) 10-12pm in BA1200 Compiling in the Real World, IBM Compiler Technologies, and Job Opportunities
Mar21utku Implicit Parallelism with Ordered Transactions
Mar28ivan Extending Multicore Architectures to Exploit Hybrid Parallelism in Single-Thread Applications
Apr4 davor Diverge-Merge Processor (DMP)
Apr11mihai Tight analysis of the performance potential of thread speculation using SPEC CPU 2006
Apr18(no meeting) TBA
Apr25marek Code Generation and Optimization for Transactional Memory Constructs in an Unmanaged Language

Fall 2006

Date Presenter Topic/Reading
Sep20davor Architectural Semantics for Practical Transactional Memory
and The Atomos Transactional Programming Language
Sep27ivan POSH: a TLS compiler that exploits program structure
Oct4 utku Constructing Virtual Architectures on a Tiled Processor
Oct11patrick Ultra Low-Cost Defect Protection for Microprocessor Pipelines
Oct18(no meeting: CASCON) TBA
Oct25borys A Decoupled KILO-Instruction Processor
Nov1 chuck Compiler and runtime support for efficient software transactional memory
Nov8 greg Tolerating Dependences Between Large Speculative Threads Via Sub-Threads
Nov15mihai Hybrid Transactional Memory
Nov22marek Software-based instruction caching for embedded processors
Nov29martin Custom Code Generation for Soft Processors
Dec6 ioana A Case for MLP-Aware Cache Replacement
Dec13(no meeting)

Summer 2006

Date Presenter Topic/Reading
May3 (no meeting)
May10ivan CGO Review
May17utku PPoPP Review (meet in SFB560!)
May24patrick BranchTap: Improving Performance with Very Few Checkpoints Through Adaptive Speculation Control.
May31marek Program Demultiplexing
Jun7 borys Profiling over Adaptive Ranges
Jun14(no meeting)
Jun21(no meeting)
Jun28chuck WTM and TRANSACT workshop reviews
Jul5 mihai McRT-STM
Jul12ioana & jason ISCA06 review
Jul19davidtam Cooperative Caching for Chip Multiprocessors
Jul26martin Bulk disambiguation of speculative threads
Aug2 (no meeting)
Aug9 livio TMA: A Trap-Based Memory Architecture
Aug15IBM meeting (at IBM) David Tam, Reza Azimi, Alan Adamson (IBM)
Aug16(no meeting) TBA
Aug23reza Predicting Inter-Thread Cache Contention on a CMP

Spring 2006

Date Presenter Topic/Reading
Jan18utku Transactional execution of java programs
Jan25mihai A Qualitative Survey of Modern Software Transactional Memory Systems
Feb1 ioana LogTM: Log-based Transactional Memory
Feb8 peter Application-Specific Customization of Soft Processor Microarchitecture
Feb15(no meeting) TBA
Feb22(no meeting) TBA
Mar1 davor A High Throughput String Matching Architecture for Intrusion Detection and Prevention
Mar8 jason ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing
Mar15davidtam MESA: Reducing Cache Conflicts by Integrating Static and Run-Time Methods
Mar23 1:30pm IBM (at UofT) TBA
Mar29Andy Tanenbaum, SF1105 MINIX 3
Mar31Babak Falsafi, 10am SFB560 TRUSS: Reliable, Scalable Server Architecture
Apr5 kirk Stream Programming on General-Purpose Processors
Apr12martin Automatic Multithreading and Multiprocessing of C Programs for IXP
Apr19reza Optimizing Replication, Communication, and Capacity Allocation in CMPs
Apr26(no meeting) TBA

Fall 2005

Date Presenter Topic/Reading
Dec13 Josep Torrellas (at 11am, BA1190) TBA
Dec6reza Garbage Collection Without Paging
Nov29chuck Characterization of TCC on Chip-Multiprocessors
Nov22borys The V-Way Cache: Demand Based Associativity via Global Replacement
Nov15(no meeting)
Nov8 kirk Compiling for Cell
Nov1 andreas RegionScout
Oct25mihai Threads cannot be implemented as a library
Oct18no meet (cascon)
Oct11(a)Ben Alternative Dispatch Techniques for the Tcl VM
Oct11(b)Matthew Mixed Mode Execution with Context Threading
Oct4 davidtam Maximizing CMP Throughput with Mediocre Cores
Sep27jason Transparent Instruction Set Customization
Sep20martin How to Fake 1000 Registers
Sep13(IBM, no meeting)

Summer 2005

Date Presenter Topic Reading
Aug30 greg Optimistic Intra-transaction Parallelism 1
Aug23 ivan Software watermarking 1
Aug16 utku Compiler Optimization Space 1
Aug9 stanley Temporal Streaming 1
Aug2 jeff Pool Allocation 1
Jul26 chuck Programming with Transactions 1
Jul19 ioana Transition Phase Classification/Prediction 1
Jul12 davidtam Multicore interconnects 1
Jul5 borys TBA
Jun28 (no meeting) TBA
Jun21 (no meeting) TBA
Jun14 petery Soft Processors 1
Jun7 reza AccMon 1
May31 kirk Superword Parallelism 1
May24 No meeting TBA
May19 IBM (UofT) TBA
May17 No meeting TBA
May10 No meeting TBA
May3 levon Tracing Garbage Collection 1

Spring 2005

Date Presenter Topic Reading
Apr26 ivan TBA Power Optimization for MLCA
Apr19 stanley TLS cache locality
Apr12 jeff Probabilistic Pointer Analysis 1
Apr5 chuck Software TLS
Mar29 ioana Region-Based Compilation for Java JIT 1
Mar22 No meeting (IBM) TBA
Mar15 matt Context threading 1
Mar8 reza Online analysis with Perf Counters 1
Mar1 david tam SMT and CMP OS Power Management 1
Feb22 gokhan memik Clumsy Packet Processors abstract
Feb15 (reading week) TBA
Feb8 kirk Automatic Task Formation 1
Feb1 martin 10GB NIC 1
Jan25 petery SPREE 1
Jan18 greg Software TLS 1

Fall 2004

Date Topic Presenter Reading
dec6 Ispike Utku 1
nov29 Data Race Detection Borys 1,2
nov22 Compiler Algorithm for Energy Reduction Ivan 1
nov15 Inlining Java Native Calls levon abstract
nov8 Destination-Set Prediction stanley 1
nov1 Transactional Coherence chuck 1
oct25 Jumbo: runtime code gen for Java mihai 1,2
oct18 ASPLOS report ASPLOS attendees ASPLOS site
Oct 4 Neural Branch Prediction Ioana 1
Sep 20 Pointer Analysis Jeff 1

Summer 2004

Date Topic Presenter Reading
August 31 No meeting ?
August 24 Power awareness via dynamic optimization Ioana 1
August 17 First-Order Superscalar Processor Model Reza
August 10 No meeting
August 3 No meeting TBA
July 27 CMPs with Execution Migration David Tam 1
July 20 Decomposition for TLS Kirk 1
July 13 Decompositions for Network Processing Martin (1,2)
July 6 No Meeting TBA
June 29 Compilation for TLS Utku TBA
June 22 No meeting
June 15 Garbage Collection Performance Borys 1
June 8 Half-Price Architecture Peter 1
June 4 IBM visit TBA
June 1 No meeting
May 25 CMP L2 Organizations Greg 1
May 18 Vector Threads Andrew 1
May 11 SMT P4 Observations Ioana 1
May 4 Speculating to reduce unnecessary power consumption Ivan 1
Apr27 Lifelong Program Analysis Levon 1

Spring 2004

Date Topic Presenter Reading
Apr20 Reusing Computation Results Derek 1
Apr13 IBM visits Toronto
Apr6 Signature Buffer Stanley 1
Mar30 Global History Buffer Chuck 1
Mar23 Static Identification of Delinquent Loads Mihai 1
Mar16 Interaction Costs Jeff D. 1
Mar2 Optimizing Indirect Branch Prediction Accuracy in Virtual Machine Interpreters Kirk 1
Feb24 Meta Optimization Utku 1
Feb3 Scheduling for Custom Data Paths Martin 1
Feb6 Dynamic Profiling (visit to IBM); more info on "Hardware Counters" by David Tam Various N/A
Feb10 Single-ISA Heterogeneous Multi-Core Architectures: The Potential for Processor Power Reduction Borys 1

Fall 2003

Date Topic Presenter Abstract Reading For Discussion
Oct 1: "Self Optimizing Libraries" Hamza Karamali abstract 1,2 Hardware vs Software
Oct 8: CASCON Compiler workshop
Oct 15: "Exploiting ILP, TLP, and DLP with the Polymorphous TRIPS Architecture" Greg Steffan abstract 1, (additional info) Wireless Inter-chip Communication: 1, 2
Oct 29: "XOM and XOMOS" David Lie abstract 1 2 Open source
Nov 5: "Hardware Support for Prescient Instruction Prefetch" Tor Aamodt N/A economics of distributed computing
Dec 3: "Introduction to OpenMP" Mike Voss OpenMP
Dec 16: Special meeting with IBM to discuss SMT, the Power 5, and current compiler research at UofT
