Michael Stumm: Alumni

Ph.D. Alumni: Dattatraya Kulkarni

Reference:

Dattatraya Kulkarni
CDA: Computation Decomposition and Alignment
Ph.D. Thesis, Department of Computer Science, University of Toronto, Toronto, Canada, 1997.

Supervisor(s):

Michael Stumm

Download Thesis:

PDF

Abstract:

Restructuring compilers have been effective in tailoring nested loops and arrays so as to improve performance on both uniprocessor and multiprocessor systems. The regular structure of nested loops and arrays has enabled their systematic analysis and transformation. The focus of this dissertation is a new and generalized loop transformation framework, called Computation Decomposition and Alignment (CDA).

The linear loop transformation framework, introduced in 1990, was a major breakthrough partly because it provided a unified view of many of the earlier loop transformations, and partly because it was a formal method based on linear algebra. Since then, the compiler community has designed algorithms which automatically derive linear transformations that achieve specific optimization objectives in given nested loops. The framework also sparked the development of generic techniques to derive efficient code for the linearly transformed loop structure.

The main contribution of this dissertation is the CDA transformation framework, which is capable of restructuring nested loops at the granularity of statements and subexpressions. The granularity of transformation is thus finer than in the linear loop transformation framework, which transforms nested loops at the granularity of entire iterations. A CDA transformation is applied to a nested loop in two steps. First, the loop iteration space is decomposed into multiple computation spaces, each representing computations of a statement or subexpression. Second, each of the computation spaces is linearly transformed with a (possibly) different transformation matrix.

CDA unifies into a single framework many existing transformations, including all linear loop transformations. A linear loop transformation only modifies the execution order of the iterations, while a CDA transformation modifies both the composition of the iterations and the execution order of the re-composed iterations. This feature enables new optimizations which cannot be obtained by linear transformations alone. In this dissertation, we show how CDA transformations can achieve the effect of certain global transformations. We present heuristic algorithms to automatically derive CDA transformations to reduce i) the number of cache conflicts and ii) the number of ownership test, and we show how CDA can achieve several other optimizations. We also compare the performance of some benchmark loops to the corresponding CDA transformed loops using a simulator and three different types of real computer systems.

Keywords:

Compiler optimization, loop transformations, performance optimization, linear loop transformation, framework, sub-expression transformation

BibTeX:

@phdthesis(Kulkarni-PhD97,
    author = {Dattatraya Kulkarni},
    title = {CDA: Computation Decomposition and Alignment},
    school = {Department of Computer Science, University of Toronto},
    address = {Toronto, Canada},
    supervisors = {Michael Stumm},
    year = {1997},
    keywords = {Compiler optimization, loop transformations, performance optimization, linear loop transformation, framework, sub-expression transformation}
)