Presentation Title: Tiling Optimization For Hand Written Code
Committee:
- Dr.Gerald Baumgartner(Chair)
- Dr.J.Ramanujam
- Dr.Rahul shah
Date: November 10, 2008
Time: 11:00 AM
Location: Coates Hall,Room 256
Abstract:
In the world that demands for heavy simulation and visualization there is a need to develop applications that can perform heavy computations with a given hardware and generate accurate results. Most of these computations have huge data represented in form of multidimensional matrices that need to be processed in short amount of time with most of the operations performed being summations and multiplications. One such application is provided by The Tensor Contraction Engine (TCE) which is a compiler that translates high-level, mathematical tensor contraction expressions into efficient, parallel Fortran code. A pair of optimizations in the TCE, the fusion and tiling optimizations, has proven successful for minimizing memory requirements and disk-to-memory traffic for dense tensor computations. These search-based optimization algorithms can be used in tandem for optimizing handwritten dense array computations to minimize disk to memory traffic. In this project, we show how to apply the tiling algorithm to handwritten code in a procedural language and how it is implemented in ROSE for an input code which is assumed to be fused.
While in the TCE the loop fusion and tiling algorithm operated on high-level expression trees, in a standard compiler it needs to operate on abstract syntax trees. For simplicity, we use the tiling algorithm only for minimizing disk-to-memory traffic for a given memory size. Also, we limit ourselves to handwritten, dense array computations in which loop bounds expressions are constant, subscript expressions are simple loop variables, and there are no common sub-expressions. We also assume that there are no dependencies between different expressions.
After analyzing the input code which is assumed to be processed by the loop fusion algorithm and all the loops are perfectly nested; it is converted to an intermediate representation by SAGE III. We record all the loops, loop variables and the loop ranges. The loops are later tiled and the intra-tiled loops are propagated down to the body and inserted into the code with appropriate tile size which will wrap around the statements in the body of the loops. Finally, the modified abstract syntax tree is written to a new file which contains the optimized code.
All are invited.