Parallel Tiled Code Generation with Loop Permutation within Tiles

Marek Palkowski

Faculty of Computer Science, West Pomeranian University of Technology, Zolnierska 49, 70-210 Szczecin, Poland

Wlodzimierz Bielecki

Faculty of Computer Science, West Pomeranian University of Technology, Zolnierska 49, 70-210 Szczecin, Poland

keywords: Optimizing compilers, tiling, loop permutation, transitive closure, dependence graph, code locality, automatic parallelization

An approach to generating tiled code with an arbitrary order of loops within tiles is presented. It is based on the transitive closure of the program dependence graph and is derived by combining the Polyhedral and Iteration Space Slicing frameworks. The approach is explained by means of a working example. Details of its implementation in the TRACO compiler are outlined. The performance gain that loop permutation within tiles brings to tiled programs is illustrated on real-life programs from the NAS Parallel Benchmark suite. An analysis of the speed-up and scalability of parallel tiled code with loop permutation is presented.

mathematics subject classification 2000: 68N20, 65Y05, 52Bxx, 97E60, 05-XX

reference: Vol. 36, 2017, No. 6, pp. 1261–1282

doi: 10.4149/cai_2017_6_1261

Computing and Informatics

formerly Computers and Artificial Intelligence
