Improving the Performance of Morton Layout by Array Alignment and Loop Unrolling

Jeyarajan Thiyagalingam, Olav Beckmann, Paul H. J. Kelly

To appear at 16th Workshop on Languages and Compilers for Parallel Computing (LCPC03), College Station, TX, 2-4 October 2003

Full Text, Printable Abstract.


Abstract

Hierarchically-blocked non-linear storage layouts, such as the Morton ordering, have been proposed as a compromise between row-major and column-major for two-dimensional arrays. Morton layout offers some spatial locality whether traversed row-wise or column-wise. The goal of this paper is to make this an attractive compromise, offering close to the performance of row-major traversal of row-major layout, while avoiding the pathological behaviour of column-major traversal. This paper explores how spatial locality of Morton layout depends on the alignment of the array's base address, and also how unrolling has to be aligned to reduce address calculation overhead. We conclude with extensive experimental results using five common processors and a small suite of simple benchmark kernels.