Improving the Performance of Morton Layout by Array Alignment and Loop Unrolling
Jeyarajan Thiyagalingam, Olav Beckmann, Paul H. J. Kelly
To appear at the 16th Workshop on Languages and Compilers for Parallel Computing (LCPC03), College Station, TX, 2-4 October 2003
Abstract
Hierarchically-blocked non-linear storage layouts, such as the
Morton ordering, have been proposed as a compromise between
row-major and column-major for two-dimensional arrays. Morton
layout offers some spatial locality whether traversed row-wise or
column-wise. The goal of this paper is to make this an attractive
compromise, offering close to the performance of row-major traversal
of row-major layout, while avoiding the pathological behaviour of
column-major traversal. This paper explores how the spatial locality of
Morton layout depends on the alignment of the array's base address,
and how loop unrolling has to be aligned to reduce address
calculation overhead. We conclude with extensive experimental
results using five common processors and a small suite of simple
benchmark kernels.
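
As an illustration of the layout the abstract describes, the following is a minimal sketch in Python, not the paper's implementation. It computes a Morton offset by interleaving the bits of the row and column indices, then counts how many cache lines a row-wise or column-wise traversal touches. The function names, the choice of placing column bits in the even positions, the 4x4 array size, and the cache line of 4 elements are all assumptions made for this example.

```python
def morton_index(row, col, bits=16):
    """Return the Morton (Z-order) offset of element (row, col).

    Assumed convention: column bits go to even bit positions and
    row bits to odd bit positions of the result.
    """
    index = 0
    for b in range(bits):
        index |= ((col >> b) & 1) << (2 * b)
        index |= ((row >> b) & 1) << (2 * b + 1)
    return index


def lines_touched(offsets, base=0, line_elems=4):
    """Count distinct cache lines covered by the given element offsets.

    `base` is the array's starting position in elements and
    `line_elems` is a hypothetical cache-line size in elements.
    """
    return len({(base + off) // line_elems for off in offsets})


N = 4  # illustrative 4x4 array
row0 = [morton_index(0, j) for j in range(N)]  # offsets along row 0
col0 = [morton_index(i, 0) for i in range(N)]  # offsets along column 0
row_major_col0 = [i * N for i in range(N)]     # column 0 in row-major layout

print(row0)                           # [0, 1, 4, 5]
print(col0)                           # [0, 2, 8, 10]
print(lines_touched(row0))            # 2 lines, base aligned
print(lines_touched(col0))            # 2 lines, base aligned
print(lines_touched(col0, base=2))    # 4 lines, base misaligned by 2 elements
print(lines_touched(row_major_col0))  # 4 lines for row-major layout
```

Under these assumptions, both traversal directions of the aligned Morton array touch two cache lines, whereas the column-wise traversal of the row-major array touches four. Shifting the base address by two elements makes the column-wise Morton traversal touch four lines as well, which is the kind of alignment sensitivity the abstract refers to.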