RDGC: A Reuse Distance-Based Approach to GPU Cache Performance Analysis

keywords: GPU cache memory, reuse distance analysis, performance modeling, hit ratio
In the present paper, we propose RDGC, a reuse distance-based performance analysis approach for GPU cache hierarchy. RDGC models the thread-level parallelism in GPUs to generate appropriate cache reference sequence. Further, reuse distance analysis is extended to model the multi-partition/multi-port parallel caches and employed by RDGC to analyze GPU cache memories. RDGC can be utilized for architectural space exploration and parallel application development through providing hit ratios and transaction counts. The results of the present study demonstrate that the proposed model has an average error of 3.72 % and 4.5 % (for L1 and L2 hit ratios, respectively). The results also indicate that the slowdown of RDGC is equal to 47 000 times compared to hardware execution, while it is 59 times faster than GPGPU-Sim simulator.
mathematics subject classification 2000: 68M20
reference: Vol. 38, 2019, No. 2, pp. 421–453