Matrix Dimension    C++          OpenMP      CUDA
16                  0.112s       0.109s      3.785s
32                  0.662s       0.582s      6.556s
256                 1691.884s    566.767s    128.19s