I/O Efficient Algorithms for Matrix Computations

Algorithms for large data sets, unlike in-core algorithms, have to keep the bulk of their data in secondary memory, which is typically much slower than main memory. In designing these out-of-core algorithms, the goal is therefore to minimise the number of I/Os executed. The literature is rich in efficient out-of-core algorithms for matrix computations, but very few of them are designed on the external memory model of Aggarwal and Vitter and, as such, attempt to quantify their performance in terms of the number of I/Os performed. This thesis makes some contributions in that direction. We analyse some QR decomposition algorithms and show that the I/O complexity of the tile based algorithm is asymptotically the same as that of matrix multiplication. We show that this algorithm performs best when the tile size is chosen so that exactly one tile fits in main memory. We propose a constant factor improvement, as well as a new recursive cache oblivious algorithm with the same asymptotic I/O complexity. The traditional unblocked and blocked Hessenberg, tridiagonal, and bidiagonal reductions are not I/O efficient because vector-matrix operations dominate their performance. We design Hessenberg, tridiagonal, and bidiagonal reductions that use banded intermediate forms and perform only an asymptotically optimal number of I/Os; these are the first I/O optimal algorithms for these problems. In particular, we show that the known slab based algorithms for two sided reductions all have asymptotically suboptimal I/O performance, even though empirical evidence has been reported that they do better than the traditional algorithms. We propose new tile based variants of the multishift QR and QZ algorithms that, under certain conditions on the number of shifts, have better seek and I/O complexities than all known variants.
We show that techniques such as rescheduling computational steps, choosing the blocking parameters appropriately, and incorporating more matrix-matrix operations can be used to improve the I/O and seek complexities of matrix computations...
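As a rough illustration of the kind of I/O accounting the abstract refers to, the sketch below counts block transfers for a textbook tiled matrix multiplication in a two-level memory model with main memory of M words and block size B. It is not one of the thesis's algorithms. The function names, the assumption that each tile is stored contiguously on disk, and the choice of keeping three tiles in memory at once are all assumptions made for this example.

```python
import math


def tiled_matmul_ios(n: int, M: int, B: int) -> int:
    """Upper bound on block transfers for a tiled n x n matrix multiply.

    Assumptions (illustrative only): main memory holds M words, one I/O
    moves B contiguous words, each tile is stored contiguously on disk,
    and one tile each of A, B and C is resident at a time.
    """
    t = math.isqrt(M // 3)  # largest tile side such that 3 * t * t <= M
    if t < 1:
        raise ValueError("main memory too small to hold three 1 x 1 tiles")
    tiles_per_side = (n + t - 1) // t      # ceil(n / t)
    ios_per_tile = (t * t + B - 1) // B    # ceil(t^2 / B); edge tiles charged in full
    # For each output tile C[i][j]: read tiles_per_side tiles of A and as
    # many of B, then write the finished C tile once.
    per_output_tile = 2 * tiles_per_side * ios_per_tile + ios_per_tile
    return tiles_per_side * tiles_per_side * per_output_tile


def matmul_io_scale(n: int, M: int, B: int) -> float:
    """The n^3 / (B * sqrt(M)) term against which the count is compared."""
    return n ** 3 / (B * math.sqrt(M))


if __name__ == "__main__":
    n, M, B = 1024, 3 * 64 * 64, 64
    ios = tiled_matmul_ios(n, M, B)
    print("counted I/Os:", ios)
    print("ratio to n^3/(B*sqrt(M)):", ios / matmul_io_scale(n, M, B))
```

Under these assumptions the count stays within a constant factor of n^3/(B*sqrt(M)), which is the matrix multiplication cost that the abstract compares tile based QR decomposition against.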
Supervisor: G. Sajith