## File(s) under permanent embargo

# Hierarchical probing for estimating the trace of the matrix inverse on toroidal lattices

conference contribution

posted on 2013-01-01, 00:00 authored by A Stathopoulos, Jesse LaeuchliJesse Laeuchli, K OrginosThe standard approach for computing the trace of the inverse of a very large, sparse matrix A is to estimate it through the Monte Carlo algorithm as the mean value of matrix quadratures. This is heavily used in our motivating application of Lattice QCD. Often, the elements of A-1 display certain decay properties away from the nonzero structure of A, but random vectors cannot exploit this induced structure of A-1. Probing is a technique that discovers elements of a matrix A through matrix-vector multiplications by using vectors tailored to the sparsity pattern of the matrix. In the case of A-1, and when the above decay properties are present, the pattern is obtained by distance-k coloring of the graph of A, in which no nodes within distance k of each other have the same color. For sufficiently large k, the method produces accurate trace estimates but it becomes expensive and in some cases prohibitive. More importantly, it is difficult to search for an optimal k value, since none of the work for prior choices of k can be reused. First, we introduce the idea of hierarchical coloring that produces distance-2i colorings for a sequence of distances 2i up to the diameter of the graph. To achieve this, at each level, i, we compute the distance-1 coloring independently for each of the node-groups associated with a color of the distance-(2i-1) coloring. For uniform, toroidal lattices, this idea leads to a highly efficient and naturally parallel algorithm to produce the hierarchical permutation. Second, we provide an algorithm for choosing a corresponding hierarchical sequence of orthogonal probing vectors, which allows us to increase the number of probing vectors until the required accuracy is achieved. Several experiments show that when a decay structure exists, our algorithm finds it and approximates the trace incrementally, starting with the most important contributions. We have observed up to an order of magnitude speedup over the standard Monte Carlo.