Speed Up SVD in PyTorch
Solution 1:
Batched calculation
Assuming you have PyTorch >= 1.2.0, batched SVD is supported, so you can use
U, _, V = torch.svd(batch)
# Rank-1 outer product of the leading singular vectors: u1 @ v1^T per matrix
S = U[:, :, :, 0].unsqueeze(3) @ V[:, :, :, 0].unsqueeze(2)
which I found to be a little faster on average than the iterative version.
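If you want to reproduce that comparison yourself, here is a rough wall-clock sketch of the batched call against a per-image loop. The batch shape is an arbitrary assumption, not from the original question:

import time
import torch

batch = torch.randn(16, 3, 64, 64)  # hypothetical (B, C, H, W) batch

# Batched: one SVD call over all matrices at once
t0 = time.perf_counter()
U, _, V = torch.svd(batch)
S = U[:, :, :, 0].unsqueeze(3) @ V[:, :, :, 0].unsqueeze(2)
t_batched = time.perf_counter() - t0

# Iterative: one SVD call per (H, W) matrix
t0 = time.perf_counter()
S_loop = torch.zeros_like(batch)
for i in range(batch.shape[0]):
    for c in range(batch.shape[1]):
        u, _, v = torch.svd(batch[i, c])
        S_loop[i, c] = torch.outer(u[:, 0], v[:, 0])
t_loop = time.perf_counter() - t0

print(f"batched: {t_batched:.4f}s, loop: {t_loop:.4f}s")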
Truncated SVD (CPU only)
If you don't have CUDA acceleration, you could use truncated SVD to avoid computing the unnecessary singular values/vectors. Unfortunately PyTorch doesn't support truncated SVD and AFAIK there's no batched or GPU version available. There are two options I'm aware of: scipy.sparse.linalg.svds and sklearn.utils.extmath.randomized_svd.
Both of these allow you to choose the number of components to return. In OP's original question we only want the first component.
Even though I'm not using it on sparse matrices, I found svds with k=1 to be about 10x faster than torch.svd on CPU tensors, while randomized_svd was only about 2x faster. Your results will depend on the actual data. Also, svds should be a little more accurate than randomized_svd. Keep in mind there will be small differences between these results and the torch.svd results, but they should be negligible.
import scipy.sparse.linalg as sp
import numpy as np
import torch

S = torch.zeros((batch_size, C, H, W))
for i in range(batch_size):
    img = batch[i, :, :, :]
    for c in range(C):
        # svds also works on dense NumPy arrays, not only sparse matrices
        u, _, v = sp.svds(img[c].numpy(), k=1)
        S[i, c] = torch.from_numpy(np.outer(u, v))
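For comparison, a minimal sketch of the same loop using sklearn's randomized_svd, assuming the same batch, batch_size, C, H, and W as above:

from sklearn.utils.extmath import randomized_svd

S = torch.zeros((batch_size, C, H, W))
for i in range(batch_size):
    img = batch[i, :, :, :]
    for c in range(C):
        # randomized_svd returns (U, Sigma, VT) truncated to n_components
        u, _, vt = randomized_svd(img[c].numpy(), n_components=1)
        S[i, c] = torch.from_numpy(np.outer(u, vt))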
Solution 2:
PyTorch now has speed-optimised linear algebra operations analogous to NumPy's linalg module, including torch.linalg.svd:
The implementation of SVD on CPU uses the LAPACK routine ?gesdd (a divide-and-conquer algorithm) instead of ?gesvd for speed. Analogously, the SVD on GPU uses the cuSOLVER routines gesvdj and gesvdjBatched on CUDA 10.1.243 and later, and uses the MAGMA routine gesdd on earlier versions of CUDA.
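A short usage sketch (the batch shape is an illustrative assumption). Note that torch.linalg.svd returns Vh, the transpose of the V returned by the older torch.svd:

import torch

batch = torch.randn(16, 3, 64, 64)  # hypothetical (B, C, H, W) batch

# full_matrices=False gives the reduced SVD, which is all we need here
U, S_vals, Vh = torch.linalg.svd(batch, full_matrices=False)

# Rank-1 outer product of the leading singular vectors, as in Solution 1
S = U[..., :, 0:1] @ Vh[..., 0:1, :]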