Vectorizing Outer Loop Of Euclidean Distance Using Numpy On Multi-dimensional Data
I have a 2D matrix of values, where each row is a data point:
data = np.array([[2, 2, 3], [4, 2, 4], [1, 1, 4]])
Now if my test point is a single 1D numpy array like: test =
Solution 1:
Use broadcasting to do that:
from numpy.linalg import norm
norm(data-test[:,None],axis=2)
For a 2D test array holding two test points, this gives:
[[ 1. 2.44948974 2.44948974]
[ 2.44948974 2.23606798 3.60555128]]
Some explanation: it is easier to understand with two different shapes, for example four points and two points:
ens1 = np.array(
[[2, 2, 3],
[4, 2, 4],
[1, 1, 4],
[2, 4, 5]])
ens2 = np.array([[2,3,3],
[4,1,2]])
In [16]: ens1.shape
Out[16]: (4, 3)
In [17]: ens2.shape
Out[17]: (2, 3)
Then:
In [21]: ens2[:,None].shape
Out[21]: (2, 1, 3)
This adds a new axis, so broadcasting the (4, 3) array against the (2, 1, 3) array performs all 2 × 4 = 8 subtractions at once:
In [22]: (ens1-ens2[:,None]).shape
Out[22]: (2, 4, 3)
Finally, take the norm along the last axis to get the 8 distances:
In [23]: norm(ens1-ens2[:,None],axis=2)
Out[23]:
array([[ 1. , 2.44948974, 2.44948974, 2.23606798],
[ 2.44948974, 2.23606798, 3.60555128, 4.69041576]])
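The steps above can be wrapped into a small function. This is only a sketch: the name pairwise_distances is hypothetical (it is not part of NumPy), and the use of np.atleast_2d to accept a single 1D test point as well is an assumption about what the question was after.

```python
import numpy as np
from numpy.linalg import norm


def pairwise_distances(data, test):
    """Hypothetical helper: Euclidean distance from each test point to each data point."""
    data = np.asarray(data, dtype=float)
    # Promote a single 1D point of shape (k,) to shape (1, k)
    test = np.atleast_2d(np.asarray(test, dtype=float))
    # (1, n_data, k) - (n_test, 1, k) broadcasts to (n_test, n_data, k)
    diff = data[None, :, :] - test[:, None, :]
    # Collapse the coordinate axis to get one distance per (test, data) pair
    return norm(diff, axis=2)


ens1 = np.array([[2, 2, 3], [4, 2, 4], [1, 1, 4], [2, 4, 5]])
ens2 = np.array([[2, 3, 3], [4, 1, 2]])

dists = pairwise_distances(ens1, ens2)
single = pairwise_distances(ens1, ens2[0])
print(dists.shape)   # (2, 4)
print(single.shape)  # (1, 4)
```

Row i of the result holds the distances from test point i to every data point, matching the layout of Out[23].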
Solution 2:
What about np.meshgrid?
import numpy as np
data = np.array(
[[2, 2, 3],
[4, 2, 4],
[1, 1, 4]])
test = np.array([[2,3,3],
[4,1,2]])
d = np.arange(0,3)
t = np.arange(0,2)
d, t = np.meshgrid(d, t)
# print(test[t])
# print(data[d])
print(np.sqrt(np.sum((test[t] - data[d])**2, axis=2)))
Output:
[[ 1. 2.44948974 2.44948974]
[ 2.44948974 2.23606798 3.60555128]]
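To see why this works, it may help to look at the index grids themselves. The sketch below uses the same data and test arrays; the names paired_data and paired_test are introduced here only for illustration.

```python
import numpy as np

data = np.array([[2, 2, 3], [4, 2, 4], [1, 1, 4]])
test = np.array([[2, 3, 3], [4, 1, 2]])

# d repeats the data row indices along each row, t repeats the test row index
d, t = np.meshgrid(np.arange(len(data)), np.arange(len(test)))
print(d)  # [[0 1 2]
          #  [0 1 2]]
print(t)  # [[0 0 0]
          #  [1 1 1]]

# Fancy indexing pairs every test point with every data point
paired_data = data[d]   # shape (2, 3, 3)
paired_test = test[t]   # shape (2, 3, 3)

dist = np.sqrt(np.sum((paired_test - paired_data) ** 2, axis=2))
print(dist.shape)  # (2, 3)
```

Note that indexing with d and t builds full copies of the paired arrays before subtracting, so this variant likely uses more memory than plain broadcasting on large inputs.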
Solution 3:
You could use a list comprehension:
result = np.array([np.sqrt(np.sum((t - data)**2, axis=1)) for t in test])
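As a quick sanity check, the three solutions can be compared on the same inputs. This is a minimal sketch assuming the two-point test array from Solution 2; the variable names by_broadcast, by_meshgrid and by_loop are made up for the comparison.

```python
import numpy as np
from numpy.linalg import norm

data = np.array([[2, 2, 3], [4, 2, 4], [1, 1, 4]])
test = np.array([[2, 3, 3], [4, 1, 2]])

# Solution 1: broadcasting
by_broadcast = norm(data - test[:, None], axis=2)

# Solution 2: index grids from np.meshgrid
d, t = np.meshgrid(np.arange(len(data)), np.arange(len(test)))
by_meshgrid = np.sqrt(np.sum((test[t] - data[d]) ** 2, axis=2))

# Solution 3: list comprehension over the test points
by_loop = np.array([np.sqrt(np.sum((row - data) ** 2, axis=1)) for row in test])

all_agree = np.allclose(by_broadcast, by_meshgrid) and np.allclose(by_broadcast, by_loop)
print(all_agree)  # True
```

The list comprehension still loops in Python over the test points, so it only vectorizes the inner loop; the first two solutions vectorize both.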