I have a row vector q with 200 elements, and another row vector, dij, which is the output of the pdist function. It currently has 48216200 elements, but I'd like to be able to go higher. The operation I want to do is essentially:
t = sum(q'*dij, 2);
However, since this tries to allocate a 200x48211290 array, MATLAB complains that it would require 70 GB of memory. Therefore I do it this way instead:
t = zeros(numel(q), 1);
for i = 1:numel(q)
    qi = q(i);
    factor = qi * dij;   % scale all of dij by q(i); still allocates a full-size temporary
    t(i) = sum(factor);
end
However, this takes too much time. By too much time, I mean about 36 s, which is orders of magnitude longer than the time required by the pdist call itself. Is there a way to speed up this operation without explicitly allocating so much memory? I'm assuming that if the first version could allocate the memory, it would be faster, since it is a single vectorized operation.
CodePudding user response:
Just use the distributive property of multiplication with respect to addition: row i of the outer product q'*dij is q(i)*dij, so its row sum is q(i)*sum(dij). The huge 200-row intermediate is never needed:
t = q'*sum(dij);
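A quick sanity check with small, made-up vectors (the sizes here are arbitrary stand-ins for the real 200-element q and the pdist output) shows the two forms agree to rounding error:

q   = rand(1, 5);          % stand-in for the 200-element q
dij = rand(1, 12);         % stand-in for the pdist output
t1  = sum(q'*dij, 2);      % memory-hungry original form
t2  = q'*sum(dij);         % distributive form: sum(dij) is a scalar
max(abs(t1 - t2))          % on the order of eps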
CodePudding user response:
To test what Cris said in the comment on the question, I created three .m files, as follows.
vec.m:
res = sum(sin(d.*q')./(d.*q'));   % implicit expansion: d.*q' is a 4e6x200 matrix, built twice
forloop.m:
for i = 1:200
    res(i) = sum(sin(d.*q(i))./(d.*q(i)));   % d.*q(i) is computed twice per iteration
end
and test.m:
clc
clear all
d = rand(4e6, 1);
q = rand(200, 1);
res = zeros(1, 200);
% alternate the two versions, three calls each,
% so neither benefits unfairly from warm-up
forloop;
vec;
forloop;
vec;
forloop;
vec;
Then I profiled test.m with MATLAB's Run and Time profiler, and the results were very surprising:

3 calls to forloop: ~10.5 s
3 calls to vec: 15.5 s (!)

Additionally, when I converted the data to single, the results were:

3 calls to forloop: 7.5 s
3 calls to vec: 8.5 s
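One caveat (my own assumption, not something verified above): the Run and Time profiler instruments every line, which can distort a loop-vs-vectorized comparison. timeit gives cleaner per-call numbers. A minimal sketch, with the loop wrapped in a local function since timeit wants a zero-argument handle (in a script, local functions must sit at the end of the file and need R2016b or newer):

d = rand(4e6, 1);
q = rand(200, 1);
t_vec  = timeit(@() sum(sin(d.*q')./(d.*q')))   % vectorized version
t_loop = timeit(@() loopver(d, q))              % loop version

function res = loopver(d, q)   % same body as forloop.m
    res = zeros(1, numel(q));
    for i = 1:numel(q)
        res(i) = sum(sin(d.*q(i))./(d.*q(i)));
    end
end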
I don't know precisely why the for-loop is faster in these scenarios, but as for your problem, you could speed things up by creating fewer temporary variables inside the loop, using column vectors (I think), and, finally, converting your data to single, as sketched after the snippet below:
q=single(rand(200,1));
...
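Applied to the loop from the question, that advice might look like the following sketch (my adaptation, untested on the real data; whether single precision is acceptable for dij is for you to judge):

q   = single(q(:));                 % force a column vector
dij = single(dij);
t   = zeros(numel(q), 1, 'single');
for i = 1:numel(q)
    t(i) = sum(q(i) * dij);         % no intermediate 'factor' variable
end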