Home > Blockchain >  How to efficiently create a heatmap of all value pair in two vectors?
How to efficiently create a heatmap of all value pair in two vectors?

Time:06-22

Problem

In MATLAB I have 2 1-D array of the same size with integer values. Considering that the values at a same index are a pair, I want to count how many pairs of each I have and store that in a matrix to later print it a a heatmap.

Here is a small example of what I want to archive:

v1 = [1, 0, 1, 0, 1];
v2 = [1, 1, 0, 1, 1];

% Magic happens here

heatMatrix = [0, 2;
              1, 2];

The result is as follow because we have:

  • 0 time the pair v1=0, v2=0 (heatMatrix[1, 1])
  • 2 times the pair v1=0, v2=1 (indices 2 and 4, heatMatrix[1, 2])
  • 1 time the pair v1=1, v2=0 (index 3, heatMatrix[2, 1])
  • 2 time the pair v1=1, v2=1 (indices 1 and 5, heatMatrix[2, 2])

My code

I have done that using MATLAB loops which is slower than reasonable especially for long vectors:

% Alocating random vectors
rng(0);
maxVal = 255;
nbElem = 1e6;
v1 = randi(nbElem, 1);
v2 = randi(nbElem, 1);
heatMatrix = zeros(maxVal, maxVal);

% Creating heat matrix
tic();
for ii = 1:maxVal
  for jj = 1:maxVal
    heatMatrix(ii, jj) = sum((v1 == i) & (v2 == j));
  end
end
toc();

Performance issues

I have to deal with long vectors, in the realm of 1e7 elements. I am working with an Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz and have 7.5 GB of RAM. And my performances are as follow:

enter image description here

Question

I am looking for a way to speed up the execution of this task. I'm mostly looking for a way to vector or perform the task in a smarter way (maybe there is a MATLAB function I'm not aware of).

CodePudding user response:

A simple way to do this is to:

  • Realize the value pairs in v1 and v2 are the indexes in your heatMatrix and transform them to linear indexes

  • Use unique to get the unique indexes, as well as the corresponding indexes in the input linear indexes vector

  • Calculate the repetitions of each index in ix with accumarray

  • Use the unique_ind indexes to fill the matrix

%

% Alocating random vectors
rng(0);
maxVal = 255;
nbElem = 1e6;
v1 = randi(maxVal,nbElem,1);
v2 = randi(maxVal,nbElem,1);
heatMatrix = zeros(maxVal, maxVal);
heatMatrix2 = zeros(maxVal, maxVal);

% Creating heat matrix
tic();
for ii = 1:maxVal
  for jj = 1:maxVal
    heatMatrix(ii, jj) = sum((v1 == ii) & (v2 == jj));
  end
end
toc();

% faster
tic();
ind = sub2ind([maxVal maxVal],v1,v2);
[unique_ind,~,ix] = unique(ind);
C = accumarray(ix,1).';
heatMatrix2(unique_ind) = C;
toc();

isequal(heatmatrix,heatmatrix2)

%%%% Elapsed time is 24.850305 seconds.
%%%% Elapsed time is 0.049509 seconds.
%%%% 1

If, as I suspect, your values actually range from 0 to maxVal:

% Alocating random vectors
rng(0);
maxVal = 255;
nbElem = 1e6;
v1 = randi(maxVal 1,nbElem,1)-1;
v2 = randi(maxVal 1,nbElem,1)-1;
heatMatrix = zeros(maxVal 1, maxVal 1);
heatMatrix2 = zeros(maxVal 1, maxVal 1);

% Creating heat matrix
tic();
for ii = 1:(maxVal 1)
  for jj = 1:(maxVal 1)
    heatMatrix(ii, jj) = sum((v1 == (ii-1)) & (v2 == (jj-1)));
  end
end
toc();

% faster
tic();
ind = sub2ind([maxVal 1 maxVal 1],v1 1,v2 1);
[unique_ind,~,ix] = unique(ind);
C = accumarray(ix,1).';
heatMatrix2(unique_ind) = C;
toc();

isequal(heatMatrix,heatMatrix2)

%%%% Elapsed time is 24.748371 seconds.
%%%% Elapsed time is 0.055760 seconds.
%%%% 1

CodePudding user response:

Option 1: accumarray()

accumarray() can handle a 2D index, so:

v1 = [1, 0, 1, 0, 1];
v2 = [1, 1, 0, 1, 1];

hm = accumarray([v1;v2].' 1,1)

% hm = [0, 2;
%       1, 2];

Just make sure that your index start with 1.


Option 2: sparse()

sparse() can handle duplicate index, so:

v1 = [1, 0, 1, 0, 1];
v2 = [1, 1, 0, 1, 1];

hm = full(sparse(v1 1,v2 1,1))

% hm = [0, 2;
%       1, 2];

Both options are way faster than your solution and faster than the @BillBokeey's solution. But it looks like accumarray is a bit faster.

  • Related