How do I split a very large matrix into submatrices based on the value in a column?

Time:02-02

I have a 5 x 600,000 matrix, and I want to group its data by splitting it into submatrices based on the values in column 4.

For values between 0 and 500 I want one matrix, for values between 501 and 1000 another, and for values between 1001 and 1500 a third.

How can I do this?

I haven't found any reliable material on this. The examples I have seen online only seem to handle two groups (i.e. a column containing 1 or 0, with the 1s and the 0s split into 2 submatrices).

CodePudding user response:

In MATLAB terms, I think you mean an n x m matrix with n = 600000 and m = 5; if not, you can change the code accordingly.

Is this what you were looking to do?

n = 600000;   % number of rows
m = 5;        % number of columns
thisCol = 4;  % column whose values decide the grouping

values_range = {[0,500]; [501,1000]; [1001,1500]};  % cell array of [low, high] pairs
myMatrix = zeros(n,m);
myMatrix(:,thisCol) = 1:n;  % demo data, to prove it works

theseSubMatrices = cell(numel(values_range),1);  % cell array of matrices
for j = 1:numel(values_range)
    thisLow  = values_range{j}(1);
    thisHigh = values_range{j}(2);
    % keep the rows whose value in thisCol lies in [thisLow, thisHigh]
    inRange = myMatrix(:,thisCol) >= thisLow & myMatrix(:,thisCol) <= thisHigh;
    theseSubMatrices{j} = myMatrix(inRange,:);
end
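
One caveat with the ranges above: if column 4 can hold non-integer values, anything strictly between 500 and 501 (or between 1000 and 1001) matches no range and is silently dropped. A minimal sketch of one way around that, assuming you are happy to treat the groups as [0,500], (500,1000] and (1000,1500]. The names data, edges and parts, and the six demo rows, are made up for illustration:

```matlab
% Hypothetical demo data: 6 rows, 5 columns, non-integer values in column 4
data  = [(1:6)', zeros(6,2), [250; 500; 500.5; 1000; 1200; 1500], zeros(6,1)];
col   = 4;                       % column that decides the grouping
edges = [0, 500, 1000, 1500];    % shared boundaries, so no gaps between groups

nBins = numel(edges) - 1;
parts = cell(nBins, 1);
v = data(:, col);
for k = 1:nBins
    if k == 1
        mask = v >= edges(1) & v <= edges(2);     % first group: [0, 500]
    else
        mask = v >  edges(k) & v <= edges(k+1);   % later groups: (low, high]
    end
    parts{k} = data(mask, :);    % rows falling into group k
end
```

For integer data this gives the same grouping as the 0-500 / 501-1000 / 1001-1500 ranges, and a value such as 500.5 ends up in the second group instead of disappearing.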

CodePudding user response:

If you have some data

arr = rand( 6e5, 5 ); % 5 columns / 600,000 rows
arr(:,5) = arr(:,5) .* 1500; % for this example, get column 5 into range [0,1500]

Then you can use histcounts to "bin" the 5th column according to your edges (the question splits on column 4, so change the column index to suit your data).

edges = [0, 500, 1000, 1500]; % edges to split column 5 by
[~,~,iSubArr] = histcounts( arr(:,5), edges ); 

Then generate a cell array with one element per subarray:

nSubArr = numel(edges)-1; % number of bins / subarrays
subArrs = arrayfun( @(x) arr( iSubArr == x, : ), 1:nSubArr, 'UniformOutput', false ); % get a matrix per bin

Output:

subArrs =
  1×3 cell array
    {200521×5 double}    {199924×5 double}    {199555×5 double}
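
Two details of histcounts are worth knowing here: each bin includes its left edge but not its right one (only the last bin includes both), so a value of exactly 500 lands in the second bin, and values outside the edges get bin index 0. If you want 500 to stay in the first group, as the question describes, a possible variation is discretize with 'IncludedEdge' set to 'right'. This is only a sketch on made-up values in column 4; vals, bin and outside are illustrative names:

```matlab
% Hypothetical values for the grouping column; 2000 is deliberately out of range
vals  = [0; 250; 500; 501; 1000; 1001; 1500; 2000];
arr   = [zeros(numel(vals),3), vals, zeros(numel(vals),1)];   % 8 x 5 demo matrix
edges = [0, 500, 1000, 1500];

% Bin index per row: bins are (low, high], and the first bin also includes 0.
% Values outside the edges come back as NaN.
bin = discretize(arr(:,4), edges, 'IncludedEdge', 'right');

nBins   = numel(edges) - 1;
subArrs = arrayfun(@(k) arr(bin == k, :), 1:nBins, 'UniformOutput', false);
outside = arr(isnan(bin), :);    % rows that fall in no bin
```

Keeping the out-of-range rows in a separate variable makes it easy to check whether anything was left out of the split.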