Create and fill a new array with data once the current array exceeds a certain size

Time:08-27

Apologies in advance for what may be a stupid and easy question. Here's my problem: I'm currently working with huge data packages and have to extract specific values into an array. The code (which I didn't write) works, but MATLAB slows down badly once the array reaches a certain size. My first idea is to split the output across several arrays, which I hope will avoid the slowdown. Saving each array to disk and clearing it from the workspace would also be an option for me.

But I'm struggling to tell the script to switch to the next array to fill. Here is the code I have so far:


 if counter ~= 18982 && rem(counter,18983) ~= 0
 array(counter,:) = [x y z x1 x2 x3 x4 x5 x6 x7 x8 x9];

 elseif counter == 18983 || rem(counter,18983) == 0
 i = i + 1
 name_str = ['array',int2str(i),'= 1'];
 eval(name_str)

My elseif creates a new array with the right name, but I have no idea how to address and manipulate it. I'm also worried about how to get this new array as a variable into the second line of the code, because the first "if" branch has to work again for the next array. Maybe you can give me some suggestions on how to solve this problem. I'm happy about any suggestions or help, even if a totally different approach from mine is the right one.

Info on the hardware: this script will run on a cluster node, so 128 cores and 500 GB of RAM are available.

Thanks a lot! :)

CodePudding user response:

This is most likely a particularly awful case of not pre-allocating your arrays.

A good maxim in MATLAB: when filling large arrays, always pre-allocate (well, as much as possible).

To motivate this, consider the following two examples:

First, this code block seems to be about what you are doing. (I cut it off at 1e5 because I'm not very patient.) The x variable grows at each iteration. Under the hood, MATLAB is copying all the data into a new memory location at each iteration.

clear x
tic
for ix = 1:1e5
    x(ix,[1:12]) = (1:12)*ix;
end
toc  %20.9 seconds

This is what happens after pre-allocating. Note that all I changed is to first set my output (y in this case) to a large array of zeros, and then fill it.

clear y
tic
y = zeros(1e5,12);
for ix = 1:1e5
    y(ix,[1:12]) = (1:12)*ix;
end
toc  %0.083 seconds

This difference probably grows with n^2, so by the time you get to 1e8 entries, you will be crawling.
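To see where the n^2 comes from, here is a back-of-the-envelope count. It assumes the simplest possible model, in which growing the array by one row copies every row stored so far. Real MATLAB versions may be smarter than that, so treat this as a rough sketch and not as a description of the actual allocator:

```matlab
% Simplified model (an assumption): growing from k-1 to k rows copies k-1 rows.
n = [1e3 1e4 1e5];
rows_copied = n .* (n - 1) / 2;   % 0 + 1 + ... + (n-1) for each n
ratio = rows_copied(2:end) ./ rows_copied(1:end-1);
disp(ratio)   % each 10x increase in n costs roughly 100x more copying
```

With pre-allocation nothing is copied, so the work stays proportional to n.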

Applying this to your code, try adding the following line before you start the loop: array = zeros(n, 12), where n is your best guess of how large the array needs to be, rounded up. (If you want to get fancier, you can look at methods to intelligently grow an array of unknown size. But that's a topic for another question.)
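A minimal sketch of that advice, assuming you only have an upper-bound guess for the number of rows. The names n_guess and n_actual and the row contents are made up for illustration; in your script the row would be [x y z x1 ... x9]:

```matlab
n_guess  = 2000;                 % hypothetical upper bound on the number of rows
n_actual = 1500;                 % stand-in for however many rows the loop really produces
array    = zeros(n_guess, 12);   % pre-allocate once, before the loop

counter = 0;
for k = 1:n_actual
    counter = counter + 1;
    array(counter, :) = (1:12) * k;   % placeholder for [x y z x1 ... x9]
end

array = array(1:counter, :);     % trim the unused rows once, after the loop
```

Guessing too high only costs some memory for a while; the trim at the end gets rid of the rows that were never filled.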

You probably (almost certainly) don't need to break your data up. I can fit a 1e8 x 12 array into memory just fine, and my computer is not as beefy as yours.
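A quick way to check this for your own sizes, assuming the array holds doubles (MATLAB's default numeric type, 8 bytes per element):

```matlab
n_rows = 1e8;
n_cols = 12;
bytes_per_double = 8;
size_in_GB = n_rows * n_cols * bytes_per_double / 1e9;
fprintf('%.1f GB\n', size_in_GB);   % prints 9.6 GB
```

That is a small fraction of the 500 GB available on your cluster node.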


Follow-up:

Scaling this up to a more realistic size, I get the following timing at 1e8 rows:

clear z
tic
z = zeros(1e8,12);
for ix = 1:1e8
    z(ix,[1:12]) = (1:12)*ix;
end
toc  %49 seconds

CodePudding user response:

I don't understand what you are trying to do, but I believe you can solve this using cells.

You want "3 new arrays with max (18983 x 12)".

for i = 1:3
    temp_array_based_on_calculations_and_conditions = zeros(18983, 12); % placeholder, I don't know what you need to do here
    new_array{i} = temp_array_based_on_calculations_and_conditions;
end

Obviously, do whatever you need to do inside the loop and assign the result to a cell. You can access each of these three new arrays using curly brackets {}.
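For the counter-based loop in the question, a sketch of the same idea could look like this. The chunk size 18983 comes from the question; n_total, chunks, and the row contents are assumptions for illustration:

```matlab
chunk_rows = 18983;              % rows per array, from the question
n_total    = 3 * chunk_rows;     % hypothetical total number of rows
n_chunks   = ceil(n_total / chunk_rows);

% One pre-allocated 18983 x 12 array per cell, instead of array1, array2, ...
chunks = cell(1, n_chunks);
for c = 1:n_chunks
    chunks{c} = zeros(chunk_rows, 12);
end

for counter = 1:n_total
    c = ceil(counter / chunk_rows);          % which array to fill
    r = counter - (c - 1) * chunk_rows;      % row inside that array
    chunks{c}(r, :) = (1:12) * counter;      % placeholder for [x y z x1 ... x9]
end
```

Here chunks{2} is the second array, chunks{2}(1, :) is its first row, and no eval is needed. If the total is not a multiple of 18983, the last cell ends up only partly filled and can be trimmed after the loop.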
