Separating and splitting table by numeric values and writing to multiple files

Time:08-30

I have a single CSV file containing data uploaded from multiple IoT sensors, with each device's rows distinguished by a unique numeric ID. I read this into a MATLAB table using readtable(). I can then group the rows by device by calling sortrows() on the Device ID column.

However, how would I split the grouped devices into separate tables? Currently I use the following algorithm:

for g = 1:numDevices %create a new file for each device type
    outputFileIndoorData1 = sprintf("C:\Users\Documents\duetTestFile%d",g);
    for h = 1:height(indoorDataSorted) %Iterate through entire CSV, splitting by device ID
        if strcmp(indoorDataSorted.device_id(h),deviceIDs(g,1)) 
            writelines(indoorDataSorted(h,1:17),outputFileIndoorData1); %Write to individual file specified
        end
    end
end 

This is extremely resource intensive, however. What would be a more efficient way of separating each device's data into a different file?

CodePudding user response:

You should be able to do this, fairly efficiently, with a couple of lines based on the unique function.

%Setup a small sample data table to work with.
exampleData = cell2table({...
    'd1'  1   2; ...
    'd1'  3   4; ...
    'd1'  5   6; ...
    'd1'  7   8; ...
    'd2'  9   10; ...
    'd2'  11  12; ...
    'd3'  13  14; ...
    'd3'  15  16; ...
    'd3'  17  18; ...
    'd3'  19  20; ...
    }, 'VariableNames', { ...
    'DeviceName', 'data1', 'data2'} );

% The "unique" built-in function is pretty efficient, and 
%     returns some useful secondary outputs.  We're going to use
%     the 3rd output, which I have named ixs2
[deviceNames, ixs1, ixs2] = unique(  exampleData.DeviceName  );

%Now, based on the "deviceNames", and "ixs2" output, we can just loop
%    through and save output    
for ixDevice = 1:length(deviceNames)
    curDevice = deviceNames{ixDevice};
    %ixs2 holds numeric group indexes, so compare against the loop index
    curMask = (ixs2 == ixDevice);
    
    curData = exampleData(curMask,:);
    
    %Save data here. Save the whole thing at once.
    %    Name, if needed, is: curDevice
    %    Datatable is: curData
    
end
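
To fill in the "save data here" step, one option is writetable, which writes a whole table in a single call. The following is only a minimal sketch: the output folder (under tempdir) and the device_<name>.csv file-name pattern are assumptions made for illustration, not part of the original question.

```matlab
% Small sample table, mirroring the example above.
exampleData = cell2table({ ...
    'd1', 1, 2; ...
    'd1', 3, 4; ...
    'd2', 9, 10; ...
    'd3', 13, 14; ...
    }, 'VariableNames', {'DeviceName', 'data1', 'data2'});

% ixs2 maps each row to its position in deviceNames.
[deviceNames, ~, ixs2] = unique(exampleData.DeviceName);

% Hypothetical output folder; replace with a real destination.
outDir = fullfile(tempdir, 'deviceSplit');
if ~exist(outDir, 'dir')
    mkdir(outDir);
end

for ixDevice = 1:numel(deviceNames)
    % All rows belonging to the current device.
    curData = exampleData(ixs2 == ixDevice, :);

    % Hypothetical file-name pattern: one CSV per device.
    outFile = fullfile(outDir, sprintf('device_%s.csv', deviceNames{ixDevice}));

    % Write every row for this device in one call.
    writetable(curData, outFile);
end
```

If the device IDs are numeric, as in the question, unique returns a numeric vector instead of a cell array, so the name would be built with deviceNames(ixDevice) and a %d format instead.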

For anyone who is not running a live version of MATLAB on the side, the outputs of the unique call in this case are as follows:

%The standard output, a list of unique names
deviceNames =
  3×1 cell array
    {'d1'}
    {'d2'}
    {'d3'}

%A set of indexes into the original set, one per unique value, such that
%   exampleData.DeviceName(ixs1) reproduces deviceNames. When a value
%   occurs more than once, unique points to the first occurrence by default.
ixs1 =
     1
     5
     7

%A set of indexes, which map each element of the original set to the 
%    unique set from output argument #1. This is often the most 
%    useful output. This question is a decent example. It can also
%    be used as the first input to an "accumarray" function call, which
%    can be incredibly powerful.
ixs2 =
     1
     1
     1
     1
     2
     2
     3
     3
     3
     3
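
As a brief illustration of the accumarray remark, the sketch below uses the same third output of unique to compute per-device row counts and per-device sums without a loop. The variable names rowsPerDevice and sumPerDevice are made up for this example.

```matlab
% Same device names and first data column as the sample table above.
deviceName = {'d1';'d1';'d1';'d1';'d2';'d2';'d3';'d3';'d3';'d3'};
data1      = [1; 3; 5; 7; 9; 11; 13; 15; 17; 19];

% ixs2 maps each row to its position in deviceNames.
[deviceNames, ~, ixs2] = unique(deviceName);

% Count the rows per device by accumulating a 1 for every row.
rowsPerDevice = accumarray(ixs2, 1);      % [4; 2; 4]

% Sum data1 per device (accumarray sums by default).
sumPerDevice = accumarray(ixs2, data1);   % [16; 20; 64]
```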