I have a single CSV file containing data uploaded from multiple IoT sensors, with each device's rows distinguished by a unique numeric ID. I convert this into a MATLAB table using the readtable() function. I can then group the rows by device by calling sortrows() on the Device ID column.
However, how would I split the grouped devices into separate tables? Currently I use the following algorithm:
for g = 1:numDevices %create a new file for each device type
    outputFileIndoorData1 = sprintf("C:\Users\Documents\duetTestFile%d",g);
    for h = 1:height(indoorDataSorted) %Iterate through entire CSV, splitting by device ID
        if strcmp(indoorDataSorted.device_id(h),deviceIDs(g,1))
            writelines(indoorDataSorted(h,1:17),outputFileIndoorData1); %Write to individual file specified
        end
    end
end
However, this is extremely resource intensive. What would be a more efficient way to separate each device's data into a different file?
CodePudding user response:
You should be able to do this fairly efficiently with a couple of lines based on the unique function.
%Setup a small sample data table to work with.
exampleData = cell2table({...
'd1' 1 2; ...
'd1' 3 4; ...
'd1' 5 6; ...
'd1' 7 8; ...
'd2' 9 10; ...
'd2' 11 12; ...
'd3' 13 14; ...
'd3' 15 16; ...
'd3' 17 18; ...
'd3' 19 20; ...
}, 'VariableNames', { ...
'DeviceName', 'data1', 'data2'} );
% The "unique" built-in function is pretty efficient, and
% returns some useful secondary outputs. We're going to use
% the 3rd output, which I have named ixs2
[deviceNames, ixs1, ixs2] = unique( exampleData.DeviceName );
%Now, based on the "deviceNames" and "ixs2" outputs, we can just loop
% through and save output
for ixDevice = 1:length(deviceNames)
    curDevice = deviceNames{ixDevice};
    % ixs2 holds, for each row, the position of its device in deviceNames,
    % so compare it against the loop index (not against the name itself).
    curMask = (ixs2 == ixDevice);
    curData = exampleData(curMask,:);
    %Save data here. Save the whole thing at once.
    % Name, if needed, is: curDevice
    % Datatable is: curData
end
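For the "save data here" step, a minimal sketch using writetable could look like the following. The output folder (a subfolder of tempdir) and the device_<name>.csv file-name pattern are assumptions made for illustration, and the sample table is a shortened copy of the one above so the snippet runs on its own.

```matlab
% Shortened copy of the sample table, so this snippet is self-contained.
exampleData = cell2table({ ...
    'd1'  1  2; ...
    'd1'  3  4; ...
    'd2'  9 10; ...
    'd3' 13 14; ...
    }, 'VariableNames', {'DeviceName', 'data1', 'data2'});

% Hypothetical output folder; replace with wherever the files should go.
outDir = fullfile(tempdir, 'deviceSplit');
if ~exist(outDir, 'dir')
    mkdir(outDir);
end

[deviceNames, ~, ixs2] = unique(exampleData.DeviceName);

outFiles = cell(numel(deviceNames), 1);
for ixDevice = 1:numel(deviceNames)
    % All rows belonging to the current device, selected in one step.
    curData = exampleData(ixs2 == ixDevice, :);
    % Hypothetical file-name pattern: device_<name>.csv
    outFiles{ixDevice} = fullfile(outDir, ...
        sprintf('device_%s.csv', deviceNames{ixDevice}));
    % One write call per device, instead of one per row.
    writetable(curData, outFiles{ixDevice});
end
```

If the device IDs are numeric, as in the question, unique returns a numeric vector rather than a cell array, so the name would be built with deviceNames(ixDevice) and a %d format instead.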
For anyone who is not running a live copy of MATLAB on the side, the outputs of the unique call in this case are as follows:
%The standard output, a list of unique names
deviceNames =
3×1 cell array
{'d1'}
{'d2'}
{'d3'}
%A set of indexes into the original set, one per unique value, such that
% deviceNames equals exampleData.DeviceName(ixs1). Strictly speaking, this
% choice doesn't have to be unique when a value repeats. But by default
% the function points to the first occurrence.
ixs1 =
1
5
7
%A set of indexes, which map each element of the original set to the
% unique set from output argument #1. This is often the most
% useful output. This question is a decent example. It can also
% be used as the first input to an "accumarray" function call, which
% can be incredibly powerful.
ixs2 =
1
1
1
1
2
2
3
3
3
3
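As a small illustration of the accumarray remark, here is a sketch that computes per-device summaries from ixs2 without any loop. The variable names rowsPerDevice, sumData1, meanData2 and summary are made up for this example.

```matlab
% Same sample table as in the answer above.
exampleData = cell2table({ ...
    'd1'  1  2; ...
    'd1'  3  4; ...
    'd1'  5  6; ...
    'd1'  7  8; ...
    'd2'  9 10; ...
    'd2' 11 12; ...
    'd3' 13 14; ...
    'd3' 15 16; ...
    'd3' 17 18; ...
    'd3' 19 20; ...
    }, 'VariableNames', {'DeviceName', 'data1', 'data2'});

[deviceNames, ~, ixs2] = unique(exampleData.DeviceName);

% Count rows per device: accumulate the value 1 into each group.
rowsPerDevice = accumarray(ixs2, 1);                          % [4; 2; 4]

% Sum of data1 per device (summation is accumarray's default).
sumData1 = accumarray(ixs2, exampleData.data1);               % [16; 20; 64]

% Mean of data2 per device, using a custom reduction function.
meanData2 = accumarray(ixs2, exampleData.data2, [], @mean);   % [5; 11; 17]

% Collect the results in one table, one row per device.
summary = table(deviceNames, rowsPerDevice, sumData1, meanData2);
```

The first input tells accumarray which group each row belongs to, which is exactly what ixs2 provides.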