I would like to extract integers from the strings in a cell array in MATLAB. Each string contains one or two integers, formatted as shown below, and each number has one or two digits. I would like to convert each string to a 1x2 array. If the string contains only one number, the second column should be -1. If it contains two numbers, the first entry should be the first number and the second entry should be the second number.
'[1, 2]'
'[3]'
'[10, 3]'
'[1, 12]'
'[11, 12]'
Thank you very much!
I have tried a few different methods that did not work out. I think that I need to use regex and am having difficulty finding the proper expression.
CodePudding user response:
You can use str2num to convert well-formatted chars (which you appear to have) to the correct arrays/scalars. Then simply pad from element end+1 to element 2 with the value -1 (this does nothing when there are already two elements).
This is most clearly done in a small loop; see the comments for details:
% Set up the input
c = { ...
'[1, 2]'
'[3]'
'[10, 3]'
'[1, 12]'
'[11, 12]'
};
n = cell(size(c)); % Initialise output
for ii = 1:numel(n) % Loop over chars in 'c'
    n{ii} = str2num(c{ii}); % Convert char to numeric array
    n{ii}(end+1:2) = -1;    % Extend (if needed) to 2 elements, filling with -1
end
% (Optional) Convert from a cell to an Nx2 array
n = cell2mat(n);
If you really wanted to use regex, you could replace the loop part with something like this:
n = regexp( c, '\d{1,2}', 'match' ); % Match between one and two digits
for ii = 1:numel(n)
    n{ii} = str2double(n{ii}); % Convert cellstr of chars to a numeric array
    n{ii}(end+1:2) = -1;       % Pad to be at least 2 elements
end
But there are lots of ways to do this without touching regex. For example, you could erase the square brackets, split on a comma, and pad with -1 according to whether or not there's a comma in each row. Wrap it all in a cellfun (much harder to read than a loop) and ta-dah, you get a one-liner:
n = cellfun( @(x) [str2double( strsplit( erase(x,{'[',']'}), ',' ) ), -1*ones(1,1-nnz(x==','))], c, 'uni', 0 );
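For comparison, the same erase/split/pad steps can be unrolled into a loop. This is only a sketch under a few assumptions: the variable names s, parts, and vals are illustrative, the result is written straight into an Nx2 numeric array rather than a cell, and erase is assumed to be available (it only exists in newer MATLAB releases, R2016b onward as far as I know).

```matlab
% Same input as above
c = {'[1, 2]'; '[3]'; '[10, 3]'; '[1, 12]'; '[11, 12]'};

n = zeros(numel(c), 2);              % One row per string, two columns
for ii = 1:numel(c)
    s = erase(c{ii}, {'[', ']'});    % Strip the square brackets, e.g. '[10, 3]' becomes '10, 3'
    parts = strsplit(s, ',');        % Split on the comma into a cell of chars
    vals = str2double(parts);        % Convert to numbers (surrounding spaces are ignored)
    if numel(vals) < 2
        vals(2) = -1;                % Only one number found, so pad the second column
    end
    n(ii, :) = vals;                 % Store the 1x2 result
end
```

The explicit if replaces the nnz(x==',') trick from the one-liner, which makes the padding rule easier to see when debugging.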
I'd recommend one of the loops for ease of reading and debugging.