Replace every non letter or number character in a string with another

Time:07-07

Context

I am writing code that runs a series of calculations and outputs figures. At the end, I want to save everything neatly, so my plan is to go to a user-specified output directory, create a new folder there, and then run the save process.

Question(s)

My question is twofold:

  1. I want my folder name to be unique. I was thinking about getting the current date and time and creating a unique name from this and the input filename. This works but it generates folder names that are a bit cryptic. Is there some good practice / convention I have not heard of to do that?

  2. When I get the datetime string (tn = datestr(now);), it looks like this:

tn =

'07-Jul-2022 09:28:54'

To convert it to a nice filename, I replace the '-', ' ' and ':' characters with underscores and append the result to a shortened version of the input filename chosen by the user. I do that using strrep:

tn = strrep(tn,'-','_');
tn = strrep(tn,' ','_');
tn = strrep(tn,':','_');

This is fine, but it bugs me to need 3 lines of code for it. Is there a nice one-liner to do that? More generally, is there a way to find every character in a string that is not a letter or a number and replace it with a given character? I bet that's what regexp is there for, but frankly I can't quite get the hang of how regexps work.

CodePudding user response:

Your point (1) is opinion-based, so you might get a variety of answers, but I think a common convention is to at least start the name with a reverse-order date string so that sorting alphabetically is the same as sorting chronologically (e.g. yymmddHHMMSS).

To answer your main question directly, you can use the built-in makeValidName utility which is designed for making valid variable names, but works for making similarly "plain" file names.

str = '07-Jul-2022 09:28:54';
str = matlab.lang.makeValidName(str)
% str = 'x07_Jul_202209_28_54'

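Because a valid variable name can't start with a number, it prefixes an x; you could avoid this by manually prefixing something more descriptive first. Note also that makeValidName deletes whitespace rather than replacing it, which is why 2022 and 09 run together in the output above.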

This option is a bit simpler than working out the regex, although the regex isn't too nasty here: use regexprep and replace the non-alphanumeric chars with an underscore:

str = regexprep( str, '\W', '_' ); % \W (capital W) matches any char that is not a letter, digit or underscore
% str = '07_Jul_2022_09_28_54'
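
If you would rather spell out the allowed characters yourself, or collapse a run of separators into a single underscore, a negated character class is one way to do it. This is only a sketch; the input string is a made-up example, not something from the question:

```matlab
str = 'my file - v2.txt';   % hypothetical input, for illustration only

% [^A-Za-z0-9] matches any single char that is not an ASCII letter or digit;
% the trailing + makes a whole run of such chars match as one unit,
% so ' - ' becomes a single underscore instead of three
clean = regexprep( str, '[^A-Za-z0-9]+', '_' );
% clean = 'my_file_v2_txt'
```

Drop the + if you want one underscore per replaced character, as in the \W version above.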

To answer indirectly with a different approach, a nice trick with datestr which gets around this issue and addresses point (1) in one hit is to use the following syntax:

str = datestr( now(), 30 );
% str = '20220707T094214'

The 30 input (from the docs) gives you an ISO standardised string to the nearest second in reverse-order:

'yyyymmddTHHMMSS' (ISO 8601)

(note the T in the middle isn't a placeholder for some time measurement, it remains a literal letter T to split the date and time parts).
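
Putting this together with the folder creation from the question might look roughly like the following. The names outputDir and inputFile are placeholders I made up (tempdir just stands in for the user-specified directory), so adapt them to your code:

```matlab
outputDir = tempdir;               % placeholder for the user-specified output directory
inputFile = 'my_input_data.csv';   % placeholder input filename

[~, baseName] = fileparts( inputFile );   % strip the path and the extension
stamp = datestr( now(), 30 );             % e.g. '20220707T094214'

% Timestamp first, so alphabetical order is also chronological order
folderName = [stamp '_' baseName];
folderPath = fullfile( outputDir, folderName );

% Create the folder only if it does not exist yet
if ~exist( folderPath, 'dir' )
    mkdir( folderPath );
end
```

The timestamp only resolves to the second, so two runs on the same input within one second would end up in the same folder; add a counter or random suffix if that can happen in your workflow.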

CodePudding user response:

I normally use your folder naming approach with a meaningful prefix, replacing ':' by something else:

folder_name = ['results_' strrep(datestr(now), ':', '.')];

As for your second question, you can use isstrprop:

folder_name(~isstrprop(folder_name, 'alphanum')) = '_';

Or, if you want more control over the allowed characters, you can use good old ismember:

folder_name(~ismember(folder_name, ['0':'9' 'a':'z' 'A':'Z'])) = '_';
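
For example, to keep hyphens and underscores as well, you could extend the allowed set. This is just a sketch with a fixed date string so the result is reproducible:

```matlab
folder_name = 'results_07-Jul-2022 09:28:54';   % fixed example string

% Allowed characters: digits, ASCII letters, plus underscore and hyphen
allowed = ['0':'9' 'a':'z' 'A':'Z' '_-'];

% Logical indexing: every char not in the allowed set becomes '_'
folder_name(~ismember(folder_name, allowed)) = '_';
% folder_name = 'results_07-Jul-2022_09_28_54'
```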