How can I validate a SAS library name by using a regex?

Time:11-11

SAS library names follow three rules:

  1. no more than 8 characters;
  2. may consist of underscores, digits, and English letters;
  3. must start with an underscore or an English letter.

Here is my question: how can I use a Perl regular expression to check whether a string contains an invalid library name?

The string consists of words separated by single spaces, like the following:

sasuser work sashelp
sasuser work 7z sashelp
sasuser work dictionary

7z and dictionary do not satisfy the rules, so I want the output 0, 1, 1 for the three input strings.

I have tried this in SAS, but it doesn't work:

data test;
  input string&$42.;
  x=prxmatch('/\b(?=\S )(?![A-Za-z_][A-Za-z0-9_]{0,7})\b/',string);
  put x=;
  cards;
sasuser work sashelp
sasuser work 7z sashelp
sasuser work dictionary
;
run;

Thanks for any hint.


Edit:2022-11-11
I am really looking for a regex solution, whether or not it uses the SAS language. My idea is as follows:

  1. Check whether the string contains a word;
  2. The word does not match a regular expression;
  3. The regular expression describes the rules of SAS library naming;

Is that possible?

CodePudding user response:

You appear to be testing if ANY of the words in the list are valid librefs. Instead test each word in the string separately.

Note that SAS already has a function, NVALID(), to test if a name is valid, but you need to add an additional test to make sure the length is not too long to use as a libref or fileref.

data test;
  input string $80. ;
  do index=1 to countw(string,' ');
    word = scan(string,index,' ');
    nvalid=nvalid(word,'v7') and lengthn(word) in (1:8);
    x=prxmatch('/\b[A-Za-z_][A-Za-z0-9_]{0,7}\b/',word);
    output;
  end;
cards;
sasuser work sashelp
sasuser work 7z sashelp
sasuser work dictionary
;

Result

Obs            string             index    word          nvalid    x

  1    sasuser work sashelp         1      sasuser          1      1
  2    sasuser work sashelp         2      work             1      1
  3    sasuser work sashelp         3      sashelp          1      1
  4    sasuser work 7z sashelp      1      sasuser          1      1
  5    sasuser work 7z sashelp      2      work             1      1
  6    sasuser work 7z sashelp      3      7z               0      0
  7    sasuser work 7z sashelp      4      sashelp          1      1
  8    sasuser work dictionary      1      sasuser          1      1
  9    sasuser work dictionary      2      work             1      1
 10    sasuser work dictionary      3      dictionary       0      0
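
If one flag per string is wanted, as in the question, the per-word test can also be folded into a single pattern. The following is only a sketch under some assumptions: the dataset name flagged and the variable pos are made up for illustration, and the pattern relies on the lookahead and lookbehind support in the SAS PRX functions. It looks for a word start at which a valid libref does not begin, so x is 1 when at least one word breaks the rules and 0 otherwise.

```sas
data flagged;
  input string & $42.;
  /* Assumed pattern: (?<!\S) anchors at the start of a word;
     the negative lookahead rejects positions where a valid libref
     (1 to 8 name characters, then blank or end of string) begins;
     the final \S requires a real character, so the trailing
     padding of the variable never produces a match. */
  pos = prxmatch('/(?<!\S)(?![A-Za-z_][A-Za-z0-9_]{0,7}(?!\S))\S/', string);
  x = (pos > 0);   /* 1 = at least one invalid word, 0 = all words valid */
  put string= x=;
  cards;
sasuser work sashelp
sasuser work 7z sashelp
sasuser work dictionary
;
run;
```

Because prxmatch returns the position of the first match, pos also points at the first offending word, which may help when reporting the problem.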

CodePudding user response:

If you want to do this without a Perl regex, here is a solution:

First let's get some more sample data:

data test;
  input string&$42.;
  cards;
sasuser work  sashelp
sas_user _work 7z sashelp
sasuser work77 dictionary
;
run;

Here, the resulting column "valid" consists of a list of flags (1 for valid, 0 for invalid):

data validation (drop=i txt);
set test;
length valid $12;
do i=1 to countw(string);
  txt=scan(string,i,' ');
  if txt ne '' then do;
    if (length(txt) gt 8 
        or substr(txt,1,1) eq compress(substr(txt,1,1),'_' , 'a')
        or txt ne compress(txt, ,'kan')
        ) 
      then valid=catx(', ',valid,'0');
      else valid=catx(', ',valid,'1');
  end;
end;
run;

Result:

string                    valid
-----------------------------------------
sasuser work  sashelp       1, 1, 0
sas_user _work 7z sashelp   1, 1, 0, 1
sasuser work77 dictionary   1, 1, 0
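
If a single flag per string is preferred over a list of flags, the same word-by-word loop can be collapsed. This is a rough sketch rather than a tested solution: the dataset name flags and the variable word are hypothetical, and the check borrows the NVALID() plus length test from the other answer instead of the compress() logic used above. It sets x to 1 as soon as any word fails.

```sas
data flags (drop=i word);
  set test;
  x = 0;                                  /* assume every word is valid */
  do i = 1 to countw(string, ' ');
    word = scan(string, i, ' ');
    /* a libref must be a valid V7 name and at most 8 characters long */
    if not (nvalid(word, 'v7') and lengthn(word) <= 8) then x = 1;
  end;
run;
```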