Match pattern excluding a character

Time:10-20

I have the following situation. The character ";" is used as a separator, but some values unexpectedly contain a ";", like valu;2 or va;ue4 in this string:

...;01;value1;02;valu;2;03;value3;04;va;ue4;....

The pattern \d\d;.{6}; returns all the blocks, but I would like to loop over each block and get True/False depending on whether ; appears in the value .{6}. This way I would obtain 2 lists:

1. those having ; in the value .{6}

2. those not having ; in the value .{6}

The value isn't only alphanumeric; it can contain extra characters (* $ | ), but ; is not allowed in this use case.

I tried adding [^;] but without success.

How can I do this?

Thank you

CodePudding user response:

You can capture the values that contain no ; in one capturing group and the values that contain a ; in another. Then you can check which group participated in each match to see what you actually matched.

\d\d;(?:([^;\s]{6});|(\S{6});)

See the regex demo. Here, value1 and value3 are in Group 1, so no ; is present in those values. valu;2 and va;ue4 are in Group 2, so they contain a ;. The two group patterns are identical except that Group 2 also accepts ;, so whenever there is a match and Group 1 did not participate, the value must contain a ;.
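In Python, the two-group idea could be applied roughly as follows. This is only a sketch: the function name split_blocks and the list names clean and dirty are made up for illustration, and it assumes the blocks appear back to back as in the sample string from the question.

```python
import re

# Group 1 captures a value without ';', group 2 a value that contains ';'.
PATTERN = re.compile(r'\d\d;(?:([^;\s]{6});|(\S{6});)')

def split_blocks(text):
    """Return (blocks without ';' in the value, blocks with ';' in the value)."""
    clean, dirty = [], []
    for m in PATTERN.finditer(text):
        # Exactly one of the two groups participates in each match.
        if m.group(1) is not None:
            clean.append(m.group(0))
        else:
            dirty.append(m.group(0))
    return clean, dirty

sample = ';01;value1;02;valu;2;03;value3;04;va;ue4;'
clean, dirty = split_blocks(sample)
print(clean)  # ['01;value1;', '03;value3;']
print(dirty)  # ['02;valu;2;', '04;va;ue4;']
```

A group that did not participate in a match is reported as None by m.group(n), which is what the check relies on.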

CodePudding user response:

Values without ; can be obtained with this expression: \d\d;[^;]{6}

Values with ; can be obtained with this expression: \d\d;(?=[^;]{0,5};).{6}
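A quick way to try both expressions in Python, as a sketch only: the patterns are used exactly as written above (without a trailing ;), and the variable names are invented for this example.

```python
import re

sample = ';01;value1;02;valu;2;03;value3;04;va;ue4;'

# Six characters after the "NN;" prefix, none of which is ';'.
without_sep = re.compile(r'\d\d;[^;]{6}')
# The lookahead requires a ';' somewhere within the next six characters.
with_sep = re.compile(r'\d\d;(?=[^;]{0,5};).{6}')

ok_blocks = without_sep.findall(sample)
ko_blocks = with_sep.findall(sample)

print(ok_blocks)  # ['01;value1', '03;value3']
print(ko_blocks)  # ['02;valu;2', '04;va;ue4']
```

Since neither expression ends with ;, the returned blocks do not include the closing separator; appending ; to both patterns would include it, assuming every block is terminated by one.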

CodePudding user response:

Thank you Wiktor Stribizew, your regex works and returns 2 groups, but I don't really know how to implement it in Python. Anyway, following the response of the fourth bird, I will use a 2nd pattern and loop over the list of blocks found with the 1st pattern, like this:

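import re

myString = ';01;value1;02;valu;2;03;value3;04;va;ue4;'
# pattern1 matches every block; pattern2 matches only blocks whose value has no ';'
pattern1 = re.compile(r'\d\d;.{6};')
listOfBlocks = pattern1.findall(myString)
pattern2 = re.compile(r'\d\d;[^;]{6};')
listeOK = []  # blocks without ';' in the value
listeKO = []  # blocks with ';' in the value
for block in listOfBlocks:
    if pattern2.search(block):
        listeOK.append(block)
    else:
        listeKO.append(block)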