I'm trying to split on a character only at its first occurrence (and keep everything after it unchanged).
Here is an example:
- val=aabbcc,val2==aabb,val3=aa==bb, val4=a=bccc
The pattern is: (any characters except =) = (any characters)
Here is the result that I wanted:
- val='aabbcc',val2='=aabb',val3='aa==bb', val4='a=bccc'
I tried ([^=]+)=([^=]+) and ([^=]+)=(.*), but neither works.
CodePudding user response:
You can use
([^,=]+)=(.*?)(?=,[^,=]+=|$)
See the regex demo.
Details:
- ([^,=]+) - Group 1: one or more chars other than a comma and =
- = - a = char
- (.*?) - Group 2: any zero or more chars other than line break chars, as few as possible
- (?=,[^,=]+=|$) - a location that is immediately followed by either a comma, then one or more chars other than , and =, and then a = char, or the end of the string.
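As a quick check of the breakdown above, here is a minimal sketch that runs the pattern (written with its + quantifiers) over the sample string from the question. The variable names are only illustrative.

```python
import re

# Sample input taken from the question.
text = "val=aabbcc,val2==aabb,val3=aa==bb, val4=a=bccc"

# Group 1: the key (no commas, no "=").
# Group 2: the value, matched lazily up to the next ",key=" or the end of the string.
pattern = re.compile(r"([^,=]+)=(.*?)(?=,[^,=]+=|$)")

pairs = pattern.findall(text)
print(pairs)
# [('val', 'aabbcc'), ('val2', '=aabb'), ('val3', 'aa==bb'), (' val4', 'a=bccc')]
```

Note that the last key comes back as ' val4' with a leading space, because [^,=]+ also matches whitespace.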
If you need to get rid of leading/trailing whitespace, you may use this more complex pattern:
([^\s,=](?:[^,=]*[^,=\s])?)\s*=\s*(\S.*?)?\s*(?=,[^,=]+=|$)
See this regex demo.
See the Python demo:
import re
text = "val=aabbcc,val2==aabb,val3=aa==bb, val4=a=bccc"
result = re.findall(r'([^\s,=](?:[^,=]*[^,=\s])?)\s*=\s*(\S.*?)?\s*(?=,[^,=]+=|$)', text)
print(dict(result))
Output:
{'val': 'aabbcc', 'val2': '=aabb', 'val3': 'aa==bb', 'val4': 'a=bccc'}
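If you want the quoted string shown in the question rather than a dictionary, one possible sketch is to use re.sub with the simpler pattern and wrap Group 2 in single quotes. This assumes the original spacing between pairs should be preserved. The variable names are only illustrative.

```python
import re

# Sample input taken from the question.
text = "val=aabbcc,val2==aabb,val3=aa==bb, val4=a=bccc"

# Same simple pattern: key, "=", then the value up to the next ",key=" or end of string.
pattern = r"([^,=]+)=(.*?)(?=,[^,=]+=|$)"

# Keep the key as-is and wrap the value in single quotes.
# Text between matches (the commas) is left untouched by re.sub.
quoted = re.sub(pattern, r"\1='\2'", text)
print(quoted)
# val='aabbcc',val2='=aabb',val3='aa==bb', val4='a=bccc'
```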