I have the following text:
MSH 1A C3
MSH B4-14 c3-1
AU1 C4 2
MA2A C1 1
And I want to take this information from it:
MSH 1A
MSH B4-14
AU1
MA2A
I tried this regex to highlight the C's:
(( C[0-9].*)|( c[0-9].*))
How can I match everything except for what I highlighted in my regex? This needs to be a one-line regex.
CodePudding user response:
You can shorten the pattern by using a character class, [Cc], instead of two alternatives.
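Using replace with an empty string (pass regex=True explicitly, since pandas 2.0 and later treat the pattern as a literal string by default):
df_elements['POINT_ID'] = df_elements['POINT_ID'].str.replace(r'\s[Cc]\d.*', "", regex=True)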
Using extract with a capture group:
df_elements['POINT_ID'] = df_elements['POINT_ID'].str.extract(r'^(.*?)(?=\s*[Cc]\d)', expand=False)
Both will result in:
POINT_ID
0 MSH 1A
1 MSH B4-14
2 AU1
3 MA2A
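To check both approaches end to end, here is a minimal self-contained sketch. It assumes the four sample strings from the question sit in a POINT_ID column of a DataFrame named df_elements; the variable names replaced and extracted are only illustrative.

```python
import pandas as pd

# Sample data from the question, assumed to live in a POINT_ID column.
df_elements = pd.DataFrame(
    {"POINT_ID": ["MSH 1A C3", "MSH B4-14 c3-1", "AU1 C4 2", "MA2A C1 1"]}
)

# Approach 1: delete everything from " C<digit>" / " c<digit>" to the end.
# regex=True is passed explicitly so the pattern is treated as a regex.
replaced = df_elements["POINT_ID"].str.replace(r"\s[Cc]\d.*", "", regex=True)

# Approach 2: capture the shortest prefix that is followed by
# optional whitespace, then C or c, then a digit.
extracted = df_elements["POINT_ID"].str.extract(
    r"^(.*?)(?=\s*[Cc]\d)", expand=False
)

print(replaced.tolist())
print(extracted.tolist())
```

One difference to keep in mind: if a row contains no C or c followed by a digit, replace leaves that row unchanged, while extract returns NaN for it.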