I am trying to extract only the purely alphabetic portion of a string, excluding anything in parentheses and any alphanumeric tokens. My current code extracts every run of letters, including the letters inside the alphanumeric tokens.
df['desc'] = df['description'].str.findall(r'[a-zA-Z]+')
AERONAUTICAL MOBILE (OR) AUS52 AUS57 AUS58 AUS101
How do I only get AERONAUTICAL MOBILE from this string using regex?
CodePudding user response:
Assuming the all-alphabetic portion of the description always starts at the beginning of the string, we can use str.extract as follows:
import re
df["desc"] = df["description"].str.extract(r'^([a-z]+(?: [a-z]+)*)', flags=re.I)
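To see how this behaves, here is a small sketch. Only the first row is taken from the question; the other rows, the column names desc_basic and desc_strict, and the stricter pattern are my own assumptions added for illustration. The stricter variant adds \b after each word, which may help if an alphanumeric token can directly follow the alphabetic words with nothing like "(OR)" in between.

```python
import re
import pandas as pd

# Hypothetical sample data; only the first row comes from the question.
df = pd.DataFrame({
    "description": [
        "AERONAUTICAL MOBILE (OR) AUS52 AUS57 AUS58 AUS101",
        "FIXED AUS52 AUS57",
        "123 NO LEADING WORDS",
    ]
})

# Pattern from the answer: a leading run of space-separated alphabetic words.
basic = r"^([a-z]+(?: [a-z]+)*)"

# Assumed variant: \b requires each word to end where its letters end,
# so the letters of a token such as "AUS52" are not picked up.
strict = r"^([a-z]+\b(?: [a-z]+\b)*)"

# expand=False returns a Series instead of a one-column DataFrame.
df["desc_basic"] = df["description"].str.extract(basic, flags=re.I, expand=False)
df["desc_strict"] = df["description"].str.extract(strict, flags=re.I, expand=False)

print(df[["desc_basic", "desc_strict"]])
```

Both patterns give AERONAUTICAL MOBILE for the string in the question. On the second row the answer's pattern gives FIXED AUS, while the stricter one gives FIXED. Rows that do not start with a letter come back as NaN.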