I have a dataframe column containing strings with the patterns below:
Col A
ABC29SEP2286AB
PQRST29SEP22FUN
I want to split each string using the date as the separator, so the desired output would be:
ColA ColB ColC
ABC 29SEP22 86AB
PQRST 29SEP22 FUN
Can you please advise how I can use the date part as the trigger for splitting the string? The date part will always have 7 characters, in the form ddmmmyy.
CodePudding user response:
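We can use str.extract here, with one capture group for the leading letters, one for the 7-character date, and one for whatever follows:
df[["ColA", "ColB", "ColC"]] = df["ColA"].str.extract(r'([A-Z]+)(\d{2}[A-Z]{3}\d{2})(.*)', expand=True)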
Check this regex demo to see the pattern working against your data.
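For anyone who wants to try this locally, here is a minimal, self-contained sketch of the str.extract approach. It assumes the input column is named "Col A" as in the question and that the date is always two digits, three capital letters, two digits; the variable names `pattern` and `parts` are just illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({"Col A": ["ABC29SEP2286AB", "PQRST29SEP22FUN"]})

# Group 1: leading capital letters
# Group 2: the ddmmmyy date (2 digits, 3 capital letters, 2 digits)
# Group 3: everything after the date
pattern = r"([A-Z]+)(\d{2}[A-Z]{3}\d{2})(.*)"

# With several capture groups, str.extract returns one column per group
parts = df["Col A"].str.extract(pattern)
parts.columns = ["ColA", "ColB", "ColC"]

print(parts)
```

Rows that do not match the pattern come back as NaN in all three columns, so it is worth checking for missing values if the real data is less regular than the sample.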
CodePudding user response:
You can use str.split, splitting on the date part and putting a capturing group around it so that the date is retained in the output:
df[['Col A', 'Col B', 'Col C']] = df['Col A'].str.split(r'(\d{2}[A-Z]{3}\d{2})', expand=True, regex=True)
Output:
   Col A    Col B Col C
0    ABC  29SEP22  86AB
1  PQRST  29SEP22   FUN
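To see why the capturing group matters, here is a small sketch using Python's own re.split, which, as far as I understand, is what the regex path of pandas str.split mirrors. The helper name `split_on_date` is hypothetical, and limiting the split to the first date is an extra assumption of mine, not something the question asks for.

```python
import re

# The parentheses make this a capturing group, so re.split keeps the
# matched date in the result instead of discarding it as a delimiter.
DATE = re.compile(r"(\d{2}[A-Z]{3}\d{2})")

def split_on_date(value):
    # maxsplit=1 stops after the first date, so a second date-like run
    # later in the string stays inside the last piece.
    return DATE.split(value, maxsplit=1)

print(split_on_date("ABC29SEP2286AB"))       # ['ABC', '29SEP22', '86AB']
print(split_on_date("PQRST29SEP22FUN"))      # ['PQRST', '29SEP22', 'FUN']
print(split_on_date("ABC29SEP22X30SEP22Y"))  # ['ABC', '29SEP22', 'X30SEP22Y']
```

Without a limit, a string containing two dates would split into five pieces, and assigning the expanded result to three columns would no longer line up. In pandas the equivalent limit should be the n argument of str.split.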