Split a string in a column when it encounters a date

Time:10-01

I have a dataframe column containing strings that follow the patterns below:

Col A

ABC29SEP2286AB

PQRST29SEP22FUN

I want to split each string so that the date acts as the separator. The desired output would be:

ColA ColB ColC

ABC 29SEP22 86AB

PQRST 29SEP22 FUN

Can you please advise how I can use the date part as the trigger for splitting the string? The date part will always have 7 characters, in the form ddmmmyy.

CodePudding user response:

We can use str.extract here:

df[["ColA", "ColB", "ColC"]] = df["ColA"].str.extract(r'([A-Z]+)(\d{2}[A-Z]{3}\d{2})(.*)', expand=True)

Check this regex demo to see the pattern working against your data.
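
For reference, here is a minimal self-contained sketch of the str.extract approach on the sample data from the question. It assumes the source column is named "Col A" and that the date always matches two digits, three uppercase letters, and two digits; the variable names `pattern` and `parts` are only illustrative.

```python
import pandas as pd

# Sample data from the question; the column name "Col A" is an assumption.
df = pd.DataFrame({"Col A": ["ABC29SEP2286AB", "PQRST29SEP22FUN"]})

# Three capture groups: leading letters, the ddmmmyy date, and the remainder.
pattern = r"([A-Z]+)(\d{2}[A-Z]{3}\d{2})(.*)"

# str.extract returns one column per capture group.
parts = df["Col A"].str.extract(pattern, expand=True)
parts.columns = ["ColA", "ColB", "ColC"]

print(parts)
```

Rows that do not match the pattern come back as NaN in all three columns, so it may be worth checking for missing values before relying on the result.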

CodePudding user response:

You can use str.split, splitting on the date part, putting a capturing group around it so that we retain the date in the output:

df[['Col A', 'Col B', 'Col C']] = df['Col A'].str.split(r'(\d{2}[A-Z]{3}\d{2})', expand=True, regex=True)

Output:

   Col A    Col B Col C
0    ABC  29SEP22  86AB
1  PQRST  29SEP22   FUN
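
To see why the capturing group matters, here is a small sketch using Python's built-in re module rather than pandas. The names `DATE`, `without_group`, and `with_group` are only illustrative. Judging by the output above, str.split with regex=True appears to treat the group the same way.

```python
import re

# ddmmmyy: two digits, three uppercase letters, two digits.
DATE = r"\d{2}[A-Z]{3}\d{2}"

s = "ABC29SEP2286AB"

# Without a capturing group, the matched separator is discarded.
without_group = re.split(DATE, s)

# With a capturing group, the matched date is kept as its own element.
with_group = re.split(f"({DATE})", s)

print(without_group)  # ['ABC', '86AB']
print(with_group)     # ['ABC', '29SEP22', '86AB']
```

Note that if the date sits at the very start or end of a string, the split produces an empty string on that side.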