I have a dataframe:
df
date
13MAY2022
13MAY2022
13MAY2022
13MAY2022
How can I insert spaces between the day, month, and year parts of each value?
My expected output looks like this:
df
date modified_date
13MAY2022 13 MAY 2022
13MAY2022 13 MAY 2022
13MAY2022 13 MAY 2022
13MAY2022 13 MAY 2022
CodePudding user response:
Here is another str.replace approach using lookarounds:
regex = r'(?<=[A-Z])(?![A-Z])|(?<![A-Z])(?=[A-Z])'
df["modified_date"] = df["date"].str.replace(regex, r' ', regex=True)
Here is a regex demo showing that the replacement logic is working.
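To see where the lookaround pattern matches, here is a minimal sketch using only the standard-library re module instead of pandas. The helper name space_out and the second sample string are assumptions for illustration. The pattern matches the empty position at each boundary between an uppercase letter and any other character, so the replacement inserts a space without consuming anything.

```python
import re

# Same pattern as above: zero-width matches at letter/non-letter boundaries.
BOUNDARY = re.compile(r'(?<=[A-Z])(?![A-Z])|(?<![A-Z])(?=[A-Z])')

def space_out(value: str) -> str:
    """Insert a space at every boundary between uppercase letters and other characters."""
    return BOUNDARY.sub(' ', value)

print(repr(space_out("13MAY2022")))  # '13 MAY 2022'
print(repr(space_out("MAY2022")))    # ' MAY 2022' (a boundary also exists at the start)
```

Because the start or end of the string counts as a boundary when the value begins or ends with a letter, values like that pick up a leading or trailing space; calling .str.strip() afterwards would remove it.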
CodePudding user response:
Use Series.str.replace to match each run of non-digits and add a space before and after it:
df['modified_date'] = df['date'].str.replace(r'(\D+)', r' \1 ', regex=True)
# if spaces should be added only around runs of uppercase letters
df['modified_date'] = df['date'].str.replace(r'([A-Z]+)', r' \1 ', regex=True)
print (df)
date modified_date
0 13MAY2022 13 MAY 2022
1 13MAY2022 13 MAY 2022
2 13MAY2022 13 MAY 2022
3 13MAY2022 13 MAY 2022
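The two patterns give the same result on this data but differ once other characters appear. A small sketch with the standard-library re module shows the difference; the helper names and the hyphenated sample value are hypothetical, not from the question.

```python
import re

def pad_non_digits(value: str) -> str:
    # Wrap each run of non-digit characters in spaces.
    return re.sub(r'(\D+)', r' \1 ', value)

def pad_uppercase(value: str) -> str:
    # Wrap only runs of uppercase ASCII letters in spaces.
    return re.sub(r'([A-Z]+)', r' \1 ', value)

print(pad_non_digits("13MAY2022"))    # 13 MAY 2022
print(pad_uppercase("13MAY2022"))     # 13 MAY 2022
print(pad_non_digits("13-MAY-2022"))  # 13 -MAY- 2022
print(pad_uppercase("13-MAY-2022"))   # 13- MAY -2022
```

The same pattern strings can be passed to Series.str.replace with regex=True, as in the answer above.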
With converting to datetimes:
df['modified_date'] = pd.to_datetime(df['date']).dt.strftime('%d %B %Y')
print (df)
date modified_date
0 13MAY2022 13 May 2022
1 13MAY2022 13 May 2022
2 13MAY2022 13 May 2022
3 13MAY2022 13 May 2022
df['modified_date'] = pd.to_datetime(df['date']).dt.strftime('%d %B %Y').str.upper()
print (df)
date modified_date
0 13MAY2022 13 MAY 2022
1 13MAY2022 13 MAY 2022
2 13MAY2022 13 MAY 2022
3 13MAY2022 13 MAY 2022
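If you would rather not rely on format inference, the layout can be spelled out as a format string. The sketch below uses the standard-library datetime module to show the idea; the helper name reformat_date and the January sample are assumptions, and the month names shown assume an English locale. It also shows that %b and %B only agree for May, so %b is the closer match if other months should stay abbreviated.

```python
from datetime import datetime

def reformat_date(value: str, out_format: str = "%d %b %Y") -> str:
    # Parse day, abbreviated month name, and four-digit year with no separators.
    parsed = datetime.strptime(value, "%d%b%Y")
    # Upper-case the result to match the original month style.
    return parsed.strftime(out_format).upper()

print(reformat_date("13MAY2022"))              # 13 MAY 2022
print(reformat_date("05JAN2022"))              # 05 JAN 2022
print(reformat_date("05JAN2022", "%d %B %Y"))  # 05 JANUARY 2022
```

The same input format string can be given to pd.to_datetime through its format parameter, for example pd.to_datetime(df['date'], format='%d%b%Y').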
CodePudding user response:
One way to do it:
import dateutil.parser
df['modified_date'] = df['date'].apply(
    lambda x: dateutil.parser.parse(x).strftime("%d %b %Y").upper())
dateutil.parser.parse(x) infers a datetime object from the string, strftime("%d %b %Y") turns the datetime object back into a string in the form "day month year", and then I put upper() at the end to match the original case. If some of your dates are not in this format, though, the result won't preserve their original format.