My excel spreadsheet is for the form as below.
A | B |
---|---|
Part 1- 20210910 | 55 |
Part 2- 20210829 | 45 |
Part 3- 20210912 | 2 |
I would like to take the strings from Column A "Part A- 20210910" but read it using Pandas as "2021/09/10", a date format. How could I implement this?
CodePudding user response:
My beginner way of doing it:
import pandas as pd
df = pd.read_excel('file_name.xlsx')
df['A'] = df['A'].apply(lambda x: x.split('-')).apply(lambda x: x[1]).apply(lambda x: pd.to_datetime(str(x), format='%Y%m%d'))
Output
CodePudding user response:
IIUC:
df['A'] = df['A'].str.extract(r'(\d{8})').astype('datetime64')
print(df)
# Output:
A B
0 2021-09-10 55
1 2021-08-29 45
2 2021-09-12 2