I have DataFrame in Pandas like below:
DATA TYPES:
ID - numeric
HOLIDAY - object
YEAR - object
ID HOLIDAY YEAR 111 1 sty 2022 222 20 kwi 2022 333 8 mar 2022 ... ... ... sty - January
kwi - APril
mar - March
And I need to convert above table so as to have full and useful date (as string format).
So, I need to have something like below:
ID | HOLIDAY | YEAR
----|-------------|-------
111 | 01-01-2022 | 2022
222 | 20-02-2022 | 2022
333 | 08-03-2022 | 2022
... | ... | ...
How can I do that in Python Pandas ?
I used somethink like that:
df['HOLIDAY'] = pd.to_datetime(df['HOLIDAY'] " " df['YEAR'] , format='%d %b %Y')
df['HOLIDAY'] = df['HOLIDAY'].dt.strftime('%d-%m-%Y')
but it generate error like the follow: ValueError: time data '1 sty 2022' does not match format '%d %b %Y' (match)
CodePudding user response:
hello you can use this:
d={'sty':'-1','kwi':'-4','mar':'-3'} #creat dict
a=df.HOLIDAY.tolist() # creat list of original holiday
for i in range(len(df)):
for word, replacement in d.items():
a[i] = a[i].replace(word, replacement)# creat a loop that replace the mount by her number
a[i] = a[i].replace(" ", "")# delete the space ex '1 sty' -> '1sty'
df.HOLIDAY=a
l=[]
for i in range(len(df)):
l.append(str(df.HOLIDAY[i]) '-' str(df.YEAR[i]))#loop that concatenat year and holiday
df.HOLIDAY=l# replace holiday in df by new values
df.HOLIDAY=pd.to_datetime(df.HOLIDAY, format="%d-%m-%Y")#transform holiday from str to datetime
CodePudding user response:
Try:
df["HOLIDAY"] = df["HOLIDAY"].apply(
lambda x: x.replace("sty", "January")
.replace("kwi", "April")
.replace("mar", "March")
)
df["HOLIDAY"] = pd.to_datetime(
df["HOLIDAY"] " " df["YEAR"].astype(str)
).dt.strftime("%d-%m-%Y")
print(df)
Prints:
ID HOLIDAY YEAR
0 111 01-01-2022 2022
1 222 20-04-2022 2022
2 333 08-03-2022 2022