I have spent two days looking for an answer to this with no luck, so here I am. Also, I am very new to Python.
I have a script that reads in multiple files. Each file has a different date format, which I am able to handle using
temp_df['Invoice Date'] = pd.to_datetime(temp_df['Invoice Date'],format='%d/%m/%Y')
I have a few issues that I can't seem to solve:
1. One of my files has both 2022-03-17 and 04/03/2022, in (YYYY-MM-DD) and (DD/MM/YYYY) format respectively. What I'm trying to do is apply a different to_datetime() statement for each format, and I could not figure out a way for the life of me. I tried not specifying a format, but then it gets confused and messes up the rest of the dates too. Please note that the data is only for March.
So what I thought to do was, for example, if
pd.to_datetime(temp_df['Invoice Date'],format='%d/%m/%Y')
fails or gives an error, try
pd.to_datetime(temp_df['Invoice Date'],format='%Y-%m-%d')
2. One of my files is missing a date for a transaction; I want to apply the first of the current month to that record. I have tried the below, but it applies the date to all records.
if temp_df['Distributor Invoice Date'].isnull(): temp_df['Distributor Invoice Date'] = datetime.date.today().replace(day=1)
3. I want a new column called Month that uses the date from temp_df['Invoice Date'].
Please let me know if anything is not clear and I will respond ASAP.
Thanks, Waleed
CodePudding user response:
Try:
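import datetime
import pandas as pd

# 1. Date format: treat ambiguous dates such as 04/03/2022 as day-first
temp_df['Invoice Date'] = pd.to_datetime(temp_df['Invoice Date'], dayfirst=True)
# 2. Fill only the missing transaction dates with the first of the current month
d = datetime.date.today().replace(day=1)
temp_df['Distributor Invoice Date'] = temp_df['Distributor Invoice Date'].fillna(d)
# 3. New column holding the month number of each invoice date
temp_df['Month'] = temp_df['Invoice Date'].dt.month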
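
If dayfirst=True alone does not separate the two formats in your pandas version, one alternative is the "try one format, then the other" idea from the question, done per row instead of per column. The sketch below is only an illustration under stated assumptions: parse_mixed_dates is a hypothetical helper name, the sample DataFrame is made-up data standing in for one of the files, and it assumes the column contains only the two formats mentioned (DD/MM/YYYY and YYYY-MM-DD).

```python
import pandas as pd


def parse_mixed_dates(values: pd.Series) -> pd.Series:
    """Parse a column holding both DD/MM/YYYY and YYYY-MM-DD strings.

    Hypothetical helper, not part of pandas.
    """
    # Try the first format; rows that do not match become NaT instead of raising.
    day_first = pd.to_datetime(values, format="%d/%m/%Y", errors="coerce")
    # Try the second format on the same raw values.
    iso = pd.to_datetime(values, format="%Y-%m-%d", errors="coerce")
    # Keep the day-first result where it parsed, otherwise fall back to ISO.
    return day_first.fillna(iso)


# Made-up sample data standing in for one of the input files.
temp_df = pd.DataFrame(
    {
        "Invoice Date": ["2022-03-17", "04/03/2022", "2022-03-01"],
        "Distributor Invoice Date": ["2022-03-17", None, "01/03/2022"],
    }
)

# 1. Mixed formats: each row is parsed with whichever format matches it.
temp_df["Invoice Date"] = parse_mixed_dates(temp_df["Invoice Date"])

# 2. Missing dates: fillna touches only the empty rows, not the whole column.
first_of_month = pd.Timestamp.today().normalize().replace(day=1)
temp_df["Distributor Invoice Date"] = parse_mixed_dates(
    temp_df["Distributor Invoice Date"]
).fillna(first_of_month)

# 3. Month number (1-12) taken from the parsed invoice date.
temp_df["Month"] = temp_df["Invoice Date"].dt.month
```

Using errors="coerce" turns non-matching rows into NaT rather than raising, which is what lets the second format fill in the gaps. Any value that matches neither format stays NaT, so it is worth checking isna() on the result before relying on it.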