Home > Mobile >  Assign Date to Missing Cell Based on Dataframe
Assign Date to Missing Cell Based on Dataframe

Time:09-16

I have a dataframe as below. My question is, i want to assign the missing dates based on the first day of the Month available in the dataset. The dates will only be for a single month per file. It wont be multiple months in a single file. One cell will always have a date.

Please note that the files could be from a previous month so i cant use Today's date. I have used

fillna(method='ffill')

But this only works if the first cell has a date in it. If its blank itll make the rest blank as well.

EDIT: The filled dates can be the first of the month or ANY previously populated dates in the column. As long as they are from the same month, it should be fine.

Sample Data:

Other Columns Date Invoice Other Columns
... ABC232 ...
... 5-27-2022 ABC232 ...
... 5-27-2022 ANBFN2323 ...
... SADNF343 ...
... 1232HHH ...

Expected Output:

Other Columns Date Invoice Other Columns
... 5-01-2022 ABC232 ...
... 5-27-2022 ABC232 ...
... 5-27-2022 ANBFN2323 ...
... 5-01-2022 SADNF343 ...
... 5-01-2022 1232HHH ...

Please let me know if you have any questions. I cant find anything to even get me started.

Thanks

CodePudding user response:

import re

# Get the index of the first non-NA Date
idx = df["Date"].first_valid_index()
# Replace the day with 01
fill_value = re.sub(r"(\d )-(\d )-(\d )", r"\1-01-\3", df.loc[idx, "Date"])
# Fill
df["Date"] = df["Date"].fillna(fill_value)
  • Related