I have some prepaid sales orders and need to assign the sales value to the subsequent months according to the number of months the sales order cover (paid/month). the dataframe looks like this:
order number of months year start month start year paid
1 2 2021 10 2021 300
2 3 2021 10 2021 300
3 1 2021 11 2021 50
.... and it should look like
order months year start_month start_year paid 2021_10 2021_11 2021_12
1 2 2021 10 2021 300 150 150
2 3 2021 10 2021 300 100 100 100
3 1 2021 11 2021 50 50
....
Thank you!
CodePudding user response:
I am assuming you are using pandas.
First, you will need to get all the combinations of months and years in which there were sales. This depends on your edge cases and data. but for the sake of the example:
date_combinations = [(10,2021), (11,2021), (12,2021)]
Now, the way to dynamically add columns and fill them based on multiple other rows is to use the 'apply' function on the entire data frame and get the rows like this:
def fill_date_col(row):
if date[0] < row["start_month"] row["months"] and date[0] >= row["start_month"] and row["start_year"] == date[1] :
return row["paid"]/row["months"]
else:
return None
for date in date_combinations:
df[f"{date[0]}_{date[1]}"] = df.apply(lambda row: fill_date_col(row), axis=1)
and this is the result
order months start_month start_year paid 10_2021 11_2021 12_2021
0 1 2 10 2021 300 150.0 150.0 NaN
1 2 3 10 2021 300 100.0 100.0 100.0
2 3 1 11 2021 50 NaN 50.0 NaN
Of course, you need the adjust this with regard to your span of dates and edge cases, especially if some sales span over a year.