I have code that needs to create a copy of each row whose 'Date_Numerical' is over 30.5 (the average number of days per month). What I would like is to keep subtracting 30 from 'Date_Numerical', adding one new row per subtraction, until the value for that 'SCU_KEY' is under 30.5. Here is the code I used, which creates one copy but not more than one, followed by an example dataframe:
import pandas as pd

def func(row):
    # Duplicate the row once when the value exceeds the threshold.
    if row['Date_Numerical'] > 30:
        row2 = row.copy()
        return pd.concat([row, row2], axis=1)
    return row

df_2 = pd.concat([func(row) for _, row in df.iterrows()], ignore_index=True, axis=1).T
Example input:
df = pd.DataFrame({'SCU_KEY': [3, 4, 5, 6],
                   'Date_Numerical': [70, 20, 15, 110]})
Desired Output:
df_2 = pd.DataFrame({'SCU_KEY': [3, 3, 3, 4, 5, 6, 6, 6, 6],
                     'Date_Numerical': [70, 40, 10, 20, 15, 110, 80, 50, 20]})
Answer:
Try with apply and explode:
df["Date_Numerical"] = df["Date_Numerical"].apply(lambda x: list(range(x, 0, -30)))
df = df.explode("Date_Numerical").reset_index(drop=True)
>>> df
   SCU_KEY Date_Numerical
0        3             70
1        3             40
2        3             10
3        4             20
4        5             15
5        6            110
6        6             80
7        6             50
8        6             20
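For reference, here is a self-contained sketch of the same apply and explode approach, run on the example dataframe from the question. It assumes 'Date_Numerical' holds positive integers (range needs integer arguments); the names steps and result are only illustrative. The final cast is there because explode leaves the column with object dtype.

```python
import pandas as pd

df = pd.DataFrame({"SCU_KEY": [3, 4, 5, 6],
                   "Date_Numerical": [70, 20, 15, 110]})

# Each value x becomes the list [x, x - 30, x - 60, ...], stopping before 0.
steps = df["Date_Numerical"].apply(lambda x: list(range(x, 0, -30)))

# explode gives every list element its own row and repeats SCU_KEY alongside it.
result = (df.assign(Date_Numerical=steps)
            .explode("Date_Numerical")
            .reset_index(drop=True))

# explode returns an object-dtype column, so cast back to integers.
result["Date_Numerical"] = result["Date_Numerical"].astype(int)
print(result)
```

Using assign here keeps the original df untouched, which makes it easier to compare input and output while testing.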
If you want a difference of 30.5 instead, you could scale the values by 10 so that range still works with integers, then divide back:
df["Date_Numerical"] = df["Date_Numerical"].apply(lambda x: [i/10 for i in range(x*10, 0, -305)])
df = df.explode("Date_Numerical").reset_index(drop=True)
>>> df
   SCU_KEY Date_Numerical
0        3           70.0
1        3           39.5
2        3            9.0
3        4           20.0
4        5           15.0
5        6          110.0
6        6           79.5
7        6           49.0
8        6           18.5
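As an alternative that avoids building Python lists, the rows can be repeated with Index.repeat and the subtraction done in one vectorized step. This is only a sketch, not part of the answer above: the helper name expand_rows and its parameters are hypothetical, and it assumes the column holds positive numbers. It works for both integer and fractional steps, so the multiply-by-10 trick is not needed.

```python
import numpy as np
import pandas as pd

def expand_rows(df, col="Date_Numerical", step=30):
    """Hypothetical helper: repeat each row, subtracting `step` once per repeat."""
    # Rows needed per original row: ceil(value / step), and at least one.
    n = np.ceil(df[col] / step).clip(lower=1).astype(int).to_numpy()

    # Repeat each row n times; the repeated index still identifies the source row.
    out = df.loc[df.index.repeat(n)]

    # 0, 1, 2, ... within each group of repeated rows.
    k = out.groupby(level=0).cumcount().to_numpy()

    out = out.reset_index(drop=True)
    out[col] = out[col] - step * k
    return out

df = pd.DataFrame({"SCU_KEY": [3, 4, 5, 6],
                   "Date_Numerical": [70, 20, 15, 110]})

by_30 = expand_rows(df, step=30)
by_30_5 = expand_rows(df, step=30.5)
print(by_30)
print(by_30_5)
```

On the example data both calls should reproduce the two outputs shown above, with the step-30 result keeping an integer dtype.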