I have a simple task however struggling with it in python.
I have a df with "Freq" column (the sum at the beginning) every year some units will be removed from this, could you help me to build a for loop to return the amount for a particular year:
df = pd.DataFrame({'Delivery Year' : [1976,1977,1978,1979], "Freq" : [120,100,80,60],
"1976" : [10,float('nan'),float('nan'),float('nan')],
"1977" : [5,3,float('nan'),float('nan')],
"1978" : [10,float('nan'),8,float('nan')],
"1979" : [13,10,5,14]
})
df
My attempt, however not working..
# Remaining in use
for i in df.columns[2:len(df.columns)]:
df[i] = df[i-1] - df[i]
Desired output:
df = pd.DataFrame({'Delivery Year' : [1976,1977,1978,1979], "Freq" : [120,100,80,60],
"1976" : [110,100,80,60],
"1977" : [105,97,80,60],
"1978" : [95,97,72,60],
"1979" : [82,87,67,46]
})
df
CodePudding user response:
You can calculate the cumulative sum along the columns axis then subtract this sum from the Freq
column to get available amounts for each year
s = df.iloc[:, 2:].fillna(0).cumsum(1).rsub(df['Freq'], axis=0)
df.assign(**s)
Delivery Year Freq 1976 1977 1978 1979
0 1976 120 110.0 105.0 95.0 82.0
1 1977 100 100.0 97.0 97.0 87.0
2 1978 80 80.0 80.0 72.0 67.0
3 1979 60 60.0 60.0 60.0 46.0
CodePudding user response:
Try:
df[df.columns[2:]] = df[df.columns[1:]].apply(lambda x: pd.Series(x['Freq'] - x[1:].cumsum()).ffill().fillna(x['Freq']), axis=1)
Output:
Delivery Year Freq 1976 1977 1978 1979
0 1976 120 110.0 105.0 95.0 82.0
1 1977 100 100.0 97.0 97.0 87.0
2 1978 80 80.0 80.0 72.0 67.0
3 1979 60 60.0 60.0 60.0 46.0
Here is how you would do it in loop, but as @Shubham Sharma suggested there is no need for looping when you can use pandas directly:
cols = df.columns[2:len(df.columns)]
for index, col in enumerate(cols):
sub_from = df.columns[2 (index-1)]
print('col: ', col, 'Sub From: ', sub_from)
df[col] = (df[sub_from] - df[col]).fillna(df[sub_from])