Home > database >  Returning of the sum of units in the particular year
Returning of the sum of units in the particular year

Time:10-13

I have a simple task however struggling with it in python.

I have a df with "Freq" column (the sum at the beginning) every year some units will be removed from this, could you help me to build a for loop to return the amount for a particular year:

df = pd.DataFrame({'Delivery Year' : [1976,1977,1978,1979], "Freq" : [120,100,80,60],
                                              "1976" : [10,float('nan'),float('nan'),float('nan')], 
                                              "1977" : [5,3,float('nan'),float('nan')], 
                                              "1978" : [10,float('nan'),8,float('nan')], 
                                              "1979" : [13,10,5,14]
                   })
df

My attempt, however not working..

# Remaining in use
for i in df.columns[2:len(df.columns)]:
    df[i] = df[i-1] -  df[i]

Desired output:


df = pd.DataFrame({'Delivery Year' : [1976,1977,1978,1979], "Freq" : [120,100,80,60],
                                              "1976" : [110,100,80,60], 
                                              "1977" : [105,97,80,60], 
                                              "1978" : [95,97,72,60], 
                                              "1979" : [82,87,67,46]
                   })
df

CodePudding user response:

You can calculate the cumulative sum along the columns axis then subtract this sum from the Freq column to get available amounts for each year

s = df.iloc[:, 2:].fillna(0).cumsum(1).rsub(df['Freq'], axis=0)
df.assign(**s)

   Delivery Year  Freq   1976   1977  1978  1979
0           1976   120  110.0  105.0  95.0  82.0
1           1977   100  100.0   97.0  97.0  87.0
2           1978    80   80.0   80.0  72.0  67.0
3           1979    60   60.0   60.0  60.0  46.0

CodePudding user response:

Try:

df[df.columns[2:]] = df[df.columns[1:]].apply(lambda x: pd.Series(x['Freq'] - x[1:].cumsum()).ffill().fillna(x['Freq']), axis=1)

Output:

   Delivery Year  Freq   1976   1977  1978  1979
0           1976   120  110.0  105.0  95.0  82.0
1           1977   100  100.0   97.0  97.0  87.0
2           1978    80   80.0   80.0  72.0  67.0
3           1979    60   60.0   60.0  60.0  46.0

Here is how you would do it in loop, but as @Shubham Sharma suggested there is no need for looping when you can use pandas directly:

cols = df.columns[2:len(df.columns)]
for index, col in enumerate(cols):
    sub_from = df.columns[2 (index-1)]
    print('col: ', col, 'Sub From: ', sub_from)
    df[col] = (df[sub_from] - df[col]).fillna(df[sub_from])
  • Related