I have an excel file that can calculate the last column with a formula. I am trying to replicate this into python code. How can I perform an operation to achieve such results?
Essentially, the last column is supposed to take, for example, row 2 and subtract itself by row 0 (because they share the same industry besides the month, which I am trying to compare the two by).
I tried using a for loop and take the index of row i
then subtract row i
's 'No. of jobs' by the row i-2
's 'No. of jobs' and adding that result to the last column. Then, I would increment i
by 1 to let the operation continue for the next row, but I was unsuccessful.
Period Industry No. of jobs Difference from prev. month
0 January Farm 70200 N/A
1 January Mining 4900 N/A
2 February Farm 70100 -100
3 February Mining 4850 -50
4 March Farm 70200 100
5 March Mining 4600 -250
6 April Farm 70300 100
7 April Mining 5200 600
8 May Farm 70300 0
9 May Mining 5300 100
CodePudding user response:
Try as follows:
df['Difference from prev. month'] = df['No. of jobs'].groupby(df['Industry']).diff()
df['Difference from prev. month']
0 NaN
1 NaN
2 -100.0
3 -50.0
4 100.0
5 -250.0
6 100.0
7 600.0
8 0.0
9 100.0
Name: Difference from prev. month, dtype: float64