Calculations inside loop with pandas

Time:07-03

I have this data set:

data = {'index': [4, 17, 24, 36, 42],
    'High': [805.000000, 1094.939941, 1243.489990, 1201.949951, 1172.839966],
}

I would like to compute the slope between consecutive rows, like this:

import pandas as pd

test = pd.DataFrame(data)

for i in range(len(test)):
    test.loc[:,'slope'] = (test.loc[i+1,'High'] - test.loc[i,'High']) / (test.loc[i+1,'index'] - test.loc[i,'index'])

print(test)

It seems that I'm going past the last row inside the loop, but how can I code this so that the first row is left blank and the following rows are filled?

If I run the same code without the +1 and use i instead, it runs: it gives 0/0 (NaN), but it runs.

The expected output should be: [image: expected output]

CodePudding user response:

Just subtract 1 from the range, so the for loop does not go out of bounds:

for i in range(len(test)-1):
    test.loc[i+1,'slope'] = round((test.loc[i+1,'High'] - test.loc[i,'High']) / (test.loc[i+1,'index'] - test.loc[i,'index']),2)

A better solution is to use the shift function, since a for loop will be slower on a large dataset:

test['slope'] = round((test['High']-test['High'].shift(1)) / (test['index']-test['index'].shift(1)),2)
test
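
As a rough check that the two approaches above agree, here is a minimal self-contained sketch that runs both on the question's data. The column names slope_loop and slope_shift are made up for this comparison and are not part of the original answer.

```python
import pandas as pd

data = {'index': [4, 17, 24, 36, 42],
        'High': [805.000000, 1094.939941, 1243.489990, 1201.949951, 1172.839966]}
test = pd.DataFrame(data)

# Loop version: stop one row early so i + 1 never runs past the last row.
# Row 0 is never assigned, so it stays NaN.
for i in range(len(test) - 1):
    rise = test.loc[i + 1, 'High'] - test.loc[i, 'High']
    run = test.loc[i + 1, 'index'] - test.loc[i, 'index']
    test.loc[i + 1, 'slope_loop'] = round(rise / run, 2)

# Vectorized version: shift(1) lines each row up with the previous row.
test['slope_shift'] = ((test['High'] - test['High'].shift(1))
                       / (test['index'] - test['index'].shift(1))).round(2)

print(test)
```

Both columns should hold NaN in the first row, followed by 22.30, 21.22, -3.46 and -4.85.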

CodePudding user response:

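Just give it a condition: if i is zero, set the slope to 0 instead of computing it, and otherwise compare each row with the previous one. I also spotted a mistake: you typed test.loc[:,'slope'] instead of test.loc[i,'slope'], which overwrites the whole column on every iteration.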


for i in range(len(test)):
    test.loc[i,'slope'] = 0 if i==0 else (test.loc[i,'High'] - test.loc[i-1,'High'])   / (test.loc[i,'index'] - test.loc[i-1,'index'])
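
The question asks for the first row to be blank rather than 0. A possible variant, offered only as a sketch, is to pre-fill the column with NaN and start the loop at row 1. Using np.nan as the "blank" value is an assumption about what the asker wants to see.

```python
import numpy as np
import pandas as pd

data = {'index': [4, 17, 24, 36, 42],
        'High': [805.000000, 1094.939941, 1243.489990, 1201.949951, 1172.839966]}
test = pd.DataFrame(data)

# Start with every row missing, so row 0 stays blank (NaN).
test['slope'] = np.nan

# Begin at 1: row 0 has no previous row to compare against.
for i in range(1, len(test)):
    rise = test.loc[i, 'High'] - test.loc[i - 1, 'High']
    run = test.loc[i, 'index'] - test.loc[i - 1, 'index']
    test.loc[i, 'slope'] = rise / run

print(test)
```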

CodePudding user response:

Here is a whole-column way to compute this.

We can use diff to build a Series of differences from the previous value:

test['index'].diff()
0     NaN
1    13.0
2     7.0
3    12.0
4     6.0
Name: index, dtype: float64

Using that we can compute the High difference over the index difference per step:

test['High'].diff() / test['index'].diff()
0          NaN
1    22.303072
2    21.221436
3    -3.461670
4    -4.851664
dtype: float64

Where the result should be aligned is, in my opinion, an arbitrary choice: should this sequence start at index 0 or 1? What you expect in the question is that it starts at 1, as in the result here.
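
To make the alignment choice concrete, here is a small sketch that stores the same slopes both ways. The column names slope_at_end and slope_at_start are hypothetical, chosen only for this illustration.

```python
import pandas as pd

data = {'index': [4, 17, 24, 36, 42],
        'High': [805.000000, 1094.939941, 1243.489990, 1201.949951, 1172.839966]}
test = pd.DataFrame(data)

# Slope of each step, computed over the whole column at once.
step = test['High'].diff() / test['index'].diff()

# Aligned with the later row of each pair: row 0 is NaN (what the question expects).
test['slope_at_end'] = step

# Aligned with the earlier row of each pair: the last row is NaN instead.
test['slope_at_start'] = step.shift(-1)

print(test)
```

Either column carries the same four slopes; only the row that holds the NaN differs.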
