How to subtract value in one row of pandas Data Frame from value in another row without for loop?

Time:01-24

I have a pandas DataFrame with two columns: X, which is an angle (181 evenly spaced values between 0 and 360), and R1, which is an electrical resistance measured at the given angle. I need to add a column "R2" holding the result of the following operation:

R2(X) = R1(X) - R1(180 - X).
Negative values of (180 - X) should map to (540 - X), so R1(180 - 182) == R1(358).
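
The wrap-around rule above is equivalent to taking (180 - X) modulo 360. A minimal sketch of that arithmetic in plain Python; the helper name `partner_angle` is hypothetical and not part of the original post:

```python
# Hypothetical helper: the angle whose R1 value is subtracted from R1(X)
def partner_angle(x):
    # Python's % returns a non-negative result for a positive modulus,
    # so a negative difference wraps around: 180 - 182 = -2 -> 358
    return (180 - x) % 360

print(partner_angle(0))    # 180
print(partner_angle(182))  # 358, the same as 540 - 182
print(partner_angle(360))  # 180
```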

In practice, it means the "R2" column should be in the form (using row numbers):

R1[0] - R1[91]
R1[1] - R1[90]
R1[2] - R1[89]
...
R1[91] - R1[181]
R1[92] - R1[180]
R1[93] - R1[179]
...
R1[181] - R1[91]

My first guess was to loop over the row numbers, with a condition on which rows to take into account. After reading this question and also that one, I realized this is not the preferred solution. df.apply() and df.shift() also don't seem to be an option. Since none of the suggestions in the posts above seem straightforward in my case, I'd like to ask: is it possible to avoid a "for" loop, and what would be the best solution for this particular problem? Is it possible to select values from the "R1" column based on the corresponding value in the "X" column?

CodePudding user response:

import pandas as pd

# create example dataframe
df = pd.DataFrame({'A': [1, 2, 3, 4, 5],'B' : [2, 5, 8, 9, 7]})

# subtract the next row's value from the current row's value;
# shift(-1) moves values up by one row, so the last row becomes NaN
df['A_minus_next'] = df['A'] - df['A'].shift(-1)

print(df)

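You can use the .shift() method, which moves a column's values by a given number of rows so that each row can be compared with a neighboring one.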

CodePudding user response:

You can split the column at its middle point; for each half, subtract the reversed half from itself, and concatenate the two results at the end:

import numpy as np

# np.ceil returns a float, but iloc slicing needs an integer position
middle = int(np.ceil(len(df) / 2))

first_half  = df["R1"].iloc[:middle].to_numpy()
second_half = df["R1"].iloc[middle:].to_numpy()

df["R2"] = np.concatenate([first_half - first_half[::-1],
                           second_half - second_half[::-1]])

NumPy is used here because pandas arithmetic aligns operands on their index: subtracting the reversed Series from the original would pair each value with itself and give 0 everywhere (you can try df["R1"].iloc[:middle] - df["R1"].iloc[:middle][::-1] and see all zeros). Converting to NumPy arrays makes the subtraction purely positional.
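
The question also asks whether values of "R1" can be selected by the corresponding "X" value. A minimal sketch of such a value-based lookup follows, assuming X holds unique integer angles from 0 to 360 in 2-degree steps. The R1 values are made up (a cosine) purely for illustration, and the wrap-around rule is written as a modulo:

```python
import numpy as np
import pandas as pd

# Hypothetical data: 181 angles from 0 to 360 degrees, with made-up R1 values
x = np.arange(0, 361, 2)
df = pd.DataFrame({"X": x, "R1": np.cos(np.radians(x))})

# Partner angle (180 - X), wrapped into [0, 360); e.g. X = 182 maps to 358
partner = (180 - df["X"]) % 360

# Index R1 by angle so it can be looked up by value instead of row number
r1_by_angle = df.set_index("X")["R1"]

# .to_numpy() on both sides keeps the subtraction positional, not index-aligned
df["R2"] = df["R1"].to_numpy() - r1_by_angle.loc[partner.to_numpy()].to_numpy()
```

With this cosine test data, R2 should equal 2*cos(X) in every row, which makes the result easy to check. The lookup relies on every partner angle being present in X; with a different angular grid, that assumption would need to be verified first.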
