Reversing the data order of a pandas DataFrame column

Time:05-01

Code

import pandas as pd
import numpy as np

df = pd.DataFrame({"K1":[1,2,3],"K2":[4,5,6]})
df["result"] = (df["K1"].shift(1)*df["K2"]).fillna(1)[::-1] #line A
print(df["result"])
print((df["K1"].shift(1)*df["K2"]).fillna(1)[::-1]) #line B
print((df["K1"].shift(1)*df["K2"]).fillna(1)) # line C

Output

0     1.0
1     5.0
2    12.0
Name: result, dtype: float64

2    12.0
1     5.0
0     1.0
dtype: float64

0     1.0
1     5.0
2    12.0
dtype: float64

Why is the result column not reversed in line A, when the same expression is reversed in line B?

CodePudding user response:

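In line A, pandas automatically aligns the index of the Series on the RHS with the index of the DataFrame on the LHS. Each value goes back to the row with the matching label, so the reversal is undone.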
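To see the alignment happen, here is a small sketch that reuses the DataFrame from the question. The names product and reversed_series are only there for illustration; they are not in the original code.

```python
import pandas as pd

df = pd.DataFrame({"K1": [1, 2, 3], "K2": [4, 5, 6]})
product = (df["K1"].shift(1) * df["K2"]).fillna(1)
reversed_series = product[::-1]

# Reversing changes the row order, but every value keeps its label.
print(list(reversed_series.index))  # [2, 1, 0]
print(reversed_series.loc[0])       # 1.0, still the value labelled 0

# Column assignment matches on labels, so the original order comes back.
df["result"] = reversed_series
print(df["result"].tolist())        # [1.0, 5.0, 12.0]
```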

A workaround is to drop the index information from the RHS before assigning it to result. You can do this by converting the Series on the RHS to a list or a NumPy array:

df["result"] = (df["K1"].shift(1) * df["K2"]).fillna(1)[::-1].tolist()

Result:

print(df)

   K1  K2  result
0   1   4    12.0
1   2   5     5.0
2   3   6     1.0

CodePudding user response:

By using [::-1], you reverse the Series together with its index (as your prints show), so the index-value pairs stay the same. Pandas then uses the Series index to align it with the existing index of the df DataFrame. To solve this, assign the raw values to the column rather than a pandas Series, e.g. by using .to_numpy().

df["result"] = (df["K1"].shift(1)*df["K2"]).fillna(1).to_numpy()[::-1] #line A
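As a quick check, the sketch below assigns the column with both .tolist() and .to_numpy() and compares the results. The column names via_list, via_numpy and via_reset are made up for this comparison. The reset_index(drop=True) variant is an extra option that neither answer mentions, and it only works here because df has the default 0..n-1 index.

```python
import pandas as pd

df = pd.DataFrame({"K1": [1, 2, 3], "K2": [4, 5, 6]})
product = (df["K1"].shift(1) * df["K2"]).fillna(1)

# Plain values carry no index, so they are assigned by position.
df["via_list"] = product[::-1].tolist()
df["via_numpy"] = product.to_numpy()[::-1]

# Assumption: df has the default RangeIndex. Dropping the reversed
# labels gives the Series fresh labels 0..n-1, which then line up
# with df in the reversed order.
df["via_reset"] = product[::-1].reset_index(drop=True)

print(df["via_list"].tolist())   # [12.0, 5.0, 1.0]
print(df["via_numpy"].tolist())  # [12.0, 5.0, 1.0]
print(df["via_reset"].tolist())  # [12.0, 5.0, 1.0]
```

If df had a custom index (for example strings or dates), the list or NumPy route would still assign by position, while the reset_index route would no longer line up.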