How to refer to self in pandas subsetting

Time:05-09

When I'm exploring data in an ad hoc way, I often have code like this:

X = (adj_all.o.diff(1) / adj_none.o.diff(1)).diff(1)
print(X[X > 0])

Is there an easy way to do this in a single line? The following works but is verbose:

(adj_all.o.diff(1) / adj_none.o.diff(1)).diff(1)[(adj_all.o.diff(1) / adj_none.o.diff(1)).diff(1) > 0]

I want something like this:

(adj_all.o.diff(1) / adj_none.o.diff(1)).diff(1)[self > 0]

Note that this isn't production code. It is part of ad hoc exploration where iteration speed is important to results, which is why I wish to be able to do this common thing in a single line.

CodePudding user response:

You can use pipe:

(adj_all.o.diff(1) / adj_none.o.diff(1)).diff(1).pipe(lambda x: x[x>0])
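To check that pipe gives the same result as the two-step version in the question, here is a minimal sketch. The contents of adj_all and adj_none are made up for illustration (the question does not show them); only the column name o comes from the original.

```python
import pandas as pd

# Hypothetical stand-ins for the question's adj_all and adj_none frames.
adj_all = pd.DataFrame({"o": [1.0, 2.0, 4.0, 7.0, 8.0]})
adj_none = pd.DataFrame({"o": [1.0, 3.0, 4.0, 6.0, 10.0]})

# pipe hands the intermediate Series to the lambda as x,
# so the long expression only has to be written once.
piped = (adj_all.o.diff(1) / adj_none.o.diff(1)).diff(1).pipe(lambda x: x[x > 0])

# Two-step version from the question, for comparison.
X = (adj_all.o.diff(1) / adj_none.o.diff(1)).diff(1)
two_step = X[X > 0]

print(piped)
print(piped.equals(two_step))
```

Since pipe just calls the function with the object it is invoked on, the lambda can do any further processing, not only boolean filtering.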

CodePudding user response:

You can try the walrus operator (:=), introduced in Python 3.8:

# Thanks to richardec for pointing out that the list brackets [] are not necessary here
df, res = [(x := (adj_all.o.diff(1) / adj_none.o.diff(1)).diff(1)), x[x>0]]

# or

res = (x := (adj_all.o.diff(1) / adj_none.o.diff(1)).diff(1))[x>0]
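A small runnable sketch of the second form, again with made-up data for adj_all and adj_none (hypothetical values, not from the question). It relies on Python evaluating the parenthesised assignment before the subscript, so x is already bound when x > 0 is computed. Note that x stays defined afterwards, which may or may not be what you want in an interactive session.

```python
import pandas as pd

# Hypothetical stand-ins for the question's adj_all and adj_none frames.
adj_all = pd.DataFrame({"o": [1.0, 2.0, 4.0, 7.0, 8.0]})
adj_none = pd.DataFrame({"o": [1.0, 3.0, 4.0, 6.0, 10.0]})

# The walrus binds the full Series to x, then the subscript filters it.
# Requires Python 3.8 or later.
res = (x := (adj_all.o.diff(1) / adj_none.o.diff(1)).diff(1))[x > 0]

print(res)     # the filtered rows
print(len(x))  # x is still available and holds the unfiltered Series
```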

CodePudding user response:

One of the many uses of .loc is exactly this. You can pass it a lambda function, which takes the object being indexed (a Series or DataFrame) as its argument and returns a boolean mask to filter by:

(adj_all.o.diff(1) / adj_none.o.diff(1)).diff(1).loc[lambda x:x>0]
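To see what .loc actually passes to the callable, here is a rough sketch that uses a named function instead of a lambda and records the type of its argument. The data and the helper name positive are invented for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for the question's adj_all and adj_none frames.
adj_all = pd.DataFrame({"o": [1.0, 2.0, 4.0, 7.0, 8.0]})
adj_none = pd.DataFrame({"o": [1.0, 3.0, 4.0, 6.0, 10.0]})

seen = []  # type names of whatever .loc passes to the callable


def positive(x):
    # .loc calls this with the object being indexed; here that is a Series.
    seen.append(type(x).__name__)
    return x > 0  # boolean mask with the same index as x


filtered = (adj_all.o.diff(1) / adj_none.o.diff(1)).diff(1).loc[positive]

print(seen)
print(filtered)
```

A named function like this is handy if you reuse the same filter across several expressions during exploration.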