How to reverse engineer a binned column in pandas

Time:02-16

I have the following pandas DataFrame, in which col1 holds bin labels stored as strings (the labels come from an earlier pandas binning step):

import pandas as pd

# Each row is the string form of one bin, "(lower, upper)"
d = {'col1': ['(-99999.0, -99998.0)', '(-99998.0, 1.0)', '(1.0, 10.0)']}
df = pd.DataFrame(data=d)
print(df)

The dataset looks like this:

                   col1
0  (-99999.0, -99998.0)
1       (-99998.0, 1.0)
2           (1.0, 10.0)

I need to reverse engineer the binned column to get a list called myBinner that holds the upper edge of each bin, like this:

myBinner = [-99998,1,10]

How can I do it?

CodePudding user response:

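You can parse the strings with pd.eval, flatten the result with np.ravel, and keep the sorted distinct edges with np.unique (the [1:] slice drops the lowest edge):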

import numpy as np

myBinner = np.unique(np.ravel(pd.eval(df['col1'])))[1:].tolist()
print(myBinner)

# Output
[-99998.0, 1.0, 10.0]

CodePudding user response:

Or even shorter, take the upper bound of each interval:

myBinner = [b[1] for b in pd.eval(df['col1'])]

CodePudding user response:

You could also map ast.literal_eval over the column and use a list comprehension to keep only the upper bound of each interval:

import ast
myBinner = [x for _, x in map(ast.literal_eval, df['col1'])]

Output:

[-99998.0, 1.0, 10.0]
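
For completeness, here is a self-contained sketch of the ast.literal_eval approach that can be run as-is. It rebuilds the DataFrame from the question, and it adds two things that are not in the answers above and are only assumptions about what you may want: a check that each bin starts where the previous one ends (the names intervals and contiguous are made up for this example), and a conversion to int so the result matches the list in the question.

```python
import ast

import pandas as pd

# Same data as in the question
df = pd.DataFrame({'col1': ['(-99999.0, -99998.0)', '(-99998.0, 1.0)', '(1.0, 10.0)']})

# Parse each "(lower, upper)" string into a tuple of floats
intervals = [ast.literal_eval(s) for s in df['col1']]

# Assumed sanity check: every bin starts where the previous one ends
contiguous = all(prev[1] == cur[0] for prev, cur in zip(intervals, intervals[1:]))

# Upper bound of each interval, in row order
myBinner = [upper for _, upper in intervals]

# Optional: whole-number edges as int, as written in the question
myBinner_int = [int(x) for x in myBinner]

print(contiguous)
print(myBinner)
print(myBinner_int)
```

If contiguous comes out False, the upper bounds alone do not describe the original bins, and you would also need to keep the lower bounds.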