is there a method in pandas which can detect if a value is in the range of available columns in the

Time:06-16

I have a df with the following structure.

df.to_dict() prints this:

{'Score': {0: '100-199', 1: '200-300', 2: '400-500', 3: '500-700'}, '50-60%': {0: 0.2, 1: 0.3, 2: 0.4, 3: 0.5}, '60-70%': {0: 5.6, 1: 6.4, 2: 7.2, 3: 8.4}, '70-80%': {0: 6.8, 1: 7.9, 2: 8.6, 3: 9.5}}

I want to write a function that picks the right value from this df. For example, if I pass 124 and 70, I want to get the value 6.8 back. How can I do that? The pseudocode would look like this.

def get_value(score, percent):  # e.g. get_value(124, 70)
    # go through the row and column ranges and select the cell that the
    # arguments fall into
    # return that value
    ...

CodePudding user response:

As suggested by @QuangHoang, you can use an IntervalIndex on both axes.

To convert from your format:

score = df['Score'].str.split('-', expand=True).astype(int)
idx = pd.IntervalIndex.from_arrays(score[0], score[1])

percent = (df.drop(columns='Score').columns.to_series()
             .str.split('[-%]', expand=True)[[0,1]].astype(int)
           )
col = pd.IntervalIndex.from_arrays(percent[0], percent[1])

# drop 'Score' first so the number of columns matches the length of col
df2 = df.drop(columns='Score').set_axis(idx).set_axis(col, axis=1)

output:

            (50, 60]  (60, 70]  (70, 80]
(100, 199]       0.2       5.6       6.8
(200, 300]       0.3       6.4       7.9
(400, 500]       0.4       7.2       8.6
(500, 700]       0.5       8.4       9.5

Then use:

df2.loc[124, 70]

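output: 5.6

Note that the intervals are closed on the right by default, so 70 falls in (60, 70] and the result is 5.6, not the 6.8 the question asks for. Passing closed='left' to both pd.IntervalIndex.from_arrays calls makes 70 fall in [70, 80) instead, which gives 6.8.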
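The steps above can be wrapped into the get_value function the question asks for. This is only a minimal sketch: the signature (df passed in, plus a closed parameter) is an assumption, left-closed bands are assumed so that 70 belongs to 70-80%, and str.extract is swapped in for str.split to parse the column labels.

```python
import pandas as pd

df = pd.DataFrame({
    'Score': ['100-199', '200-300', '400-500', '500-700'],
    '50-60%': [0.2, 0.3, 0.4, 0.5],
    '60-70%': [5.6, 6.4, 7.2, 8.4],
    '70-80%': [6.8, 7.9, 8.6, 9.5],
})

def get_value(df, score, percent, closed='left'):
    """Return the cell whose score band and percent band contain the inputs."""
    # row bands, parsed from strings such as '100-199'
    bounds = df['Score'].str.split('-', expand=True).astype(int)
    rows = pd.IntervalIndex.from_arrays(bounds[0], bounds[1], closed=closed)

    # column bands, parsed from labels such as '50-60%'
    values = df.drop(columns='Score')
    pct = values.columns.to_series().str.extract(r'(\d+)-(\d+)').astype(int)
    cols = pd.IntervalIndex.from_arrays(pct[0], pct[1], closed=closed)

    # get_loc raises KeyError when the value falls in no band
    return values.iloc[rows.get_loc(score), cols.get_loc(percent)]

result = get_value(df, 124, 70)
print(result)  # 6.8
```

With closed='left' the upper edge of each band is excluded, so a score of exactly 199 or 700 raises KeyError. Whether that is acceptable depends on how the bands are meant to be read.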

CodePudding user response:

Here is one way to do it: create additional columns for the bounds of each range, then filter on them.

var = 220  # value to search
df[['from', 'to']] = df['Score'].str.split('-', expand=True).astype(int)
# keep the rows whose score range contains var (both ends inclusive)
match = df[(var >= df['from']) & (var <= df['to'])]
match
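The filter above only selects the row. To return a single cell, the percent columns can be matched the same way. The following is a rough sketch under stated assumptions: the function name get_value_by_mask is hypothetical, score bands are treated as inclusive at both ends as in the filter above, and percent bands are assumed to exclude their upper edge so that 70 maps to 70-80%.

```python
import pandas as pd

df = pd.DataFrame({
    'Score': ['100-199', '200-300', '400-500', '500-700'],
    '50-60%': [0.2, 0.3, 0.4, 0.5],
    '60-70%': [5.6, 6.4, 7.2, 8.4],
    '70-80%': [6.8, 7.9, 8.6, 9.5],
})

def get_value_by_mask(df, score, percent):
    """Return the matching cell, or None if either argument is out of range."""
    # boolean mask over the rows whose score band contains `score`
    bounds = df['Score'].str.split('-', expand=True).astype(int)
    row_mask = (bounds[0] <= score) & (score <= bounds[1])

    for col in df.columns.drop('Score'):
        # parse a label such as '50-60%' into its two bounds
        low, high = (int(part) for part in col.rstrip('%').split('-'))
        if low <= percent < high:
            matches = df.loc[row_mask, col]
            # if several rows match (e.g. score 500), take the first one
            return matches.iloc[0] if not matches.empty else None
    return None

result = get_value_by_mask(df, 124, 70)
print(result)  # 6.8
```

Returning None instead of raising is a design choice made here for illustration; raising a KeyError would be equally reasonable.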