Returning a value from dataframe using columns as a range-CodePudding

Factor Dataframe:

Code	Min	Max	Factor
A	0	100	1
A	101	300	2
A	301	800	3
A	800	NaN	4
B	0	100	2
B	101	300	4
B	301	800	6
B	800	NaN	8

The table above is used to return a factor based on two values, the 'Code' and a specific number. A specific value is provided, such as 5, 302, 253, 8000 and if that value falls between the min and max, the factor is returned.

For example:

A, 355 = 3
B, 9000 = 8

I have a separate table with several thousand lines with values which need to have the factor assigned.

Dataframe:

Item	Code	Value
001	A	1
002	A	322
003	B	21
004	A	342
005	A	32
006	B	7666
007	B	10000
008	B	86

How do you use the pandas tool set to return the desired factor and append the relevant factor as a new column?

CodePudding user response：

Use left join in DataFrame.merge with replace missing value to np.inf and then filter in Series.between and possible missing values in Min, if no match in boolean indexing:

print (df1)
  Code   Val
0    A   355
1    B  9000
2    C     5

print (df2)
  Code  Min    Max  Factor
0    A    0  100.0       1
1    A  101  300.0       2
2    A  301  800.0       3
3    A  800    NaN       4
4    B    0  100.0       2
5    B  101  300.0       4
6    B  301  800.0       6
7    B  800    NaN       8

df = df1.merge(df2.fillna({'Max':np.inf}), how='left', on='Code')

df3 = (df[df['Val'].between(df['Min'], df['Max']) | df['Min'].isna()]
             .drop(['Min','Max'], axis=1))
print (df3)
  Code   Val  Factor
2    A   355     3.0
7    B  9000     8.0
8    C     5     NaN

Code	Min	Max	Factor
A	0	100	1
A	101	300	2
A	301	800	3
A	800	NaN	4
B	0	100	2
B	101	300	4
B	301	800	6
B	800	NaN	8

Code	Min	Max	Factor
A	0	100	1
A	101	300	2
A	301	800	3
A	800	NaN	4
B	0	100	2
B	101	300	4
B	301	800	6
B	800	NaN	8

Code	Min	Max	Factor
A	0	100	1
A	101	300	2
A	301	800	3
A	800	NaN	4
B	0	100	2
B	101	300	4
B	301	800	6
B	800	NaN	8