MAPPER DATAFRAME
col_data = {'p0_tsize_qbin_':[1, 2, 3, 4, 5] ,
'p0_tsize_min':[0.0, 7.0499999999999545, 16.149999999999977, 32.65000000000009, 76.79999999999973] ,
'p0_tsize_max':[7.0, 16.100000000000023, 32.64999999999998, 76.75, 6759.850000000006]}
map_df = pd.DataFrame(col_data, columns = ['p0_tsize_qbin_', 'p0_tsize_min','p0_tsize_max'])
map_df
in Above data frame is map_df
where column 2 and column 3 is the range and column1 is mapper value to the new data frame .
MAIN DATAFRAME
raw_data = {
'id': ['1', '2', '2', '3', '3','1', '2', '2', '3', '3','1', '2', '2', '3', '3'],
'val' : [3, 56, 78, 11, 5000,37, 756, 78, 49, 21,9, 4, 14, 75, 31,]}
df = pd.DataFrame(raw_data, columns = ['id', 'val','p0_tsize_qbin_mapped'])
df
EXPECTED OUTPUT MARKED IN BLUE
look for val
of df dataframe in map_df min(column1) and max(columns2) where ever it lies get the p0_tsize_qbin_ value.
For Example : from df data frame val = 3 , lies in the range of p0_tsize_min
p0_tsize_max
where p0_tsize_qbin_
==1 . so 1 will return
CodePudding user response:
Try using pd.cut()
bins = map_df['p0_tsize_min'].tolist() [map_df['p0_tsize_max'].max()]
labels = map_df['p0_tsize_qbin_'].tolist()
df.assign(p0_tsize_qbin_mapped = pd.cut(df['val'],bins = bins,labels = labels))
Output:
id val p0_tsize_qbin_mapped
0 1 3 1
1 2 56 4
2 2 78 5
3 3 11 2
4 3 5000 5
5 1 37 4
6 2 756 5
7 2 78 5
8 3 49 4
9 3 21 3
10 1 9 2
11 2 4 1
12 2 14 2
13 3 75 4
14 3 31 3