I have two dataframes, df1 and df2. df1 is a table of labels set between ranges:
Top Bottom Label ID
0 0.0 2.2 A Z-1
1 2.2 6.6 B Z-1
2 6.6 9.1 C Z-1
3 0.0 1.2 A Z-2
4 1.2 4.5 B Z-2
5 6.6 9.1 C Z-2
Where the ID column relates to a higher order grouping. Both df1 and df2 contain shared ID's and I want to apply the labels from df1 to df2.
df2 is a table of regularly sampled data:
Samp Var ID
0 0.0 157 Z-1
1 0.5 226 Z-1
2 1.5 843 Z-1
3 2.0 999 Z-1
4 2.5 142 Z-1
5 3.0 167 Z-1
6 0.0 157 Z-2
7 0.5 226 Z-2
8 1.5 111 Z-2
9 2.0 666 Z-3
The desired output would look like this:
Samp Var ID Label
0 0.0 157 Z-1 A
1 0.5 226 Z-1 A
2 1.5 843 Z-1 A
3 2.0 999 Z-1 A
4 2.5 142 Z-1 B
5 3.0 167 Z-1 B
6 0.0 157 Z-2 A
7 0.5 226 Z-2 A
8 1.5 111 Z-2 B
9 2.0 666 Z-3 B
The issue I am having when looking for other answers is that I first need to match the ID's and then apply the labels within the range between the top and bottom columns. Any help would be appreciated!
CodePudding user response:
here is one way to do it Z-3 doesn't exists in your reference table and hence its null.
Idea is to merge on ID and then query the ones where Samp falls within the range, and then its just a cleanup of result set
df2.merge(df, on='ID', how='left', suffixes=("","_y")
).query('(Samp>=Top & Samp<=Bottom) | Label.isna()'
).reset_index().drop(
columns=['index','Top','Bottom']
)
Samp Var ID Label
0 0.0 157 Z-1 A
1 0.5 226 Z-1 A
2 1.5 843 Z-1 A
3 2.0 999 Z-1 A
4 2.5 142 Z-1 B
5 3.0 167 Z-1 B
6 0.0 157 Z-2 A
7 0.5 226 Z-2 A
8 1.5 111 Z-2 B
9 2.0 666 Z-3 NaN