Home > Enterprise >  Mapping value based on two dataframe (error: an only compare identically-labeled Series objects)
Mapping value based on two dataframe (error: an only compare identically-labeled Series objects)

Time:01-06

Table t1
    id  mins    maxs
0   43852   11  11
1   63087   14  15
2   63106   14  15
3   63155   14  15

Table t2
    idx cons
0   1   1.00
1   2   0.95
2   3   0.90
3   4   0.85
4   5   0.80
5   6   0.70
6   7   0.70
7   8   0.65
8   9   0.60
9   10  0.55
10  11  0.50
11  12  0.45
12  13  0.40
13  14  0.35
14  15  0.35
15  16  0.30

I want to sum cons in range min and max for each id (min and max for idx)

but the below error occurs:

error: an only compare identically-labeled Series objects

when I run the below code:

t2.loc[(t2['idx']>= t1['mins']) & (t2['idx']<=t1['maxs']), 'cons'].sum()

I was expecting:

    id  mins    maxs  result
0   43852   11  11    0.50
1   63087   14  15    0.70
2   63106   14  15    0.70
3   63155   14  15    0.70

CodePudding user response:

I would use a groupby.transform to perform the search in t2 for each unique pair of mins/maxs:

t1['result'] = (t1
 .groupby(['mins', 'maxs'])['id'] # column used here doesn't matter
 .transform(lambda g: t2.loc[t2['idx'].between(g.name[0], g.name[1]),
                             'cons'].sum())
)

Output:

      id  mins  maxs  result
0  43852    11    11     0.5
1  63087    14    15     0.7
2  63106    14    15     0.7
3  63155    14    15     0.7
  • Related