I have this scenario in mind but I don't know how to go about it.
When a value in column changeme
is equal to the one from the lookforme
column on the same row in df1
, I would like to lookup that value on the matchme
column of df2
, grab the corresponding grabme
value and overwrite the one in changeme
on df1
.
In case of multiple occurrences of lookforme
found on df2
, match just the first one.
I tried a few methods but couldn't get anywhere, had complaints about the DataFrames being uneven when I tried factorize, etc.
changeme name qty lookforme
0 1296477 ItemName2 3.16 1296477
1 62328 ItemName1 7.94 62328
2 233201 ItemName3 5.56 233201
3 66074000105 ItemName4 6.88 233694
matchme grabme
0 62328 35422028768
1 162592 62673001137
2 163200 15422029895
3 164123 45492029154
4 166596 35422031737
5 232264 75372043938
6 232267 95462043655
7 233201 35422044775
8 233201 24613628530
9 233226 35412044812
10 233694 66074000105
11 235466 35422042863
12 272021 85422052947
13 272691 35422053340
14 272872 27536196938
15 273729 39456023149
16 284848 27536197195
17 284861 62672008054
18 1296477 35421853202
19 1296477 75841596436
20 2296518 97630194508
Expected result:
changeme name qty lookforme
0 35421853202 ItemName2 3.16 1296477
1 35422028768 ItemName1 7.94 62328
2 35422044775 ItemName3 5.56 233201
3 66074000105 ItemName4 6.88 233694
CodePudding user response:
Let's try the following:
Filter "lookforme" values that match "changeif=look4m" in df1
and find those values among "matchme-ascend" values in df2
using set_index
loc
. Since some of the "lookforme" values occur multiple times in df2
, drop the duplicates and keep the first.
Then update "changeif=look4m" values in df1
using mask
:
df1 = df1.set_index('changeif=look4m')
cond = df1['lookforme'] == df1.index
s = df2.set_index('matchme-ascend')['grabme'].loc[df1.index[cond]]
df1.index = df1.index.to_series().mask(cond, s[~s.index.duplicated()])
df1 = df1.reset_index()
Output:
changeif=look4m name qty lookforme
0 35421853202 ItemName2 3.16 1296477
1 35422028768 ItemName1 7.94 62328
2 35422044775 ItemName3 5.56 233201
3 66074000105 ItemName4 6.88 233694
CodePudding user response:
you can also try this:
res = pd.merge(df1, df2,
left_on='lookforme',
right_on='matchme-ascend').drop_duplicates('lookforme').reset_index(drop=True)
# you have duplicates in df2 so we should-^^^^^^^^^^^^^^^
df1['changeif=look4m'] = res.apply(lambda x: x['grabme']
if x['changeif=look4m']==x['lookforme']
else x['changeif=look4m'], axis=1)
print(df1)
'''
changeif=look4m name qty lookforme
0 35421853202 ItemName2 3.16 1296477
1 35422028768 ItemName1 7.94 62328
2 35422044775 ItemName3 5.56 233201
3 66074000105 ItemName4 6.88 233694