I have two data frames as given below. The master data:
df1 = pd.DataFrame(
data=
[[['mayor', 'lord'], 25],
[['prince', 'sheriff'], 33],
[['king george', 'queen'], 32],
[['bristol, counsellor'], 43],
[['exchequer'], 45]],
columns=['V1', 'V2']
)
df1.update(df1[['V1']].applymap('{}'.format))
print(df1)
and the other one is:
df2 = pd.DataFrame(
data=
[['mayor', 55],
['sheriff', 54],
['counsellor', 53],
['exchequer', 50]],
columns=['Var1', 'Var2']
)
print(df2)
I want to map Var2 from df2 in the master data if V1 contains a corresponding Var1 value. That is my end goal dataframe is:
dfgoal = pd.DataFrame(
data=
[[['mayor', 'lord'], 25, 55],
[['prince', 'sheriff'], 33, 54],
[['king george', 'queen'], 32, ],
[['bristol, counsellor'], 43, 53],
[['exchequer'], 45, 50]],
columns=['V1', 'V2', 'V3']
)
dfgoal.update(dfgoal[['V1']].applymap('{}'.format))
print(dfgoal)
I have tried quite a few things from Stack Overflow (like this) but as the V1 is a list object and the match can be in any element in the list (like the counsellor obs), I am unsuccessful so far.
CodePudding user response:
I think this solves it:
df1 = pd.DataFrame(
data=
[[['mayor', 'lord'], 25],
[['prince', 'sheriff'], 33],
[['king george', 'queen'], 32],
[['bristol', 'counsellor'], 43],
[['exchequer'], 45]],
columns=['V1', 'V2']
)
# df1.update(df1[['V1']].applymap('{}'.format))
print(df1)
df2 = pd.DataFrame(
data=
[['mayor', 55],
['sheriff', 54],
['counsellor', 53],
['exchequer', 50]],
columns=['Var1', 'Var2']
)
print(df2)
mapper = df2.set_index('Var1')['Var2'].to_dict()
def get_V3(l):
for x in l:
if x in mapper:
return mapper[x]
return None
df1['V3'] = df1['V1'].apply(get_V3)
I'm assuming if two or more elements appear in df2, then we use the first one, hence the return inside the loop.