I have a dataframe df1:
id1_a id2_a id3_a id1_b id2_b id3_b
a1 a2 a3 b1 b2 b3
aa1 a2 a3 b12 b22 b23
ac1 c2 a3 bc2 bc2 bc3
I want to use it as a dictionary to replace values in columns id1_a, id2_a, id3_a in this dataframe df2:
id1_a id2_a id3_a
a1 a2 a3
a1 a2 a3
aa1 a2 a3
ac1 c2 a3
So the desired result is:
id1_a id2_a id3_a
b1 b2 b3
b1 b2 b3
b12 b22 b23
bc2 bc2 bc3
How could I do that? If there were only one column, I would turn it into a dictionary and replace the values, but what should I do in the three-column case?
CodePudding user response:
If you need to replace values in all columns using a single dictionary built from the _a and _b columns, use:
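# Split 'id1_a' into ('id1', 'a') so the columns become a two-level MultiIndex
df1.columns = df1.columns.str.split('_', expand=True)
# Move the id level into the rows, leaving one column 'a' and one column 'b'
df1 = df1.stack(0)
# Build a single a -> b mapping (for a repeated key, the last pair wins)
df2 = df2.replace(dict(zip(df1['a'], df1['b'])))
print(df2)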
  id1_a id2_a id3_a
0    b1   b22   bc3
1    b1   b22   bc3
2   b12   b22   bc3
3   bc2   bc2   bc3
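To see why this output differs from the desired result in the question, it helps to look at the dictionary itself. The sketch below rebuilds the same flat mapping from the question's data. It is not the answer's code: instead of `stack`, it pairs the `_a` and `_b` values row by row with `to_numpy().ravel()`, an equivalent formulation chosen here only for illustration. Because `a2` and `a3` occur in several rows with different `_b` values, only the last pair for each survives in the dictionary.

```python
import pandas as pd

# Data from the question
df1 = pd.DataFrame({
    'id1_a': ['a1', 'aa1', 'ac1'],
    'id2_a': ['a2', 'a2', 'c2'],
    'id3_a': ['a3', 'a3', 'a3'],
    'id1_b': ['b1', 'b12', 'bc2'],
    'id2_b': ['b2', 'b22', 'bc2'],
    'id3_b': ['b3', 'b23', 'bc3'],
})
df2 = pd.DataFrame({
    'id1_a': ['a1', 'a1', 'aa1', 'ac1'],
    'id2_a': ['a2', 'a2', 'a2', 'c2'],
    'id3_a': ['a3', 'a3', 'a3', 'a3'],
})

# Flatten the _a and _b blocks row by row so each old value
# lines up with its replacement
old = df1[['id1_a', 'id2_a', 'id3_a']].to_numpy().ravel()
new = df1[['id1_b', 'id2_b', 'id3_b']].to_numpy().ravel()

# dict() keeps the last value seen for a repeated key,
# so 'a2' ends up as 'b22' and 'a3' as 'bc3'
mapping = dict(zip(old, new))
print(mapping)

# A flat dictionary is applied to every column of df2
result = df2.replace(mapping)
print(result)
```

So a flat dictionary is only safe when every value in the _a columns maps to exactly one replacement.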
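If you need to replace values column by column, so that each _a column uses only the mapping from its own _b column, first create a Series that holds one dictionary per column: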
print(df1)
  id1_a id2_a id3_a id1_b id2_b id3_b
0    a1    a2    a3    b1    b2    b3
1   aa1    a1    a3   b12   b22   b23
2   ac1    c2    a3   bc2   bc2   bc3
print(df2)
  id1_a id2_a id3_a
0    b1   b22   bc3
1    b1    b2   bc3
2   b12    b2   bc3
3   bc2   bc2   bc3
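# Split 'id1_a' into ('id1', 'a') so the columns become a two-level MultiIndex
df1.columns = df1.columns.str.split('_', expand=True)
# Group by the id level, build one a -> b dictionary per id,
# then restore the '_a' suffix so the labels match df2's columns
s = (df1.stack(0)
        .groupby(level=1)
        .apply(lambda x: dict(zip(x['a'], x['b'])))
        .add_suffix('_a'))
print(s)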
id1_a {'a1': 'b1', 'aa1': 'b12', 'ac1': 'bc2'}
id2_a {'a2': 'b2', 'a1': 'b22', 'c2': 'bc2'}
id3_a {'a3': 'bc3'}
dtype: object
# Each column of df2 is replaced using only its own dictionary from s
df2 = df2.replace(s)
print(df2)
  id1_a id2_a id3_a
0    b1   b22   bc3
1    b1    b2   bc3
2   b12    b2   bc3
3   bc2   bc2   bc3
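The same per-column replacement can also be written with a plain nested dictionary of the form {column: {old: new}}, which `DataFrame.replace` accepts. The sketch below is a minimal, self-contained variant that uses a dict comprehension instead of `stack` and `groupby`, together with the modified df1 printed above. The input df2 here is an assumption: the answer does not show df2 before replacement, so it was reconstructed so that the result matches the printed output.

```python
import pandas as pd

# Modified df1 from the answer (row 1 has 'a1' in id2_a)
df1 = pd.DataFrame({
    'id1_a': ['a1', 'aa1', 'ac1'],
    'id2_a': ['a2', 'a1', 'c2'],
    'id3_a': ['a3', 'a3', 'a3'],
    'id1_b': ['b1', 'b12', 'bc2'],
    'id2_b': ['b2', 'b22', 'bc2'],
    'id3_b': ['b3', 'b23', 'bc3'],
})

# Assumed input (not shown in the answer); chosen so the result
# matches the output printed above
df2 = pd.DataFrame({
    'id1_a': ['a1', 'a1', 'aa1', 'ac1'],
    'id2_a': ['a1', 'a2', 'a2', 'c2'],
    'id3_a': ['a3', 'a3', 'a3', 'a3'],
})

# For each '_a' column, pair its values with the matching '_b' column
nested = {
    col: dict(zip(df1[col], df1[col[:-len('_a')] + '_b']))
    for col in df2.columns
}
print(nested)

# A nested dict restricts each inner mapping to its own column
result = df2.replace(nested)
print(result)
```

Here a1 becomes b1 in id1_a but b22 in id2_a, which a single flat dictionary cannot express.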