I have two dataframe such as :
Tab1
COL1 COL2 COL3
Sp1 False A
Sp2 False B
Sp3 True X
Sp4 True X
Sp5 False A
Sp6 False C
Tab2
COL1 COL2_bis COL3_bis
Sp1 True X
Sp4 True F
Sp6 True X
And I would like to correct the Tab1 COL2
and COL3
columns compared to Tab2 COL2_bis
and COL3_bis
columns for corresponding COL1
values between the two tabs.
I should then get:
Tab1
COL1 COL2 COL3
Sp1 True X
Sp2 False B
Sp3 True X
Sp4 True F
Sp5 False A
Sp6 True X
Does somene have an idea please ?
Here are the table in dict format if it cna helps:
tab1:
{'COL1': {0: 'Sp1', 1: 'Sp2', 2: 'Sp3', 3: 'Sp4', 4: 'Sp5', 5: 'Sp6'}, 'COL2': {0: False, 1: False, 2: True, 3: True, 4: False, 5: False}, 'COL3 ': {0: 'A', 1: 'B ', 2: 'X ', 3: 'X ', 4: 'A ', 5: 'C '}}
tab2:
{'COL1': {0: 'Sp1', 1: 'Sp4', 2: 'Sp6'}, 'COL2_bis': {0: True, 1: True, 2: True}, 'COL3_bis ': {0: 'X', 1: 'F', 2: 'X'}}
CodePudding user response:
Use DataFrame.update
by COL1
with set to indices and remove _bis
substrings from columns names:
df1 = df1.set_index('COL1')
df2 = df2.set_index('COL1').rename(columns=lambda x: x.replace('_bis', ''))
df1.update(df2)
print (df1)
COL2 COL3
COL1
Sp1 True X
Sp2 False B
Sp3 True X
Sp4 True F
Sp5 False A
Sp6 True X
df = df1.reset_index()
CodePudding user response:
Rename columns of df2
, concatenate the two dataframes and drop duplicates based on COL1
:
df2.columns = df2.columns.str.replace('_bis', '')
out = pd.concat([df1, df2]).drop_duplicates('COL1', keep='last') \
.sort_values('COL1').reset_index(drop=True)
print(out)
# Output:
COL1 COL2 COL3
0 Sp1 True X
1 Sp2 False B
2 Sp3 True X
3 Sp4 True F
4 Sp5 False A
5 Sp6 True X
CodePudding user response:
It sounds like you would like to compute a "right outer join" on the two tables/dataframes. Is that correct?
For example:
corrected_tab_1 = pd.merge(Tab1, Tab2, right_on='COL2_bis')
Official documentation for pd.merge()
:
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.merge.html
Examples of pd.merge()
using different join strategies:
https://www.analyticsvidhya.com/blog/2020/02/joins-in-pandas-master-the-different-types-of-joins-in-python/#:~:text=Right Join in Pandas,-Similarly, if we&text=Right join, also known as,that are common in both.