Table I
c1 | c2 | c3 |
---|---|---|
Ant | Bee | Apple |
A | B | C |
4 | 5 | 6 |
Table II
A1 | A2 | c3 |
---|---|---|
G | H | Apple |
H | s | C |
8 | asd | 4 |
I would like to know if we can compare these 2 excel workbook (comparison based on the column c3) and extract only the similar values along with the entire row value of that matched value using Pandas
Expected Output:
c1 | c2 | c3 | A1 | A2 |
---|---|---|---|---|
Ant | Bee | Apple | G | H |
A | B | C | H | s |
CodePudding user response:
Here's my solution. Basically, you should:
merge dfs with outer join on "c3"
drop all rows that have any NaN values
import pandas as pd my_df = pd.DataFrame({'c1': ['Ant', 'A', 4], 'c2': ['Bee', 'B', 5], 'c3': ['Apple', 'C', 6] }) my_df2 = pd.DataFrame({'A1': ['G', 'H', 8], 'A2': ['H', 's', 'asd'], 'c3': ['Apple', 'C', 4] }) final = my_df.merge(my_df2, on='c3', how='outer').dropna()
Result:
c1 | c2 | c3 | A1 | A2
-------------------------------
Ant | Bee | Apple | G | H
A | B | C | H | s
CodePudding user response:
print (pd.merge(df1, df2, on='c3'))