Return only the differences from 2 dataframe tables

Time:07-10

I have two identically formatted DataFrames (DF1 and DF2). DF1 is a search covering the last 30 days, and DF2 covers today only. I want to compare the two and produce a new DataFrame (DF3) containing only the data from DF2 that isn't in DF1. Everything I have tried so far either merges or concatenates the tables, and I'm left with a table of ALL unique values. Any thoughts?

CodePudding user response:

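You do not provide details of your data or say what should happen to the duplicated values. In principle, though, boolean indexing could be used as below; it replaces the values that are identical in both frames with NaN, and it requires both frames to have the same index and column labels.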

df3 = df2[df2 != df1]
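
A minimal sketch of what that mask does, assuming both frames share the same index and column labels. The sample values are invented for illustration and are not the asker's data. Cells that are equal in both frames become NaN, and `dropna(how='all')` then removes the rows that matched completely.

```python
import pandas as pd

# Hypothetical sample data; both frames must share index and column labels,
# otherwise the != comparison raises a ValueError.
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df2 = pd.DataFrame({'A': [1, 8, 9], 'B': [4, 10, 12]})

# Keep the cells of df2 that differ from df1; equal cells become NaN.
masked = df2[df2 != df1]

# Drop the rows in which every cell matched df1.
df3 = masked.dropna(how='all')
print(df3)
```

Note that this compares cell by cell at the same position, so it only fits the case where the two frames have the same shape and row order.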

CodePudding user response:

Not much information is provided, but from what I can tell, I have attempted something like this:

import pandas as pd
data1 = {'A':[1,2,3], 'B':[4,5,6] }
df1 = pd.DataFrame(data1)
df1

Output:

    A   B
0   1   4
1   2   5
2   3   6

Create the 2nd dataframe here:

data2 = {'A':[1,8,9], 'B':[4,10,12] }
df2 = pd.DataFrame(data2)
df2

Output:

    A   B
0   1   4
1   8   10
2   9   12

Create the 3rd dataframe here by comparing df2 against df1. Values that also appear in df1 at the same index and column are kept, and values that differ from df1 are shown as NaN instead of a number:

df3 = df2[df2.isin(df1)]
df3

Output:

    A   B
0   1.0 4.0
1   NaN NaN
2   NaN NaN

Perhaps with more detail on the DataFrames it would be easier to understand what the request is.
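
If the goal is whole rows of the second frame that do not occur anywhere in the first, regardless of position, one possible approach is an anti-join built from `merge` with `indicator=True`. This is only a sketch under the assumption that a row counts as "the same" when all shared columns are equal; the sample frames are invented for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for the 30-day and today-only frames.
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df2 = pd.DataFrame({'A': [1, 8, 9], 'B': [4, 10, 12]})

# Left-join df2 against the distinct rows of df1 on all shared columns;
# the added _merge column records where each row was found.
merged = df2.merge(df1.drop_duplicates(), how='left', indicator=True)

# Rows marked 'left_only' exist in df2 but not in df1.
df3 = merged[merged['_merge'] == 'left_only'].drop(columns='_merge')
print(df3)
```

Unlike a cell-by-cell comparison, this does not need the two frames to have the same length or row order, which may suit a 30-day table compared with a single day.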
