I have two identically formatted DataFrames (DF1 and DF2). DF1 is a search from the last 30 days, and DF2 is today only. I want to compare the two and produce a new DataFrame (DF3) containing only the data from DF2 that isn't in DF1. Everything I've tried so far either merges or concatenates the tables, and I'm left with a table of all unique values. Any thoughts?
CodePudding user response:
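You do not provide details of your data or say what should happen to the duplicated values. In principle, though, a boolean mask could be used as below, assuming both DataFrames have identical index and column labels; this replaces every value in df2 that equals the corresponding value in df1 with NaN.
df3 = df2[df2 != df1]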
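To make the behaviour of the mask concrete, here is a minimal sketch on made-up sample data (the column names A and B and all values are assumptions, not the asker's real data). It only works when both frames share the same index and column labels, and it compares position by position rather than looking for a row anywhere in df1.

```python
import pandas as pd

# Hypothetical sample data; both frames must have identical labels
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df2 = pd.DataFrame({'A': [1, 8, 9], 'B': [4, 10, 12]})

# Element-wise comparison: True where df2 differs from df1 at the same position
mask = df2 != df1

# Cells equal to df1 become NaN; differing cells keep their df2 value
df3 = df2[mask]

# Alternative: keep only the rows that have at least one differing cell
changed_rows = df2[mask.any(axis=1)]
```

If the rows of the two frames are not in the same order, this positional comparison will report differences that are not real, so it is best suited to frames that are already aligned.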
CodePudding user response:
There is not much information provided, but from what I can tell, something like this may be what you are after:
import pandas as pd
data1 = {'A':[1,2,3], 'B':[4,5,6] }
df1 = pd.DataFrame(data1)
df1
Output:
   A  B
0  1  4
1  2  5
2  3  6
Create the 2nd dataframe here:
data2 = {'A':[1,8,9], 'B':[4,10,12] }
df2 = pd.DataFrame(data2)
df2
Output:
   A   B
0  1   4
1  8  10
2  9  12
Create the 3rd DataFrame by comparing df2 against df1. With a DataFrame argument, isin keeps a value only where it matches df1 at the same index and column label; every other value is shown as NaN:
df3 = df2[df2.isin(df1)]
df3
Output:
     A    B
0  1.0  4.0
1  NaN  NaN
2  NaN  NaN
Perhaps with more detail on the DataFrames it would be easier to understand what the request is.
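If the goal is to keep whole rows of df2 that do not appear anywhere in df1 (regardless of row position), one possible approach is an anti-join using merge with indicator=True. This is only a sketch on the same sample data as above, and it assumes a row counts as "already in df1" when all shared columns match.

```python
import pandas as pd

# Sample data from above (assumed to stand in for the real DF1 and DF2)
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df2 = pd.DataFrame({'A': [1, 8, 9], 'B': [4, 10, 12]})

# Left merge on all shared columns; indicator=True adds a '_merge' column
# marking each row as 'both' or 'left_only'.
# drop_duplicates() stops repeated df1 rows from multiplying df2 rows.
merged = df2.merge(df1.drop_duplicates(), how='left', indicator=True)

# Keep only the rows found in df2 but not in df1, then tidy up
df3 = (
    merged[merged['_merge'] == 'left_only']
    .drop(columns='_merge')
    .reset_index(drop=True)
)
```

Here df3 contains the rows (8, 10) and (9, 12). To compare on a subset of columns only, the on= parameter of merge can be passed a list of those column names.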