Merging DataFrames returning missing values in Pandas

Time:03-10

I have these two dataframes that I want to merge:

import pandas as pd

df1 = pd.DataFrame({'platform': ['android', 'android', 'android', 'android', 'ios', 'ios', 'ios', 'ios'],
                    'day': [3, 7, 14, 30, 3, 7, 14, 30],
                    'value_m': [1.2, 1.3, 1.7, 1.8, 1.6, 2.3, 3.7, 1.8]})

df2 = pd.DataFrame({'platform': ['android', 'ios', 'ios', 'android', 'android', 'android', 'ios', 'ios'],
                    'day': [3, 7, 14, 30, 3, 7, 14, 30],
                    'value_x': [4, 6, 8, 9, 4, 6, 7, 8]})

I want to merge on the columns platform and day to create a new dataframe that adds the column 'value_x' to my df1. I have tried it with this code:

df_pred = df1.merge(df2, left_on=["platform","day"], right_on=["platform","day"], how="left")
df_pred

The result is full of NaNs. I don't understand why, since I am using platform and day to pull the data into the new dataframe. Any idea why this is happening?

Thanks!

CodePudding user response:

The problem in your real data is that day is stored as strings in one DataFrame and as numbers in the other, so the merge keys never match.

In your real data, try converting the values to the same type:

df1['day'] = df1['day'].astype(int)
df2['day'] = df2['day'].astype(int)
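
To check whether this is really the cause, you can compare the dtypes of the key columns before merging. The following is a minimal sketch, not your real data: the two small frames are made up, and it assumes that day was read as text in df2 (for example from a CSV). Note that some pandas versions raise an error for a merge on mismatched key dtypes instead of returning NaNs, so checking the dtypes first is the safer route.

```python
import pandas as pd

# Hypothetical data: 'day' is numeric in df1 but text in df2.
df1 = pd.DataFrame({"platform": ["android", "ios"],
                    "day": [3, 7],
                    "value_m": [1.2, 2.3]})
df2 = pd.DataFrame({"platform": ["android", "ios"],
                    "day": ["3", "7"],
                    "value_x": [4, 6]})

keys = ["platform", "day"]

# List the key columns whose dtypes differ between the two frames.
mismatched = [k for k in keys if df1[k].dtype != df2[k].dtype]
print("Key columns with different dtypes:", mismatched)

# Bring the mismatched key to a common type, then merge.
df2["day"] = df2["day"].astype(int)
df_pred = df1.merge(df2, on=keys, how="left")
print(df_pred)
```

If astype(int) fails on your real data, the column probably contains values that are not clean integers (blanks, stray text), which is worth inspecting before the merge.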

Your sample data merges correctly:

df_pred = df1.merge(df2, on=["platform","day"], how="left")
print(df_pred)
  platform  day  value_m  value_x
0  android    3      1.2      4.0
1  android    3      1.2      4.0
2  android    7      1.3      6.0
3  android   14      1.7      NaN
4  android   30      1.8      9.0
5      ios    3      1.6      NaN
6      ios    7      2.3      6.0
7      ios   14      3.7      8.0
8      ios   14      3.7      7.0
9      ios   30      1.8      8.0
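
The output above still has two NaNs and two extra rows. With the sample data this is expected: two key pairs from df1 have no partner in df2, and two key pairs appear twice in df2. As a rough sketch of how to see this yourself, the indicator=True option of merge adds a _merge column, and duplicated flags repeated keys. The variable names unmatched and dup_keys are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({"platform": ["android"] * 4 + ["ios"] * 4,
                    "day": [3, 7, 14, 30, 3, 7, 14, 30],
                    "value_m": [1.2, 1.3, 1.7, 1.8, 1.6, 2.3, 3.7, 1.8]})
df2 = pd.DataFrame({"platform": ["android", "ios", "ios", "android",
                                 "android", "android", "ios", "ios"],
                    "day": [3, 7, 14, 30, 3, 7, 14, 30],
                    "value_x": [4, 6, 8, 9, 4, 6, 7, 8]})

keys = ["platform", "day"]

# indicator=True adds a '_merge' column: 'both' or 'left_only' for a left merge.
merged = df1.merge(df2, on=keys, how="left", indicator=True)

# Rows of df1 whose key pair was not found in df2 (these get NaN in value_x).
unmatched = merged.loc[merged["_merge"] == "left_only", keys]
print(unmatched)

# Rows of df2 whose key pair occurs more than once (these multiply rows of df1).
dup_keys = df2[df2.duplicated(keys, keep=False)]
print(dup_keys)
```

If the real data should have at most one row per key in df2, passing validate="many_to_one" to merge makes pandas raise an error when that assumption does not hold.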