Home > Net >  Concat produces all NaN for a column of second dataframe
Concat produces all NaN for a column of second dataframe

Time:09-07

This is how I replaced the NaNs in the two dataframes,

df_max.fillna(0, inplace=True)
df_min.fillna(0, inplace=True)

This is how I concat and create a calculated column

df_max.reset_index(drop=True, inplace=True)
df_min.reset_index(drop=True, inplace=True)
df_combine = pd.concat([df_max, df_min], ignore_index = True)


df_combine['Range'] = df_combine['Maximum temperature (Degree C)'] - df_combine['Minimum temperature (Degree C)']
df_combine.head()

But I stil get NaN value for all rows in some columns. enter image description here

CodePudding user response:

I think you want:

df_combine = pd.concat([df_max, df_min], ignore_index = True, axis=1)

CodePudding user response:

Add axis=1 concatenate the dataframes by column and remove ignore_index=True to keep the column name.

df_combine = pd.concat([df_max, df_min], axis=1)

As there are repeated columns, pd.merge might work better, it works like sql join.

df_combine = pd.merge(df_max, df_min, on=['Year', 'Month', 'Product Code'], how='outer')

By default, axis=0 which means the dataframes are concatenated by row.

Example

df1 = pd.DataFrame([[1,2,3],[11,12,13]], columns=['a','b','c'])
df1 

    a   b   c
0   1   2   3
1  11  12  13 

df2 = pd.DataFrame([[1,2,6],[14,15,16]], columns=['a','b','f'])
df2
    a   b   f
0   1   2   6
1  14  15  16 
pd.concat([df1, df2], ignore_index = True)
    a   b     c     f
0   1   2   3.0   NaN
1  11  12  13.0   NaN
2   1   2   NaN   6.0
3  14  15   NaN  16.0

pd.concat([df1, df2], axis=1, ignore_index = True)
    0   1   2   3   4   5
0   1   2   3   1   2   6
1  11  12  13  14  15  16

pd.concat([df1, df2], axis=1)
    a   b   c   a   b   f
0   1   2   3   1   2   6
1  11  12  13  14  15  16

pd.merge(df1, df2, on=['a','b'], how='outer')
    a   b     c     f
0   1   2   3.0   6.0
1  11  12  13.0   NaN
2  14  15   NaN  16.0

  • Related