Merge with multiple columns and refill NAN values in Python

Time:11-11

I have df1 that looks like this:

STATE      YEAR       EVENT_TYPE       DAMAGE
ALABAMA    1962       Tornado          27
ALABAMA    1962       Flood            7
ALABAMA    1963       Thunderstorm     12
...

and df2 that looks like this:

STATE      YEAR       TORNADO      THUNDERSTORM      FLOOD
ALABAMA    1962       NaN          NaN               NaN
ALABAMA    1963       NaN          NaN               NaN
...

And I want to merge these two dataframes together, so the final output looks like this:

STATE      YEAR       TORNADO      THUNDERSTORM      FLOOD
ALABAMA    1962       27           NaN               7
...

I'm having a hard time figuring out how to do this.

CodePudding user response:

Concatenate the two dataframes:

pd.concat([df1, df2], axis=0)

Or concatenate them and replace NaN with 0, or whatever value you prefer:

pd.concat([df1, df2], axis=0).fillna(0)

CodePudding user response:

Merge the key columns of df2 with the pivoted df1:

cols = ['STATE', 'YEAR']

out = df2[cols].merge(df1.pivot(index=cols, columns='EVENT_TYPE', values='DAMAGE'),
                      left_on=cols, right_index=True)
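
To make the pivot-and-merge idea easier to try, here is a minimal sketch that rebuilds the sample rows from the question. The upper-casing of the pivoted column names and the how="left" argument are my own assumptions (so that the result lines up with df2's TORNADO / THUNDERSTORM / FLOOD headers and keeps every row of df2); they are not part of the snippet above.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question
df1 = pd.DataFrame({
    "STATE": ["ALABAMA", "ALABAMA", "ALABAMA"],
    "YEAR": [1962, 1962, 1963],
    "EVENT_TYPE": ["Tornado", "Flood", "Thunderstorm"],
    "DAMAGE": [27, 7, 12],
})
df2 = pd.DataFrame({
    "STATE": ["ALABAMA", "ALABAMA"],
    "YEAR": [1962, 1963],
    "TORNADO": [np.nan, np.nan],
    "THUNDERSTORM": [np.nan, np.nan],
    "FLOOD": [np.nan, np.nan],
})

cols = ["STATE", "YEAR"]

# One row per (STATE, YEAR) pair, one column per event type
wide = df1.pivot(index=cols, columns="EVENT_TYPE", values="DAMAGE")

# Assumption: df2's column names are the upper-cased event types
wide.columns = wide.columns.str.upper()

# Left merge keeps every (STATE, YEAR) row of df2, matched or not
out = df2[cols].merge(wide, left_on=cols, right_index=True, how="left")

# Put the columns back in df2's order
out = out[df2.columns]
print(out)
```

With these rows, the 1962 line ends up with TORNADO 27, FLOOD 7 and THUNDERSTORM NaN, which is the output asked for in the question. Note that pivot raises an error if the same STATE / YEAR / EVENT_TYPE combination appears more than once in df1; in that case pivot_table with an aggregation such as aggfunc="sum" is probably what you want.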

CodePudding user response:

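You can use df1.update(df2), which overwrites df1 in place with the non-NaN values from df2 (aligned on index and column labels), so a NaN in df2 never replaces existing data: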

import pandas as pd
import numpy as np

NaN = np.nan

df1 = pd.DataFrame([[1,2  ],[3,NaN],[NaN,NaN]], columns = ('X','Y'))
df2 = pd.DataFrame([[1,NaN],[3,4  ],[NaN,6  ]], columns = ('X','Y'))

df1.update(df2)

print(df1)
# output
#    X    Y
# 0  1.0  2.0
# 1  3.0  4.0
# 2  NaN  6.0
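
The X / Y example is generic, so here is a rough sketch of how update could be applied to the frames from the question. It assumes df1 is first pivoted to wide form and that its event names only need upper-casing to match df2's columns; the names keys, wide and filled are made up for this illustration.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question
df1 = pd.DataFrame({
    "STATE": ["ALABAMA", "ALABAMA", "ALABAMA"],
    "YEAR": [1962, 1962, 1963],
    "EVENT_TYPE": ["Tornado", "Flood", "Thunderstorm"],
    "DAMAGE": [27, 7, 12],
})
df2 = pd.DataFrame({
    "STATE": ["ALABAMA", "ALABAMA"],
    "YEAR": [1962, 1963],
    "TORNADO": [np.nan, np.nan],
    "THUNDERSTORM": [np.nan, np.nan],
    "FLOOD": [np.nan, np.nan],
})

keys = ["STATE", "YEAR"]

# Wide version of df1, indexed by (STATE, YEAR)
wide = df1.pivot(index=keys, columns="EVENT_TYPE", values="DAMAGE")
wide.columns = wide.columns.str.upper()  # assumption: names differ only by case

# update() aligns on index and column labels, so give df2 the same index
filled = df2.set_index(keys)
filled.update(wide)            # modifies filled in place; NaN in wide is skipped
filled = filled.reset_index()
print(filled)
```

Because update only touches labels that already exist in the frame being updated, any (STATE, YEAR) pair or event type that is missing from df2 is silently ignored rather than added.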