Merge two dataframes based on a condition [duplicate]

Time:09-17

I'm trying to merge two dataframes conditionally.

df1 has a duration column and df2 has a usageTime column. In df3, I want totalUsage to be df2's usageTime where it exists, falling back to df1's duration when df2 has no usageTime value.

Here is df1:

>> df1
            duration
device
1110100         53.8
1110101         64.7
1110102         52.6
1110103         14.4

And df2:

>> df2
           usageTime
deviceId
1110100         87.6
1110101         94.3
1110102         None
1110103         None

The next dataframe I want to create is:

>> df3
          totalUsage
device
1110100         87.6
1110101         94.3
1110102         52.6
1110103         14.4

Things I tried:

  1. pandas.DataFrame.combine_first()

    df3 = df2.combine_first(df1.rename(columns={'duration': 'totalUsage'}))
    

    Returns:

            totalUsage  usageTime
    device
    1110100       53.8       87.6
    1110101       64.7       94.3
    1110102       52.6       None
    1110103       14.4       None
    
  2. pandas.DataFrame.fillna()

    df3 = df2.fillna(df1)
    df3.columns = ['totalUsage']
    

    Returns:

              totalUsage
    device
    1110100         87.6
    1110101         94.3
    1110102          NaN
    1110103          NaN
    

I am open to all ideas.

Answer:

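Specify the column names when using fillna, then convert the result with to_frame. DataFrame.fillna aligns on column labels as well as on the index, so df2.fillna(df1) finds no duration values to put into usageTime. Filling one Series from the other avoids the column mismatch: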

df3 = df2.usageTime.fillna(df1.duration).to_frame(name='totalUsage')

#          totalUsage
# deviceId           
# 1110100        87.6
# 1110101        94.3
# 1110102        52.6
# 1110103        14.4
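
For anyone who wants to run this end to end, here is a minimal sketch. It assumes the None entries in df2 are real missing values (NaN) and not the string 'None'. The frames are rebuilt by hand from the printouts in the question, and the names ids and alt, as well as the rename_axis step, are additions for illustration only.

```python
import pandas as pd

# Rebuild the frames from the question (None becomes NaN in a float column).
ids = [1110100, 1110101, 1110102, 1110103]
df1 = pd.DataFrame({"duration": [53.8, 64.7, 52.6, 14.4]},
                   index=pd.Index(ids, name="device"))
df2 = pd.DataFrame({"usageTime": [87.6, 94.3, None, None]},
                   index=pd.Index(ids, name="deviceId"))

# Series.fillna aligns on index labels only, so the column names no longer matter.
df3 = (df2["usageTime"]
       .fillna(df1["duration"])
       .to_frame(name="totalUsage")
       .rename_axis("device"))  # optional: use the index name shown in the question

# DataFrame.fillna also works once the column labels agree (illustrative alternative).
alt = df2.fillna(df1.rename(columns={"duration": "usageTime"}))
alt.columns = ["totalUsage"]

print(df3)
```

If the real data holds the string 'None' instead of a missing value, fillna will not touch it. In that case, converting the column first (for example with pd.to_numeric(df2["usageTime"], errors="coerce")) would likely be needed.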