Home > OS >  Replace and add rows to an old dataframe based on new dataframe and index using pandas
Replace and add rows to an old dataframe based on new dataframe and index using pandas

Time:02-16

I need to merge two dataframes replacing some rows based on string indexes and adding the new rows (new indexes). I'm curious about some kind of standard or efficient way to do it.

# Example: "A" and "B" are columns and "idx" is the index
# Input
# df1

idx  A  B  
a   1  1
b   2  2
c   3  3


# df2
idx  A   B      
a   -1  -1
b   -2  -2
d  -10 -10
f  -20 -20

# process
merged_df = replace_and_add(old_df=df_1, new_df=df_2)

# Output
# new_df      
idx  A   B      
a   -1  -1
b   -2  -2
c    3   3
d  -10 -10
f  -20 -20

CodePudding user response:

You're looking for a concatenation of dataframes, without replication (union distinct). In pandas, you could:

df = pd.concat( [old_df, new_df] ).drop_duplicates()

CodePudding user response:

You can reindex your first dataframe with the second one then update values. It's an efficient way to avoid to create a temporary dataframe:

df1 = df1.reindex(df1.index.union(df2.index))
df1.update(df2)
print(df1)

# Output:
        A     B
idx            
a    -1.0  -1.0
b    -2.0  -2.0
c     3.0   3.0
d   -10.0 -10.0
f   -20.0 -20.0

The downside is the dtype which is now float64 instead of int64.

  • Related