pandas fill in missing index from other dataframe

I want to know whether there is a way to merge / re-join the missing rows simply by index.

My original approach was to split df1 cleanly into df1_cleaned and df1_untouched and then join them back together. But I suspect there is an easier way to re-join the two DataFrames, since I didn't change the index. I tried an outer merge with left_index and right_index, but was left with duplicate, suffixed columns to clean up.

df1

index	colA	colB	colC
0	California	123	abc
1	New York	456	def
2	Texas	789	ghi

df2 (subset of df1 and cleaned)

index	colA	colB	colC
0	California	321	abc
2	Texas	789	ihg

end-result

index	colA	colB	colC
0	California	321	abc
1	New York	456	def
2	Texas	789	ihg
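
For reference, the frames above can be rebuilt roughly as follows. This is only a sketch: the dtypes and the index name "index" are assumptions based on the printed tables. It also reproduces the suffixed columns that the outer merge leaves behind.

```python
import pandas as pd

# Assumed reconstruction of the example data shown above
df1 = pd.DataFrame(
    {
        "colA": ["California", "New York", "Texas"],
        "colB": [123, 456, 789],
        "colC": ["abc", "def", "ghi"],
    },
    index=pd.Index([0, 1, 2], name="index"),
)

# Cleaned subset of df1; the index labels are unchanged
df2 = pd.DataFrame(
    {
        "colA": ["California", "Texas"],
        "colB": [321, 789],
        "colC": ["abc", "ihg"],
    },
    index=pd.Index([0, 2], name="index"),
)

# Outer merge on the index: every shared column name is duplicated
# with the default "_x" / "_y" suffixes
merged = df1.merge(df2, how="outer", left_index=True, right_index=True)
print(merged.columns.tolist())
```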

CodePudding user response：

You can use combine_first or update:

df_out = df2.combine_first(df1)

Or use pd.DataFrame.update, which operates in place (it returns None) and overwrites the matching values in df1:

df1.update(df2)

Output:

             colA   colB colC
index                        
0      California  321.0  abc
1        New York  456.0  def
2           Texas  789.0  ihg
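
A self-contained sketch of both options, assuming the example frames from the question (the construction code and the name df1_updated are illustrative, not from the original post). The update call is applied to a copy so that df1 itself stays intact.

```python
import pandas as pd

# Assumed reconstruction of the question's example data
df1 = pd.DataFrame(
    {
        "colA": ["California", "New York", "Texas"],
        "colB": [123, 456, 789],
        "colC": ["abc", "def", "ghi"],
    },
    index=pd.Index([0, 1, 2], name="index"),
)
df2 = pd.DataFrame(
    {
        "colA": ["California", "Texas"],
        "colB": [321, 789],
        "colC": ["abc", "ihg"],
    },
    index=pd.Index([0, 2], name="index"),
)

# Option 1: combine_first returns a new frame. Values from df2 take
# priority; rows and cells missing from df2 are filled from df1.
df_out = df2.combine_first(df1)

# Option 2: update modifies the caller in place and returns None,
# so work on a copy if the original df1 is still needed.
df1_updated = df1.copy()
df1_updated.update(df2)

print(df_out)
print(df1_updated)
```

Both calls ignore NaN values in df2, so a cell that was deliberately set to NaN during cleaning would not carry over; the value from df1 would remain instead. Whether colB comes back as integer or float may depend on the pandas version.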

CodePudding user response：

You can reindex df2 to df1's index, take the difference of the two indexes, and then fill the missing rows of df_result from df1:

df_result = df2.reindex(df1.index)
missing_index = df1.index.difference(df2.index)
df_result.loc[missing_index] = df1.loc[missing_index]

print(df_result)

         colA   colB colC
0  California  321.0  abc
1    New York  456.0  def
2       Texas  789.0  ihg
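
The float values in colB appear because reindex inserts NaN for the missing row before it is filled. A runnable sketch of the same steps, assuming the example frames from the question (the construction code is illustrative), with an optional final cast back to df1's dtypes:

```python
import pandas as pd

# Assumed reconstruction of the question's example data
df1 = pd.DataFrame(
    {
        "colA": ["California", "New York", "Texas"],
        "colB": [123, 456, 789],
        "colC": ["abc", "def", "ghi"],
    }
)
df2 = pd.DataFrame(
    {
        "colA": ["California", "Texas"],
        "colB": [321, 789],
        "colC": ["abc", "ihg"],
    },
    index=[0, 2],
)

# Expand df2 to df1's index; rows absent from df2 become all-NaN,
# which upcasts the integer column colB to float
df_result = df2.reindex(df1.index)

# Labels present in df1 but not in df2
missing_index = df1.index.difference(df2.index)

# Fill those rows from df1
df_result.loc[missing_index] = df1.loc[missing_index]

# Optional: restore the original column dtypes (colB back to integer)
df_result = df_result.astype(df1.dtypes.to_dict())

print(df_result)
```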