pandas fill in missing index from other dataframe

Time:07-12

I want to know whether there is a way to merge or re-join the missing rows simply by index.

My original approach was to cleanly separate df1 into df1_cleaned and df1_untouched, and then join them back together. But I thought there is probably an easier way to re-join the two DataFrames, since I didn't change the index. I tried an outer merge with left_index and right_index, but was left with duplicate columns with suffixes to clean up.

df1

index colA colB colC
0 California 123 abc
1 New York 456 def
2 Texas 789 ghi

df2 (a cleaned subset of df1)

index colA colB colC
0 California 321 abc
2 Texas 789 ihg

Desired result

index colA colB colC
0 California 321 abc
1 New York 456 def
2 Texas 789 ihg

CodePudding user response:

You can use combine_first or update:

df_out = df2.combine_first(df1)

Or use pd.DataFrame.update, which operates in place and overwrites the matching values in df1:

df1.update(df2)

Output:

             colA   colB colC
index                        
0      California  321.0  abc
1        New York  456.0  def
2           Texas  789.0  ihg
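
For anyone who wants to try both options, here is a minimal, self-contained sketch. The frames are rebuilt by hand from the tables in the question, and the name df1_updated is only illustrative; the copy is there because update modifies its caller in place. The exact dtype of colB in the result can differ between pandas versions, so the sketch makes no claim about it.

```python
import pandas as pd

# Frames reconstructed from the question's tables (assumed layout).
df1 = pd.DataFrame(
    {
        "colA": ["California", "New York", "Texas"],
        "colB": [123, 456, 789],
        "colC": ["abc", "def", "ghi"],
    },
    index=pd.Index([0, 1, 2], name="index"),
)
df2 = pd.DataFrame(
    {
        "colA": ["California", "Texas"],
        "colB": [321, 789],
        "colC": ["abc", "ihg"],
    },
    index=pd.Index([0, 2], name="index"),
)

# Option 1: values from df2 take priority; rows missing from df2
# are filled from df1. Returns a new DataFrame.
df_out = df2.combine_first(df1)

# Option 2: update works in place, so operate on a copy
# (df1_updated is a hypothetical name) to keep df1 intact.
df1_updated = df1.copy()
df1_updated.update(df2)

print(df_out)
print(df1_updated)
```

combine_first is probably the better fit when the original df1 should stay untouched; update is handy when overwriting df1 is exactly what you want.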

CodePudding user response:

You can take the difference of the two indexes and, after reindexing df2 to df1's index, fill the missing rows of df_result from df1:

df_result = df2.reindex(df1.index)
missing_index = df1.index.difference(df2.index)
df_result.loc[missing_index] = df1.loc[missing_index]

print(df_result)

         colA   colB colC
0  California  321.0  abc
1    New York  456.0  def
2       Texas  789.0  ihg
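
The reindex step introduces NaN for the missing row, which is why colB is displayed as a float above. A runnable sketch of the same steps follows, with the frames rebuilt from the question's tables. The final astype call is an addition of mine, not part of the answer, and assumes you want the original dtypes of df1 back.

```python
import pandas as pd

# Frames reconstructed from the question's tables (assumed layout).
df1 = pd.DataFrame(
    {
        "colA": ["California", "New York", "Texas"],
        "colB": [123, 456, 789],
        "colC": ["abc", "def", "ghi"],
    },
    index=[0, 1, 2],
)
df2 = pd.DataFrame(
    {
        "colA": ["California", "Texas"],
        "colB": [321, 789],
        "colC": ["abc", "ihg"],
    },
    index=[0, 2],
)

# Align df2 to df1's index; rows absent from df2 become NaN.
df_result = df2.reindex(df1.index)

# Index labels present in df1 but not in df2.
missing_index = df1.index.difference(df2.index)

# Fill those rows from df1.
df_result.loc[missing_index] = df1.loc[missing_index]

# Optional extra step (my assumption): cast back to df1's dtypes,
# since the temporary NaN may have changed the dtype of colB.
df_result = df_result.astype(df1.dtypes.to_dict())

print(df_result)
```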