Home > Mobile >  Mergin dataframes on different index levels
Mergin dataframes on different index levels

Time:11-26

I have 2 dfs, df_1:

                            col_1       col_2
index_1     index_2
abc         a_1             10          20
            a_2             15          15
            c_2             48          74
def         a_1             10          15
            b_2             1           1

And df_2:

id          value
abc         value_abc
def         value_def
a_1         value_a_1
a_2         value_a_2
b_2         value_b_2
c_2         value_c_2

I am trying to merge the values to the indexes like so:

                            col_1       col_2       value_index_1           value_index_2
index_1     index_2
abc         a_1             10          20          value_abc               value_a_1
            a_2             15          15                                  value_a_2
            c_2             48          74                                  value_c_2
def         a_1             10          15          value_def               value_a_1
            b_2             1           1                                   value_c_2

So both index_1 and index_2 is a id field in df_2 and I am trying to add the value columns as 2 different columns in the original df_1.

I tried:

df_1.merge(df_2, left_on = 'index_1', right_on ='id', how = 'left', left_index=True)

But this removes the visibility of index_2 for some reason. Where is my mistake?

CodePudding user response:

Use rename by Series from df_2, convert to DataFrame, reaasign default index and append to original:

df = df_1.join(df_1.rename(df_2['value']).index.to_frame().set_index(df_1.index))
print (df)
                 col_1  col_2    index_1    index_2
index_1 index_2                                    
abc     a_1         10     20  value_abc  value_a_1
        a_2         15     15  value_abc  value_a_2
        c_2         48     74  value_abc  value_c_2
def     a_1         10     15  value_def  value_a_1
        b_2          1      1  value_def  value_b_2

Reason why values are repeated is possible find in docs:

We’ve “sparsified” the higher levels of the indexes to make the console output a bit easier on the eyes. Note that how the index is displayed can be controlled using the multi_sparse option in pandas.set_options():

with pd.option_context("display.multi_sparse", False):
    df
  • Related