Home > Blockchain >  Multilevel concatenation grouped by a common index (pandas)
Multilevel concatenation grouped by a common index (pandas)

Time:10-24

I'm trying to combine multiple dataframes into one. I need the combined dataframe to have multilevel headers, as in the example below. I also need the dataframe to be grouped on multiple index columns.

I've seen answers for some part of the above but not combined, and I can't figure it out.

I'd really appreciate some help!

edit: To clarify, I need the results to be grouped by index, when the indexes are not all listed in the same order.

'A'

                                   'a' 'b'  
'index4.1'  'index4.2'  'index4.3'  5   7
'index5.1'  'index5.2'  'index5.3'  3   8       
'index1.1'  'index1.2'  'index1.3'  9   8
'index2.1'  'index2.2'  'index2.3'  6   3
'index3.1'  'index3.2'  'index3.3'  8   4
'index4.1'  'index4.2'  'index4.3'  5   7

'B'

                                   'a' 'b' 
'index3.1'  'index3.2'  'index3.3'  1   5       
'index1.1'  'index1.2'  'index1.3'  6   2
'index2.1'  'index2.2'  'index2.3'  4   7
'index4.1'  'index4.2'  'index4.3'  4   7
'index5.1'  'index5.2'  'index5.3'  3   6 

'C'

                                   'b' 'c'        
'index1.1'  'index1.2'  'index1.3'  3   0
'index3.1'  'index3.2'  'index3.3'  1   5
'index4.1'  'index4.2'  'index4.3'  6   7
'index5.1'  'index5.2'  'index5.3'  3   1 
'index2.1'  'index2.2'  'index2.3'  2   1

Combined dataframe:

                                   'A' 'A' 'B' 'B' 'C' 'C'
                                   'a' 'b' 'a' 'b' 'b' 'c'      
'index1.1'  'index1.2'  'index1.3'  9   8   6   2   3   0
'index2.1'  'index2.2'  'index2.3'  6   3   4   7   2   1
'index3.1'  'index3.2'  'index3.3'  8   4   1   5   1   5
'index4.1'  'index4.2'  'index4.3'  5   7   4   7   6   7
'index5.1'  'index5.2'  'index5.3'  3   8   3   6   3   1

CodePudding user response:

Use concat with keys parameter:

df = pd.concat([A, B, C], axis=1, keys=('A','B','C'))
print (df)
                                   A       B       C    
                                 'a' 'b' 'a' 'b' 'b' 'c'
'index1.1' 'index1.2' 'index1.3'   9   8   6   2   3   0
'index2.1' 'index2.2' 'index2.3'   6   3   4   7   2   1
'index3.1' 'index3.2' 'index3.3'   8   4   1   5   1   5
'index4.1' 'index4.2' 'index4.3'   5   7   4   7   6   7
'index5.1' 'index5.2' 'index5.3'   3   8   3   6   3   1

If need display first levels:

with pd.option_context("display.multi_sparse", False):
    print (df)
                                   A   A   B   B   C   C
                                 'a' 'b' 'a' 'b' 'b' 'c'
'index1.1' 'index1.2' 'index1.3'   9   8   6   2   3   0
'index2.1' 'index2.2' 'index2.3'   6   3   4   7   2   1
'index3.1' 'index3.2' 'index3.3'   8   4   1   5   1   5
'index4.1' 'index4.2' 'index4.3'   5   7   4   7   6   7
'index5.1' 'index5.2' 'index5.3'   3   8   3   6   3   1
  • Related