DataFrame Create Multi-Index for column with list values

Time:08-18

I have a DataFrame that looks like this:

+-----+------+--------+
| idx | Col1 |  Col2  |
+-----+------+--------+
|   0 | A    | [1, 2] |
|   1 | B    | [3, 4] |
|   2 | C    | [5, 6] |
+-----+------+--------+

What I would like to accomplish is a new column layout like this:

+-----+------+-------------+
| idx | Col1 |    Col2     |
+-----+------+------+------+
|     |      | sub1 | sub2 |
+-----+------+------+------+
|   0 | A    |   1  |   2  |
|   1 | B    |   3  |   4  |
|   2 | C    |   5  |   6  |
+-----+------+------+------+

The end goal is to be able to do a df.query() like the following:

df.query("Col2.sub1 == 3 & Col2.sub2 == 4")

to get the row at index 1.

Is this even possible with df.query()?

Edit: This is the code that produces the first table.

import pandas as pd

records = [{'Col1': 'A', 'Col2': [1, 2]}, {'Col1': 'B', 'Col2': [3, 4]}, {'Col1': 'C', 'Col2': [5, 6]}]
df = pd.DataFrame.from_records(records)

CodePudding user response:

First, split the lists into separate columns and drop the original list column:

df[['sub1', 'sub2']] = pd.DataFrame(df['Col2'].tolist(), index=df.index)
df = df.drop(columns='Col2')

  Col1  sub1  sub2
0    A     1     2
1    B     3     4
2    C     5     6
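For reference, the splitting step can be sketched as a self-contained snippet that leaves the original frame untouched. The names `sub` and `flat` are illustrative only, and the sketch assumes every list in Col2 has exactly two elements:

```python
import pandas as pd

records = [
    {"Col1": "A", "Col2": [1, 2]},
    {"Col1": "B", "Col2": [3, 4]},
    {"Col1": "C", "Col2": [5, 6]},
]
df = pd.DataFrame.from_records(records)

# Expand each list into its own column; index=df.index keeps the rows aligned.
# Assumption: every list has exactly two elements.
sub = pd.DataFrame(df["Col2"].tolist(), index=df.index, columns=["sub1", "sub2"])

# Build a new flat frame instead of modifying df in place.
flat = df.drop(columns="Col2").join(sub)
print(flat)
```

Keeping `df` unchanged makes it easier to rerun the step while experimenting.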

Then create a MultiIndex for the columns (here Col1 is placed under a placeholder top level '0'):

df.columns = pd.MultiIndex.from_arrays([['0', 'Col2', 'Col2'],
                                        df.columns.tolist()])

  0     Col2     
  Col1  sub1  sub2
0    A     1     2
1    B     3     4
2    C     5     6
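If a layout closer to the one in the question is preferred, one possible variation (not part of the answer above) is to keep Col1 at the top level and give it an empty string as its sub-level. This is only a sketch; the frame `flat` and the name `nested` are made up for illustration:

```python
import pandas as pd

# Stand-in for the frame after the list column has been split.
flat = pd.DataFrame({"Col1": ["A", "B", "C"], "sub1": [1, 3, 5], "sub2": [2, 4, 6]})

# Hypothetical alternative layout: Col1 keeps its own name on the top level,
# paired with an empty second level; sub1 and sub2 are grouped under Col2.
nested = flat.copy()
nested.columns = pd.MultiIndex.from_tuples(
    [("Col1", ""), ("Col2", "sub1"), ("Col2", "sub2")]
)

# Selecting the top-level key returns a plain DataFrame with the sub-columns.
print(nested["Col2"])
```

With this layout the columns are addressed as ('Col1', ''), ('Col2', 'sub1') and ('Col2', 'sub2').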

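Now you can query the MultiIndex. Dotted names such as Col2.sub1 are not used here; instead, refer to each column by its full tuple, quoted in backticks: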

df.query("`('Col2', 'sub1')` == 3 & `('Col2', 'sub2')` == 4")

  0     Col2     
  Col1  sub1  sub2
1    B     3     4
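If the backtick form of query() turns out to be awkward or behaves differently on your pandas version, the same filter can be written as an ordinary boolean mask. The sketch below rebuilds the frame from scratch so it runs on its own; the names `mask` and `result` are illustrative:

```python
import pandas as pd

# Rebuild the MultiIndex frame from the answer so the example is self-contained.
df = pd.DataFrame(
    [["A", 1, 2], ["B", 3, 4], ["C", 5, 6]],
    columns=pd.MultiIndex.from_arrays(
        [["0", "Col2", "Col2"], ["Col1", "sub1", "sub2"]]
    ),
)

# Address each column by its full tuple and combine the conditions with &.
# The parentheses are required because & binds tighter than == in plain Python.
mask = (df[("Col2", "sub1")] == 3) & (df[("Col2", "sub2")] == 4)
result = df[mask]
print(result)
```

This selects the same row (index 1) as the query() call above, without relying on how query() parses tuple column names.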