Home > Blockchain >  DataFrame with shared row indexes and shared column headers
DataFrame with shared row indexes and shared column headers

Time:03-19

I'd like to make a dataframe (?) which visually looks like the following:

enter image description here

Is it possible using pandas? If so, can anyone suggest me related to that. If not, is that even possible to do in python? If possible, can anyone suggest me anything related to that?

CodePudding user response:

That exact layout is not possible in pandas because we need some way to programmatically access the values. Looking at your current table, how would we index the cell containing 2? It doesn't have a distinct index.

We can technically use MultiIndexes to wrap multiple rows/cols, but every individual row/col still needs a separate label, e.g.:

df = pd.DataFrame([['a', 1], ['b', 2], ['a', 3], ['b', 4]])
df.index = pd.MultiIndex.from_product([['Col1'], ['X', 'Y'], [0, 1]])
df.columns = pd.MultiIndex.from_product([['Col2'], [0, 1]])
#              | Col2  |
#              |-------|
#              | 0 | 1 |
# -------------|---|---|
# Col1 | X | 0 | a | 1 |
#      |   |---|---|---|
#      |   | 1 | b | 2 |
#      |---|---|---|---|
#      | Y | 0 | a | 3 |
#      |   |---|---|---|
#      |   | 1 | b | 4 |

But this setup makes it really annoying to access values. For example, here we take a row cross-section at column 1 to access the cell containing 2:

df.xs(('Col1', 'X', 1))[1]
# 2

This is why we normally just use standard wide or long tables with simple row/col indexes, e.g.:

df = pd.DataFrame({'Col1': list('XXYY'), 'Col2_0': list('abab'), 'Col2_1': [1, 2, 3, 4]})
#    Col1  Col2_0  Col2_1
# 0     X       a       1
# 1     X       b       2
# 2     Y       a       3
# 3     Y       b       4

df.loc[1, 'Col2_1']
# 2
  • Related