Home > Net >  "Group" rows based on one column, then create new columns for the possible combinations of
"Group" rows based on one column, then create new columns for the possible combinations of

Time:10-13

I have a dataframe of the following form:

date I1 I2 F1 F2
1 1 A v1 v9
1 1 B v2 v10
1 10 A v3 v11
1 10 B v4 v12
2 1 A v5 v13
2 1 B v6 v14
2 10 A v7 v15
2 10 B v8 v16

I want to create new columns based on the possible combinations of values from 'I1' and 'I2', and have the corresponding values all in one row (for the same date). The resulting dataframe should look like this (up to order of columns, which doesn't matter and I just chose one for visualization purposes):

date F1-1-A F1-1-B F1-10-A F1-10-B F2-1-A F2-1-B F2-10-A F2-10-B
1 v1 v2 v3 v4 v9 v10 v11 v12
2 v5 v6 v7 v8 v13 v14 v15 v16

What is the most concise and general way to achieve this in pandas?

CodePudding user response:

You're looking for a pivot. Once you do this, you will need to join the multi-index column names to flatten it out.

df = df.pivot(index='date',columns=['I1','I2'], values=['F1','F2'])
df.columns = ['-'.join(map(str,x)) for x in df.columns]
df.reset_index(inplace=True)

Output

   date F1-1-A F1-1-B F1-10-A F1-10-B F2-1-A F2-1-B F2-10-A F2-10-B
0     1     v1     v2      v3      v4     v9    v10     v11     v12
1     2     v5     v6      v7      v8    v13    v14     v15     v16

CodePudding user response:

@Chris' solution works great; a bit of an abstraction is possible with pivot_wider from pyjanitor, to flatten the columns (or do some other manipulation if needed):

#pip install git https://github.com/pyjanitor-devs/pyjanitor.git
import pandas as pd
import janitor
df.pivot_wider(index='date', names_from=['I1', 'I2'], names_sep='-')

   date F1-1-A F1-1-B F1-10-A F1-10-B F2-1-A F2-1-B F2-10-A F2-10-B
0     1     v1     v2      v3      v4     v9    v10     v11     v12
1     2     v5     v6      v7      v8    v13    v14     v15     v16

Note that pivot works only if the combination of index and columns is unique; if it isn't, a better option would be pivot_table.

  • Related