I have a dataframe of the following form:
date | I1 | I2 | F1 | F2 |
---|---|---|---|---|
1 | 1 | A | v1 | v9 |
1 | 1 | B | v2 | v10 |
1 | 10 | A | v3 | v11 |
1 | 10 | B | v4 | v12 |
2 | 1 | A | v5 | v13 |
2 | 1 | B | v6 | v14 |
2 | 10 | A | v7 | v15 |
2 | 10 | B | v8 | v16 |
I want to create new columns based on the possible combinations of values from 'I1' and 'I2', and have the corresponding values all in one row (for the same date). The resulting dataframe should look like this (up to order of columns, which doesn't matter and I just chose one for visualization purposes):
date | F1-1-A | F1-1-B | F1-10-A | F1-10-B | F2-1-A | F2-1-B | F2-10-A | F2-10-B |
---|---|---|---|---|---|---|---|---|
1 | v1 | v2 | v3 | v4 | v9 | v10 | v11 | v12 |
2 | v5 | v6 | v7 | v8 | v13 | v14 | v15 | v16 |
What is the most concise and general way to achieve this in pandas?
CodePudding user response:
You're looking for a pivot. Once you do this, you will need to join the multi-index column names to flatten it out.
df = df.pivot(index='date',columns=['I1','I2'], values=['F1','F2'])
df.columns = ['-'.join(map(str,x)) for x in df.columns]
df.reset_index(inplace=True)
Output
date F1-1-A F1-1-B F1-10-A F1-10-B F2-1-A F2-1-B F2-10-A F2-10-B
0 1 v1 v2 v3 v4 v9 v10 v11 v12
1 2 v5 v6 v7 v8 v13 v14 v15 v16
CodePudding user response:
@Chris' solution works great; a bit of an abstraction is possible with pivot_wider
from pyjanitor
, to flatten the columns (or do some other manipulation if needed):
#pip install git https://github.com/pyjanitor-devs/pyjanitor.git
import pandas as pd
import janitor
df.pivot_wider(index='date', names_from=['I1', 'I2'], names_sep='-')
date F1-1-A F1-1-B F1-10-A F1-10-B F2-1-A F2-1-B F2-10-A F2-10-B
0 1 v1 v2 v3 v4 v9 v10 v11 v12
1 2 v5 v6 v7 v8 v13 v14 v15 v16
Note that pivot works only if the combination of index and columns is unique; if it isn't, a better option would be pivot_table
.