I have a dataset which looks like this
A | B | C |
---|---|---|
A1 | 1 | 1.3 |
A1 | 2 | 1.4 |
A1 | 3 | 1.3 |
A2 | 1 | 1.7 |
A2 | 2 | 1.6 |
A2 | 3 | 1.5 |
A1 | 1 | 1 |
A1 | 2 | 2.7 |
A1 | 3 | 2.4 |
A2 | 1 | 1.6 |
A2 | 2 | 1.7 |
A2 | 3 | 1.2 |
Desired output :
A | 1 | 2 | 3 |
---|---|---|---|
A1 | 1.3 | 1.4 | 1.3 |
A1 | 1 | 2.7 | 2.4 |
A2 | 1.7 | 1.6 | 1.5 |
A2 | 1.6 | 1.7 | 1.2 |
I have tried a pivot table but it's giving aggregation and without aggregation pivot it is showing a duplicate index error because of the duplicate index. What is the best way to handle this?
CodePudding user response:
You can iterate over using number of elements (assuming all combos have same number of elements) as follows:
dfx = df.pivot_table(index='A', columns='B', values='C', aggfunc=list)
pd.concat([dfx.applymap(lambda x:x[idx]) for idx in range(dfx.apply(lambda x: len(x)).max())])
A | 1 | 2 | 3 |
---|---|---|---|
A1 | 1.3 | 1.4 | 1.3 |
A2 | 1.7 | 1.6 | 1.5 |
A1 | 1 | 2.7 | 2.4 |
A2 | 1.6 | 1.7 | 1.2 |
Old answer:
Seeing your sample dataframe, a simple pivot should do:
df.pivot_table(index='A', columns='B', values='C', aggfunc='mean')
CodePudding user response:
Try:
df = df.pivot_table(index="A", columns="B", values="C", aggfunc=list)
df = df.explode(df.columns.to_list()).reset_index()
df.columns.name = None
print(df)
Prints:
A 1 2 3
0 A1 1.3 1.4 1.3
1 A1 1.0 2.7 2.4
2 A2 1.7 1.6 1.5
3 A2 1.6 1.7 1.2
CodePudding user response:
One option is to create an extra column that introduces uniqueness to your data, before pivoting:
a = df.groupby('A').ngroup().diff().fillna(0, downcast = 'infer')
a = a.mask(a < 0, 1).cumsum()
(df
.assign(temp = a)
.pivot(['temp', 'A'], 'B', 'C')
.droplevel(0)
.rename_axis(columns = None)
)
1 2 3
A
A1 1.3 1.4 1.3
A2 1.7 1.6 1.5
A1 1.0 2.7 2.4
A2 1.6 1.7 1.2