I wish to transform column headers to values with their counts in Python (counts are the value numbers under each column header per unique date and id)
Data
id date dd aa
hey Q1 23 2 1
hey Q2 23 2 1
ok Q3 23 2 1
ok Q4 23 2 3
Desired
id date type
hey Q1 23 dd
hey Q1 23 dd
hey Q1 23 aa
hey Q2 23 dd
hey Q2 23 dd
hey Q2 23 aa
ok Q3 23 dd
ok Q3 23 dd
ok Q3 23 aa
ok Q4 23 dd
ok Q4 23 dd
ok Q4 23 aa
ok Q4 23 aa
ok Q4 23 aa
Doing
df.pivot(index="id", columns="Date", values=["dd"])
Any suggestion is appreciated
CodePudding user response:
You can set_index
stack
to get a Series of the repeats, then use Series.repeat
which repeats everything in the Index (which is all the information you need after the stack). The .drop
removes the column which indicated how many repeats were necessary.
s = df.set_index(['id', 'date']).rename_axis(columns='type').stack()
df1 = s.repeat(s).reset_index().drop(0, axis=1)
id date type
0 hey Q1 23 dd
1 hey Q1 23 dd
2 hey Q1 23 aa
3 hey Q2 23 dd
4 hey Q2 23 dd
5 hey Q2 23 aa
6 ok Q3 23 dd
7 ok Q3 23 dd
8 ok Q3 23 aa
9 ok Q4 23 dd
10 ok Q4 23 dd
11 ok Q4 23 aa
12 ok Q4 23 aa
13 ok Q4 23 aa
CodePudding user response:
Use melt
:
out = df.melt(['id', 'date'], var_name='type')
out = out.reindex(out.index.repeat(out.pop('value'))).reset_index(drop=True)
Output:
>>> out
id date type
0 hey Q1 23 dd
1 hey Q1 23 dd
2 hey Q2 23 dd
3 hey Q2 23 dd
4 ok Q3 23 dd
5 ok Q3 23 dd
6 ok Q4 23 dd
7 ok Q4 23 dd
8 hey Q1 23 aa
9 hey Q2 23 aa
10 ok Q3 23 aa
11 ok Q4 23 aa
12 ok Q4 23 aa
13 ok Q4 23 aa