Transform column headers to values and respective counts in Python

Time:11-16

I want to turn column headers into values in Python, repeating each header according to its count. The count is the number stored under that column header for each unique id and date.

Data

id  date    dd  aa
hey Q1 23   2   1
hey Q2 23   2   1
ok  Q3 23   2   1
ok  Q4 23   2   3
            
        
        

Desired

id  date    type    
hey Q1 23   dd  
hey Q1 23   dd  
hey Q1 23   aa  
hey Q2 23   dd  
hey Q2 23   dd  
hey Q2 23   aa  
ok  Q3 23   dd  
ok  Q3 23   dd  
ok  Q3 23   aa  
ok  Q4 23   dd  
ok  Q4 23   dd  
ok  Q4 23   aa  
ok  Q4 23   aa  
ok  Q4 23   aa  

Doing

df.pivot(index="id", columns="Date", values=["dd"])

Any suggestions are appreciated.

CodePudding user response:

You can use set_index and stack to get a Series of repeat counts, indexed by id, date and type. Series.repeat then repeats each entry, index included, as many times as its own count, and the index holds all the information you need after the stack. reset_index turns the index levels back into columns, and .drop removes the leftover column (labeled 0) that held the counts.

s = df.set_index(['id', 'date']).rename_axis(columns='type').stack()
df1 = s.repeat(s).reset_index().drop(0, axis=1)

     id   date type
0   hey  Q1 23   dd
1   hey  Q1 23   dd
2   hey  Q1 23   aa
3   hey  Q2 23   dd
4   hey  Q2 23   dd
5   hey  Q2 23   aa
6    ok  Q3 23   dd
7    ok  Q3 23   dd
8    ok  Q3 23   aa
9    ok  Q4 23   dd
10   ok  Q4 23   dd
11   ok  Q4 23   aa
12   ok  Q4 23   aa
13   ok  Q4 23   aa
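
For a self-contained check, the sketch below rebuilds the sample data from the question and applies the same steps. The Series name "count" is an assumption added here (it is not in the answer above) so that the helper column can be dropped by name rather than by the default label 0.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id": ["hey", "hey", "ok", "ok"],
    "date": ["Q1 23", "Q2 23", "Q3 23", "Q4 23"],
    "dd": [2, 2, 2, 2],
    "aa": [1, 1, 1, 3],
})

# One count per (id, date, type); "count" is a name chosen for this sketch
s = (
    df.set_index(["id", "date"])
      .rename_axis(columns="type")
      .stack()
      .rename("count")
)

# Repeat each entry by its own count, then drop the count column
df1 = s.repeat(s).reset_index().drop(columns="count")
print(df1)
```

Because stack walks the columns row by row, the dd rows for each id and date come before the aa rows, which matches the order in the desired output.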

CodePudding user response:

Use melt to reshape to long format, then repeat each row by its count:

out = df.melt(['id', 'date'], var_name='type')
out = out.reindex(out.index.repeat(out.pop('value'))).reset_index(drop=True)

Output:

>>> out
     id   date type
0   hey  Q1 23   dd
1   hey  Q1 23   dd
2   hey  Q2 23   dd
3   hey  Q2 23   dd
4    ok  Q3 23   dd
5    ok  Q3 23   dd
6    ok  Q4 23   dd
7    ok  Q4 23   dd
8   hey  Q1 23   aa
9   hey  Q2 23   aa
10   ok  Q3 23   aa
11   ok  Q4 23   aa
12   ok  Q4 23   aa
13   ok  Q4 23   aa
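
Note that this output is grouped by type (all dd rows, then all aa rows), whereas the desired output is grouped by the original row. If that order matters, one possible variation is sketched below. It assumes a pandas version where melt accepts ignore_index (1.1 or later), and the intermediate name counts is chosen here for illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id": ["hey", "hey", "ok", "ok"],
    "date": ["Q1 23", "Q2 23", "Q3 23", "Q4 23"],
    "dd": [2, 2, 2, 2],
    "aa": [1, 1, 1, 3],
})

# Keep the original row labels so melted rows can be regrouped by source row
out = df.melt(["id", "date"], var_name="type", ignore_index=False)

# mergesort is stable, so dd stays ahead of aa within each source row
out = out.sort_index(kind="mergesort").reset_index(drop=True)

# Pull out the counts, then repeat each row that many times
counts = out.pop("value")
out = out.loc[out.index.repeat(counts)].reset_index(drop=True)
print(out)
```

The index is reset before repeating so that every label is unique when .loc selects the repeated rows.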