I have time series data or and one group has multiple rows. So, I need to transform that multiple rows into a single row for a group in python. I have this dummy dataset.
data = [
['A','test1',1,2,3,4],
['A','test1',5,6,7,8],
['A','test1',9,10,11,12],
['B','test2',13,14,15,16],
['B','test2',17,18,19,20],
['B','test2',21,22,23,24],
['C','test3',29,30,31,32],
['C','test3',33,34,35,36],
['C','test3',37,38,39,40],
]
df = pd.DataFrame(data, columns=['Name', 'cate','num1', 'num2', 'num3', 'num4'])
df
And I want to change to this format in Python. Can somebody help me to change multiple rows into single rows?
CodePudding user response:
You can do pivot
after cumcount
s = df.assign(key = df.groupby('Name').cumcount() 1).pivot(index = ['Name','cate'],columns = 'key')
s.columns = s.columns.map('{0[0]}{0[1]}'.format)
s.reset_index(inplace=True)
s
Out[266]:
Name cate num11 num12 num13 ... num32 num33 num41 num42 num43
0 A test1 1 5 9 ... 7 11 4 8 12
1 B test2 13 17 21 ... 19 23 16 20 24
2 C test3 29 33 37 ... 35 39 32 36 40
CodePudding user response:
out = df.groupby(['Name', 'cate']).agg(list).sum(axis=1).apply(pd.Series)
print(out)
Output:
0 1 2 3 4 5 6 7 8 9 10 11
Name cate
A test1 1 5 9 2 6 10 3 7 11 4 8 12
B test2 13 17 21 14 18 22 15 19 23 16 20 24
C test3 29 33 37 30 34 38 31 35 39 32 36 40