Home > Blockchain >  how to convert multiple rows into single column for a time series data?
how to convert multiple rows into single column for a time series data?

Time:08-03

I have time series data or and one group has multiple rows. So, I need to transform that multiple rows into a single row for a group in python. I have this dummy dataset.

data = [
  ['A','test1',1,2,3,4],
  ['A','test1',5,6,7,8],
  ['A','test1',9,10,11,12],
  ['B','test2',13,14,15,16],
  ['B','test2',17,18,19,20],
  ['B','test2',21,22,23,24],
  ['C','test3',29,30,31,32],
  ['C','test3',33,34,35,36],
  ['C','test3',37,38,39,40],

]

df = pd.DataFrame(data, columns=['Name', 'cate','num1', 'num2', 'num3', 'num4'])
df

And I want to change to this format in Python. Can somebody help me to change multiple rows into single rows? enter image description here

CodePudding user response:

You can do pivot after cumcount

s = df.assign(key = df.groupby('Name').cumcount() 1).pivot(index = ['Name','cate'],columns = 'key')
s.columns = s.columns.map('{0[0]}{0[1]}'.format)
s.reset_index(inplace=True)
s
Out[266]: 
  Name   cate  num11  num12  num13  ...  num32  num33  num41  num42  num43
0    A  test1      1      5      9  ...      7     11      4      8     12
1    B  test2     13     17     21  ...     19     23     16     20     24
2    C  test3     29     33     37  ...     35     39     32     36     40

CodePudding user response:

out = df.groupby(['Name', 'cate']).agg(list).sum(axis=1).apply(pd.Series)
print(out)

Output:

            0   1   2   3   4   5   6   7   8   9   10  11
Name cate
A    test1   1   5   9   2   6  10   3   7  11   4   8  12
B    test2  13  17  21  14  18  22  15  19  23  16  20  24
C    test3  29  33  37  30  34  38  31  35  39  32  36  40
  • Related