How to split a row at every nth column and stack it below in pandas

I have a single-column DataFrame with sorted values:

Gene Symbol
AAAS
ABCC8
ABCD4
ABL1
ACAA1
ACADM
ACADS
ACD
ACO2
ACOX1
ACP5
ACSL4
ACTB
ACTG1
ADA
ADA2
ADAM17
ADAMTS13
ADAMTS3
ADAR
ADCY1
ADCY2
ADCY3
ADCY4
ADCY5
ADCY6
ADCY7
ADCY8
ADCY9
ADIPOQ

I want to rearrange this into a 10-row table.

So far I have used the code below:

(pd.DataFrame(newDf.groupby(newDf.index // 10)['Gene Symbol'].apply(list).values.tolist()).fillna(''))

But the result does not look like the table I want.

CodePudding user response：

Try this if the number of rows is a multiple of 10 (it produces a 10-row array):

np.array(df["Gene Symbol"]).reshape(10,int(df.shape[0]/10))
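
The orientation depends on the order of the arguments to reshape. Here is a minimal sketch of both layouts, assuming 30 placeholder labels (G00 to G29, invented for illustration) in place of the real gene symbols; passing -1 lets NumPy work out the other dimension:

```python
import numpy as np
import pandas as pd

# Hypothetical sample: 30 placeholder labels standing in for the gene symbols.
df = pd.DataFrame({"Gene Symbol": [f"G{i:02d}" for i in range(30)]})
values = df["Gene Symbol"].to_numpy()

# 10 rows: consecutive values fill each row left to right (3 per row).
ten_rows = pd.DataFrame(values.reshape(10, -1))

# 10 columns: each row holds 10 consecutive values.
ten_cols = pd.DataFrame(values.reshape(-1, 10))

print(ten_rows.shape)  # (10, 3)
print(ten_cols.shape)  # (3, 10)
```

Both calls raise a ValueError if the number of values is not a multiple of 10, which is why padding is needed in the general case.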

Otherwise, try this, which first pads the data to a multiple of 10 and then splits it as required:

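import pandas as pd
import numpy as np
df = pd.DataFrame({"Gene Symbol":range(1,36)})
# Pad with NaN up to the next multiple of 10 (adds nothing if already a multiple)
lst = list(df["Gene Symbol"]) + [np.nan]*(-df.shape[0] % 10)
# Reshape into rows of 10 values each
df1 = pd.DataFrame(np.array(lst).reshape(int(len(lst)/10),10))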

# Output
      0     1     2     3     4     5     6     7     8     9
0   1.0   2.0   3.0   4.0   5.0   6.0   7.0   8.0   9.0  10.0
1  11.0  12.0  13.0  14.0  15.0  16.0  17.0  18.0  19.0  20.0
2  21.0  22.0  23.0  24.0  25.0  26.0  27.0  28.0  29.0  30.0
3  31.0  32.0  33.0  34.0  35.0   NaN   NaN   NaN   NaN   NaN
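
The pad-then-reshape idea can be wrapped in a small helper. This is only a sketch: the name wrap_column and its width parameter are made up here, and it uses an object array so that integers or strings are not converted to floats by the NaN padding.

```python
import numpy as np
import pandas as pd

def wrap_column(series, width=10):
    """Lay out a Series row by row, `width` values per row, padding with NaN."""
    values = series.to_numpy(dtype=object)
    pad = -len(values) % width  # 0 when the length is already a multiple of width
    padded = np.concatenate([values, np.full(pad, np.nan, dtype=object)])
    return pd.DataFrame(padded.reshape(-1, width))

df = pd.DataFrame({"Gene Symbol": range(1, 36)})
wrapped = wrap_column(df["Gene Symbol"])
print(wrapped.shape)  # (4, 10)
```

With 35 values the last row holds 31 to 35 followed by five NaN cells, the same layout as the output above.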

Hope this helps.

CodePudding user response：

Let's take a few steps using floor division and cumcount with .unstack (this assumes the default 0, 1, 2, ... integer index):

df = df.set_index(df.index // 10)

df1 = df.set_index(df.groupby(level=0).cumcount(),append=True).unstack(1)

print(df1)
  Gene Symbol
            0      1      2      3      4      5       6         7        8       9
0        AAAS  ABCC8  ABCD4   ABL1  ACAA1  ACADM   ACADS       ACD     ACO2   ACOX1
1        ACP5  ACSL4   ACTB  ACTG1    ADA   ADA2  ADAM17  ADAMTS13  ADAMTS3    ADAR
2       ADCY1  ADCY2  ADCY3  ADCY4  ADCY5  ADCY6   ADCY7     ADCY8    ADCY9  ADIPOQ
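
To see what each step contributes, here is a small sketch on the first 12 symbols from the question. The variable names row, col, width and wide are chosen for illustration only. Because 12 is not a multiple of 10, it also shows that this approach fills the incomplete last row with NaN without any explicit padding:

```python
import pandas as pd

symbols = ["AAAS", "ABCC8", "ABCD4", "ABL1", "ACAA1", "ACADM",
           "ACADS", "ACD", "ACO2", "ACOX1", "ACP5", "ACSL4"]
df = pd.DataFrame({"Gene Symbol": symbols})
width = 10  # values per output row

row = df.index // width           # output row of each value: ten 0s, then two 1s
col = df.groupby(row).cumcount()  # position within that row: 0..9, then 0..1

# Index by (row, col), then move the inner level into the columns.
wide = df.set_index([row, col])["Gene Symbol"].unstack()
print(wide.shape)  # (2, 10)
```

Selecting the column before unstacking gives plain 0 to 9 column labels instead of the two-level header shown in the printed output above.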