Horizontal split of Pandas DataFrame to append the content as a column via Numpy

Time:05-19

This is a beginner question. I want to split a pandas DataFrame horizontally every n rows and then insert each chunk as separate columns. E.g. n=2:

a b      to        a b c d  e  f
1 11               1 3 5 11 13 15
2 12               2 4 6 12 14 16
3 13
4 14
5 15
6 16

Since I could not get this to work directly with pandas, I am now trying it with NumPy and would like to convert the result back into a DataFrame afterwards.

import numpy as np
import pandas as pd

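data = [1,2,3,4,5,6],[11,12,13,14,15,16]
df = pd.DataFrame(data=data)
df = df.T  # transpose to 6 rows x 2 columns

df_np = df.to_numpy()  # df is already a DataFrame, so no extra wrapping is needed

# split the rows into 3 chunks of 2 rows each
df_np_new = np.array_split(df_np, 3, axis=0)
df_np_new2 = np.array(df_np_new)  # stack the chunks into one array of shape (3, 2, 2)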

Now the output is a 3D array of shape (3, 2, 2), and I can't manage to get it into the structure above with the reshape function so that I can convert it back to a DataFrame.

Do you have any tips on how to do this or is the approach itself not good to achieve this goal? Thanks a lot!

CodePudding user response:

This should work:

import numpy as np
import pandas as pd

data = [1,2,3,4,5,6],[11,12,13,14,15,16]
df = pd.DataFrame(data=data)
df = df.T
# reshape the underlying numpy array of df to (3, 2, 2), then transpose it to shape (2, 2, 3)
# horizontally stack the 3D array to get a 2D array of shape (2, 6)
pd.DataFrame(np.hstack(df.values.reshape(-1,2,2).T), columns=[*'abcdef'])
   a  b  c   d   e   f
0  1  3  5  11  13  15
1  2  4  6  12  14  16
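
The one-liner above hard-codes n=2 and a frame of 6 rows and 2 columns. As a rough sketch of how the same idea might be generalized, the helper below (the name split_rows_to_columns and its parameters are made up for illustration, not part of pandas or NumPy) reshapes to (chunks, n, cols), reorders the axes with transpose, and flattens to 2D. It assumes the number of rows is an exact multiple of n.

```python
import numpy as np
import pandas as pd


def split_rows_to_columns(df, n):
    """Hypothetical helper: cut df into chunks of n rows and lay the
    chunks side by side, grouped by original column."""
    values = df.to_numpy()
    rows, cols = values.shape
    if rows % n != 0:
        # Assumption of this sketch: no padding for a ragged last chunk.
        raise ValueError("number of rows must be a multiple of n")
    chunks = rows // n
    # (chunks, n, cols): axis 0 is the chunk, axis 1 the row inside the chunk.
    cube = values.reshape(chunks, n, cols)
    # Reorder to (n, cols, chunks) so that flattening the last two axes
    # puts all chunks of column 0 first, then all chunks of column 1, etc.
    wide = cube.transpose(1, 2, 0).reshape(n, cols * chunks)
    return pd.DataFrame(wide)


df = pd.DataFrame({"a": [1, 2, 3, 4, 5, 6], "b": [11, 12, 13, 14, 15, 16]})
wide = split_rows_to_columns(df, 2)
wide.columns = list("abcdef")
print(wide)
```

With n=2 this should reproduce the table shown in the answer. A plain reshape alone cannot do this because it never changes the order of the elements in memory; the transpose step is what moves the chunk axis next to the column axis.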