This is a beginner question. I want to split a pandas DataFrame horizontally every n rows and then place each chunk's data in separate columns. For example, with n=2:
Input:
a b
1 11
2 12
3 13
4 14
5 15
6 16
Desired output:
a b c d e f
1 3 5 11 13 15
2 4 6 12 14 16
Since I could not get this to work directly with pandas, I am now trying it with NumPy and would like to convert the result back into a pandas DataFrame afterwards.
import numpy as np
import pandas as pd
data = [1, 2, 3, 4, 5, 6], [11, 12, 13, 14, 15, 16]
df = pd.DataFrame(data=data)
df = df.T  # shape (6, 2)
df_np = df.to_numpy()
df_np_new = np.array_split(df_np, 3, axis=0)  # list of three (2, 2) arrays
df_np_new2 = np.array(df_np_new)  # shape (3, 2, 2)
Now the output is a 3D array of shape (3, 2, 2), and I can't manage to reshape it into the structure above so that I can convert it back to a DataFrame.
Do you have any tips on how to do this, or is the approach itself not a good way to achieve this goal? Thanks a lot!
Answer:
This should work:
import numpy as np
import pandas as pd
data = [1,2,3,4,5,6],[11,12,13,14,15,16]
df = pd.DataFrame(data=data)
df = df.T
# reshape the underlying (6, 2) numpy array to (3, 2, 2), then transpose it to shape (2, 2, 3)
# np.hstack then joins the two (2, 3) blocks side by side into a single (2, 6) 2D array
pd.DataFrame(np.hstack(df.values.reshape(-1,2,2).T), columns=[*'abcdef'])
a b c d e f
0 1 3 5 11 13 15
1 2 4 6 12 14 16
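The one-liner above hard-codes n=2 and two columns. As a rough sketch of how the same reshape-and-transpose idea might be generalized to any n, something like the following could work. The function name `split_rows_to_columns` and the `a_0, a_1, ...` column naming are made up for illustration and are not part of pandas or of the original answer; the sketch also assumes the number of rows is an exact multiple of n.

```python
import numpy as np
import pandas as pd


def split_rows_to_columns(df, n):
    """Split df into chunks of n rows and lay the chunks side by side.

    Hypothetical helper: the output columns are grouped by original
    column, i.e. all chunks of the first column come first.
    """
    rows, cols = df.shape
    if rows % n != 0:
        # Assumption of this sketch: no partial last chunk.
        raise ValueError("number of rows must be a multiple of n")
    chunks = rows // n

    # (rows, cols) -> (chunks, n, cols): one (n, cols) block per chunk
    blocks = df.to_numpy().reshape(chunks, n, cols)
    # (chunks, n, cols) -> (n, cols, chunks) -> (n, cols * chunks)
    wide = blocks.transpose(1, 2, 0).reshape(n, cols * chunks)

    # Illustrative naming scheme: "<original column>_<chunk index>"
    names = [f"{col}_{i}" for col in df.columns for i in range(chunks)]
    return pd.DataFrame(wide, columns=names)


df = pd.DataFrame({"a": [1, 2, 3, 4, 5, 6], "b": [11, 12, 13, 14, 15, 16]})
result = split_rows_to_columns(df, 2)
print(result)
```

If you prefer the a to f labels from the question, you can assign them afterwards with `result.columns = list("abcdef")`.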