I have a Dataframe that is composed by 3760 rows. I want to split it in 10 parts of equal lenght and then use each new array as a column for a new DataFrame.
A way that I found to do this is:
alfa = np.array_split(dff, 10)
caa = pd.concat([alfa[0].reset_index(drop=True), alfa[1].reset_index(drop=True), alfa[2].reset_index(drop=True), alfa[3].reset_index(drop=True),
alfa[4].reset_index(drop=True), alfa[5].reset_index(drop=True), alfa[6].reset_index(drop=True), alfa[7].reset_index(drop=True),
alfa[8].reset_index(drop=True), alfa[9].reset_index(drop=True)], axis=1)
Not very cool, not very efficient.
Then I tried
teta = pd.concat(np.array_split(dff, 10), axis=1, ignore_index=True)
But it doesn't work as I wanted since it gives me this:
I assume that is because the ignore_index
works on the axis 1
Is there a better way to do it?
CodePudding user response:
You can use list comprehension to concat your columns. This code expects your columns name is init_col
:
chunks = 10
cols = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J']
out = pd.concat(
[np.array_split(dff, chunks)[i]
.reset_index(drop=True)
.rename(columns={"init_col": cols[i]})
for i in range(chunks)],
axis=1
)
CodePudding user response:
It seems the original DataFrame seems to be just an array? In that case, perhaps you could use numpy.reshape
:
new_df = pd.DataFrame(dff.to_numpy().reshape(10,-1).T, columns=dff.columns.tolist()*10)