I have a pandas DataFrame with hundreds of columns. I want to concatenate all the columns into a 1D array. For instance, suppose the DataFrame looks like this:
pd df:
set1 set2 set3 ... set175
3 5 6 ... 9
4 8 0 ... 22
. . . ... .
. . . ... .
And, I want something like this after the concatenation:
concatenated to 1D array:
[3,4,...,5,8,...,6,0,...,9,22]
I may also want to concatenate only some of the columns, say columns #1 to #3:
concatenated columns 1-3:
[3,4,...,5,8,...,6,0]
What is a convenient way to do this? Should I convert the pd df into a numpy array?
So far, I have only found solutions that concatenate a pandas DataFrame by naming the column headers, which is not practical for hundreds of columns. In another approach, columns of multiple DataFrames are concatenated using pd.concat(). But I want to concatenate the columns of a single DataFrame. This issue is a minor part of a larger processing pipeline I am working on, so I would appreciate a straightforward answer.
Answer:
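# transpose so each original column becomes a row, then stack and take the values;
# this walks through the data column by column
# choosing all columns and rows
df.T.stack().values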
array([3, 4, 5, 8, 6, 0, '...', '...', 9, 22], dtype=object)
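# choosing only columns 1 to 3: select them by position BEFORE transposing
# (after .T, iloc[:, ...] would slice the original rows, not the columns)
df.iloc[:, 0:3].T.stack().values
array([3, 4, 5, 8, 6, 0])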
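On the question of converting to NumPy: that also works and avoids the transpose and stack. Below is a minimal sketch, assuming a small all-integer frame (the four-column `df` here is a hypothetical stand-in for the 175-column one). `ravel(order="F")` flattens in column-major order, so each column's values stay together, and `iloc` picks columns by position so no header names are needed.

```python
import pandas as pd

# Hypothetical small frame standing in for the 175-column one in the question.
df = pd.DataFrame({
    "set1": [3, 4],
    "set2": [5, 8],
    "set3": [6, 0],
    "set4": [9, 22],
})

# Flatten column by column: order="F" walks down each column before moving on.
all_values = df.to_numpy().ravel(order="F")
print(all_values.tolist())   # [3, 4, 5, 8, 6, 0, 9, 22]

# Only the first three columns, selected by position rather than by name.
first_three = df.iloc[:, 0:3].to_numpy().ravel(order="F")
print(first_three.tolist())  # [3, 4, 5, 8, 6, 0]
```

One difference to be aware of: depending on the pandas version, `stack()` may drop missing values by default, whereas `to_numpy().ravel()` keeps every cell, including NaN.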