How to concatenate values from many columns into one column when one doesn't know the number of-CodePudding

My application saves an indeterminate number of values in different columns. As a results, I have a data frame with a certain number of columns at the beginning but then from a particular column (that I know) I will have an uncertain number of columns saving same data

Example:

known1  known2 know3 unknow1 unknow2 unknow3 ...
1       3      3     data    data2   data3

The result I would like to get should be something like this

known1  known2 know3 all_unknow 
1       3      3     data,data2,data3

How can I do this when I don't know the number of unknown columns but what I do know is this will occur (in this example) from the 4th column.

CodePudding user response：

IIUC, use filter to select the columns by keyword:

cols = list(df.filter(like='unknow'))
# ['unknow1', 'unknow2', 'unknow3']

df['all_unknow'] = df[cols].apply(','.join, axis=1)
df = df.drop(columns=cols)

or take all columns from the 4th one:

cols = df.columns[3:]

df['all_unknow'] = df[cols].apply(','.join, axis=1)
df = df.drop(columns=cols)

output:

   known1  known2  know3        all_unknow
0       1       3      3  data,data2,data3

CodePudding user response：

df['all_unknown'] = df.iloc[:, 3:].apply(','.join, axis=1)

if you also want to drop all columns after the 4th:

cols = df.columns[3:-1]
df.drop(cols, axis=1)

the -1 is to avoid dropping the new column