Home > Back-end >  How to concatenate values from many columns into one column when one doesn't know the number of
How to concatenate values from many columns into one column when one doesn't know the number of

Time:03-16

My application saves an indeterminate number of values in different columns. As a results, I have a data frame with a certain number of columns at the beginning but then from a particular column (that I know) I will have an uncertain number of columns saving same data

Example:

known1  known2 know3 unknow1 unknow2 unknow3 ...
1       3      3     data    data2   data3

The result I would like to get should be something like this

known1  known2 know3 all_unknow 
1       3      3     data,data2,data3

How can I do this when I don't know the number of unknown columns but what I do know is this will occur (in this example) from the 4th column.

CodePudding user response:

IIUC, use filter to select the columns by keyword:

cols = list(df.filter(like='unknow'))
# ['unknow1', 'unknow2', 'unknow3']

df['all_unknow'] = df[cols].apply(','.join, axis=1)
df = df.drop(columns=cols)

or take all columns from the 4th one:

cols = df.columns[3:]

df['all_unknow'] = df[cols].apply(','.join, axis=1)
df = df.drop(columns=cols)

output:

   known1  known2  know3        all_unknow
0       1       3      3  data,data2,data3

CodePudding user response:

df['all_unknown'] = df.iloc[:, 3:].apply(','.join, axis=1)

if you also want to drop all columns after the 4th:

cols = df.columns[3:-1]
df.drop(cols, axis=1)

the -1 is to avoid dropping the new column

  • Related