Concat values on dataframe columns excluding NaN's


I have a dataframe with n store columns; here I'm showing only the first two:

     ref_id  store_0     store_1
0       100        c           b
1       300        d         NaN

I want a way to concat only the non-NaN values from the store columns into a new column, adding a comma between each value, and finally drop those columns. The desired output is:

     ref_id  stores  
0       100     c,b
1       300       d
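For reference, a frame matching the sample above can be built like this (column names and dtypes assumed from the printout):

```python
import pandas as pd
import numpy as np

# Reproduces the example frame shown above
df = pd.DataFrame({
    'ref_id': [100, 300],
    'store_0': ['c', 'd'],
    'store_1': ['b', np.nan],
})
```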

Right now I've tried df['stores'] = df['store_0'] + ',' + df['store_1'], with this result:

     ref_id  store_0  store_1  stores    
0       100        c        b     c,b
1       300        d      NaN     NaN

CodePudding user response:

Try with

df['store'] = df.filter(like='store').apply(lambda x: ','.join(x[x == x]), axis=1)
df
Out[60]: 
   ref_id store_0 store_1 store
0     100       c       b   c,b
1     300       d     NaN     d
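The `x == x` comparison is a compact NaN filter: NaN is not equal to itself, so the mask is False exactly at the missing positions. A minimal sketch of the trick on its own:

```python
import pandas as pd
import numpy as np

row = pd.Series(['d', np.nan])
mask = row == row           # NaN != NaN, so NaN positions become False
joined = ','.join(row[mask])  # only the non-NaN values are joined
print(joined)
```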

CodePudding user response:

You can use:

cols = df.filter(like='store_').columns

df2 = (df
 .drop(columns=cols)
 .assign(stores=df[cols].agg(lambda s: s.dropna()
                                        .str.cat(sep=','), 
                             axis=1))
 )

Or, for in-place modification:

cols = df.filter(like='store_').columns

df['stores'] = df[cols].agg(lambda s: s.dropna().str.cat(sep=','), axis=1)
df.drop(columns=cols, inplace=True)

Output:

   ref_id stores
0     100    c,b
1     300      d

CodePudding user response:

You can try

df_ = df.filter(like='store')
df = (df.assign(store=df_.apply(lambda row : row.str.cat(sep=','), axis=1))
      .drop(df_.columns, axis=1))
print(df)

   ref_id store
0     100   c,b
1     300     d
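This works because `Series.str.cat` with only a `sep` argument skips missing values by default (when `na_rep` is left as None), so no explicit dropna is needed here. A quick check:

```python
import pandas as pd
import numpy as np

s = pd.Series(['d', np.nan])
# NaN entries are omitted from the concatenation when na_rep is None
result = s.str.cat(sep=',')
print(result)
```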