Home > front end >  Remove Word not in list Pandas Column
Remove Word not in list Pandas Column

Time:05-26

Below is the DF,

df = pd.DataFrame({'cd1' : ['PFE1', 'PFE25', np.nan, np.nan], 
                   'cd2' : [np.nan, 'PFE28', 'PFE23', 'PFE14'], 
                   'cd3' : ['PFE15', 'PFE2', 'PFE83', np.nan], 
                   'cd4' : ['PFE25', np.nan, 'PFE39', 'PFE47'], 
                   'cd5' : [np.nan, 'PFE21', 'PFE53', 'PFE15']})

df
        cd1     cd2     cd3     cd4     cd5
    0   PFE1    NaN     PFE15   PFE25   NaN
    1   PFE25   PFE28   PFE2    NaN     PFE21
    2   NaN     PFE23   PFE83   PFE39   PFE53
    3   NaN     PFE14   NaN     PFE47   PFE15

There 2 tasks to perform which I have some issues

  1. Combine Columns. As you can see below even wile filling NaN with nothing, the ',' is adding (not needed)
    df = df.fillna('')
    df['combined'] = df.values.tolist()
    df

Output

    cd1     cd2     cd3      cd4    cd5       combined
    PFE1            PFE15    PFE25            ['PFE1', , 'PFE15', 'PFE25', ]
    PFE25   PFE28   PFE2            PFE21     ['PFE25', 'PFE28', 'PFE2', , 'PFE21']
         PFE23  PFE83    PFE39  PFE53     [, 'PFE23', 'PFE83', 'PFE39', 'PFE53']
      PFE14          PFE47   PFE15    [, 'PFE14', , 'PFE47', 'PFE15']
  1. Remove words from the combined column which are not in spec_list
    spec_list = ['PFE15', 'PFE25']

Expected output

    combined                                   new_col
    ['PFE1', , 'PFE15', 'PFE25', ]             ['PFE15', 'PFE25']
    ['PFE25', 'PFE28', 'PFE2', , 'PFE21']      ['PFE25']
    [, 'PFE23', 'PFE83', 'PFE39', 'PFE53']     
    [, 'PFE14', , 'PFE47', 'PFE15']            ['PFE15']

CodePudding user response:

Try this using agg with axis=1:

df['combined'] = df.agg(lambda x: list(x.dropna()), axis=1)

spec_list = ['PFE15', 'PFE25']

df['new_col'] = df.agg(lambda x: list(x[x.isin(spec_list)]), axis=1)

Output:

     cd1    cd2    cd3    cd4    cd5                      combined         new_col
0   PFE1    NaN  PFE15  PFE25    NaN          [PFE1, PFE15, PFE25]  [PFE15, PFE25]
1  PFE25  PFE28   PFE2    NaN  PFE21   [PFE25, PFE28, PFE2, PFE21]         [PFE25]
2    NaN  PFE23  PFE83  PFE39  PFE53  [PFE23, PFE83, PFE39, PFE53]              []
3    NaN  PFE14    NaN  PFE47  PFE15         [PFE14, PFE47, PFE15]         [PFE15]
  • Related