Drop specific DataFrame columns if absent from a list, keeping every other column

Time:08-18

I have a dataframe df that looks like this:

    FT1       FT2      ...   FT32      Style       Rank    ...
0   0.02      0.01           0.01      Black       7
1   0.01      0.04           0.01      Death       2
2   0.01      0.01           0.01      Hardcore    6
3   0.04      0.01           0.02      Grindcore   1
4   0.02      0.02           0.02      Deathcore   4
...

I want to remove every "FT" column whose index is not in a list, for example the list FT_index=[1,4,5,7]. In that case, the dataframe should end up looking like this:

    FT1       FT4      FT5      FT7      Style       Rank  ...
0   0.02      0.01     0.01     0.01      Black       7
1   0.01      0.04     0.06     0.01      Death       2
2   0.01      0.01     0.05     0.01      Hardcore    6
3   0.04      0.01     0.01     0.02      Grindcore   1
4   0.02      0.02     0.01     0.02      Deathcore   4
...

As indicated by the "...", I have many other columns in this dataframe. The number of "FT" columns (here 32) varies. I know I can remove columns with df.drop('FT' + str(ft_index), inplace=True, axis=1) in a loop, with ft_index an element of FT_index, but this drops the columns that I want to keep rather than the others. Does anyone have an idea how to do this efficiently?

CodePudding user response:

You can extract the FT columns, compare them to the column names built from your list, and drop the ones that are not in it:

FT_index=[1,4,5,7]

cols = df.filter(like='FT').columns
df2 = df.drop(columns=cols[~cols.isin([f'FT{i}' for i in FT_index])])

Output (for a sample frame whose only FT columns are FT1, FT2 and FT32):

    FT1      Style  Rank
0  0.02      Black     7
1  0.01      Death     2
2  0.01   Hardcore     6
3  0.04  Grindcore     1
4  0.02  Deathcore     4
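
To reproduce this end to end, here is a minimal sketch. It assumes a small sample frame built from the columns visible in the question (FT1, FT2, FT32, Style, Rank); the real dataframe has more columns, which are not shown and are not guessed at here.

```python
import pandas as pd

# Sample frame: only the columns visible in the question (an assumption;
# the real dataframe has many more).
df = pd.DataFrame({
    "FT1": [0.02, 0.01, 0.01, 0.04, 0.02],
    "FT2": [0.01, 0.04, 0.01, 0.01, 0.02],
    "FT32": [0.01, 0.01, 0.01, 0.02, 0.02],
    "Style": ["Black", "Death", "Hardcore", "Grindcore", "Deathcore"],
    "Rank": [7, 2, 6, 1, 4],
})
FT_index = [1, 4, 5, 7]

# All columns whose name contains "FT"
cols = df.filter(like="FT").columns
# Names of the FT columns to keep
keep = [f"FT{i}" for i in FT_index]
# Drop the FT columns that are not in the keep list; df itself is unchanged
df2 = df.drop(columns=cols[~cols.isin(keep)])

print(df2.columns.tolist())
# ['FT1', 'Style', 'Rank']
```

Names in FT_index that have no matching column (FT4, FT5, FT7 in this sample) are simply ignored, so no error is raised.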

Intermediates:

cols
# Index(['FT1', 'FT2', 'FT32'], dtype='object')

[f'FT{i}' for i in FT_index]
# ['FT1', 'FT4', 'FT5', 'FT7']

cols.isin([f'FT{i}' for i in FT_index])
# array([ True, False, False])

cols[~cols.isin([f'FT{i}' for i in FT_index])]
# Index(['FT2', 'FT32'], dtype='object')
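
One caveat: filter(like='FT') matches any column whose name contains "FT", not only names of the form FT<number>. If the other columns might include such a name, a stricter pattern may be safer. The sketch below is one possible variant, not part of the answer above; the column "SOFT" is hypothetical and only there to show the difference.

```python
import pandas as pd

df = pd.DataFrame({
    "FT1": [0.02, 0.01],
    "FT2": [0.01, 0.04],
    "FT32": [0.01, 0.01],
    "SOFT": [1, 0],            # hypothetical: contains "FT" but is not FT<number>
    "Style": ["Black", "Death"],
})
FT_index = [1, 4, 5, 7]
keep = [f"FT{i}" for i in FT_index]

# True only for names that are exactly "FT" followed by digits
is_ft = df.columns.str.match(r"FT\d+$")
# True for the FT columns listed in FT_index
wanted = df.columns.isin(keep)

# Keep every non-FT column, plus the wanted FT columns
df2 = df.loc[:, ~is_ft | wanted]

print(df2.columns.tolist())
# ['FT1', 'SOFT', 'Style']
```

With filter(like='FT'), "SOFT" would be treated as an FT column and dropped, because it is not in the keep list.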