I have a dataframe df that looks like this:
FT1 FT2 ... FT32 Style Rank ...
0 0.02 0.01 0.01 Black 7
1 0.01 0.04 0.01 Death 2
2 0.01 0.01 0.01 Hardcore 6
3 0.04 0.01 0.02 Grindcore 1
4 0.02 0.02 0.02 Deathcore 4
...
I want to remove every "FT" column whose index is not in a list, for example FT_index=[1,4,5,7]. In that case, I want the dataframe to look like this:
FT1 FT4 FT5 FT7 Style Rank ...
0 0.02 0.01 0.01 0.01 Black 7
1 0.01 0.04 0.06 0.01 Death 2
2 0.01 0.01 0.05 0.01 Hardcore 6
3 0.04 0.01 0.01 0.02 Grindcore 1
4 0.02 0.02 0.01 0.02 Deathcore 4
...
As indicated by the "...", I have many other columns in this dataframe. The number of "FT" columns (here 32) is variable. I know I can remove columns in a loop using df.drop('FT' + str(ft_index), inplace=True, axis=1), with ft_index an element of FT_index, but this drops the columns that I want to keep rather than the others. Does anyone have an idea how to do this efficiently?
CodePudding user response:
You can extract the FT columns, compare them to the list of column names built from FT_index, and drop the ones that do not match:
FT_index=[1,4,5,7]
cols = df.filter(like='FT').columns
df2 = df.drop(columns=cols[~cols.isin([f'FT{i}' for i in FT_index])])
Output (the sample data here only contains FT1, FT2 and FT32, so FT1 is the only FT column kept):
FT1 Style Rank
0 0.02 Black 7
1 0.01 Death 2
2 0.01 Hardcore 6
3 0.04 Grindcore 1
4 0.02 Deathcore 4
Intermediates:
cols
# Index(['FT1', 'FT2', 'FT32'], dtype='object')
[f'FT{i}' for i in FT_index]
# ['FT1', 'FT4', 'FT5', 'FT7']
cols.isin([f'FT{i}' for i in FT_index])
# array([ True, False, False])
cols[~cols.isin([f'FT{i}' for i in FT_index])]
# Index(['FT2', 'FT32'], dtype='object')
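For a case closer to the question, here is a minimal self-contained sketch of the same idea. The sample data (FT1 to FT8 with made-up values) is hypothetical, and the anchored regex is my own variation rather than part of the answer above: filter(like='FT') matches any column whose name contains "FT", so a pattern like ^FT\d+$ may be safer if other column names could contain that substring.

```python
import pandas as pd

# Hypothetical sample data: FT1..FT8 plus two non-FT columns.
df = pd.DataFrame({f"FT{i}": [0.01 * i, 0.02 * i] for i in range(1, 9)})
df["Style"] = ["Black", "Death"]
df["Rank"] = [7, 2]

FT_index = [1, 4, 5, 7]
keep = {f"FT{i}" for i in FT_index}

# Select only columns named exactly "FT<digits>", so an unrelated column
# whose name merely contains "FT" is never considered for dropping.
ft_cols = df.filter(regex=r"^FT\d+$").columns

# Drop the FT columns that are not in the keep set; all other columns stay.
df2 = df.drop(columns=[c for c in ft_cols if c not in keep])

print(df2.columns.tolist())
# ['FT1', 'FT4', 'FT5', 'FT7', 'Style', 'Rank']
```

Because drop returns a new dataframe, df itself is left unchanged, and the remaining columns keep their original order.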