I have columns like 'Name, Gender, Child_1, Child_2, Child_3, Child_4, ...'. Let's say I want to drop the Child columns; then I can use this:
df.loc[:, ~df.columns.str.contains("Child")]
What if I have 'Name, Gender, Child_1, Child_2, Child_3, Child_4, Cousin_1, Cousin_2, Cousin_3, ...'? How can I drop multiple such groups of columns at once?
The exact case is "how to drop multiple columns by providing a list of substrings". I have an array ['Child','Cousin'...], and that alone should be enough to drop every column whose name contains Child, Cousin, or any other element I put in the array.
I think I could apply the df.loc logic once per substring in a loop, but I suspect that's costly, so I'm looking for other options.
CodePudding user response:
Use join with | (the regex "or") to build a single pattern that tests the column names against all values of the list at once:
L = ['Child','Cousin'...]
df.loc[:, ~df.columns.str.contains('|'.join(L))]
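One caveat with the joined pattern: str.contains treats it as a regex by default, so a substring containing characters such as . or ( would not be matched literally. Below is a minimal sketch that escapes each substring first. The helper name drop_columns_containing and the sample data are made up for illustration and are not part of pandas or of the original question.

```python
import re

import pandas as pd


def drop_columns_containing(df, substrings):
    """Return df without the columns whose name contains any of the substrings."""
    # An empty list would give an empty pattern, which matches every column name,
    # so return an unchanged copy instead of dropping everything.
    if not substrings:
        return df.copy()
    # re.escape keeps regex metacharacters (., (, +, ...) literal in the pattern.
    pattern = "|".join(re.escape(s) for s in substrings)
    # Boolean mask over the column index: True where the name should be kept.
    keep = ~df.columns.str.contains(pattern)
    return df.loc[:, keep]


# Hypothetical sample data mirroring the column names in the question.
df = pd.DataFrame(
    [["Ann", "F", "Bob", "Cy", "Dee"]],
    columns=["Name", "Gender", "Child_1", "Child_2", "Cousin_1"],
)

result = drop_columns_containing(df, ["Child", "Cousin"])
print(list(result.columns))  # ['Name', 'Gender']
```

The mask is computed in one pass over the column names, so there is no need to loop over df.loc once per substring.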