Creating a list from a dataframe where the column name contains a particular string

Time:05-23

I referred to the solution in "Find column whose name contains a specific string":

spike_cols = [col for col in df.columns if 'spike' in col]

However, I want to search for multiple strings at once. For example, I tried to match column names containing either 'spike' or 'keyword' with:

spike_cols = [col for col in df.columns if 'spike | keyword' in col]

This doesn't return the list I want. Any pointers on how to proceed?

CodePudding user response:

Since it seems you have a pandas df, try this:

# build a boolean mask marking the column names that contain 'spike' or 'keyword'
# (str.contains treats the pattern as a regex by default, so '|' means "or")
spike_cols = df.columns.str.contains('spike|keyword')
# filter columns
df.loc[:, spike_cols]

If you want the column names themselves, filter the column index using:

df.columns[spike_cols]
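
To get a plain Python list from a list of search words, the approach above can be sketched as follows. This is a minimal example, not the only way to do it: the dataframe and its column names are made up for illustration, and re.escape is my addition so that words containing regex metacharacters (such as '.') are matched literally.

```python
import re

import pandas as pd

# Hypothetical frame; the column names are invented for this example.
df = pd.DataFrame(columns=["spike-2", "hey spike", "keyword_a", "no", "a.b"])

words = ["spike", "keyword"]

# Join the words into one regex alternation: 'spike|keyword'.
# re.escape keeps any regex metacharacters in a word literal.
pattern = "|".join(re.escape(word) for word in words)

# Boolean mask over the column index, then the matching names as a list.
mask = df.columns.str.contains(pattern)
spike_cols = df.columns[mask].tolist()

print(spike_cols)
```

The same mask can be passed to df.loc[:, mask] to select the matching columns instead of just their names.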

CodePudding user response:

Also possible:

# create list for words
words = ['spike', 'keyword']

# store col names in list
cols = list(df.columns)

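# list comprehension returning the positions of columns whose name contains any of the words
# (a plain `cols[i] in words` would only match names exactly equal to a word)
idx_matches = [i for i in range(len(cols)) if any(word in cols[i] for word in words)]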

# access the df cols
df.iloc[:, idx_matches]
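
If only the names are needed, the list comprehension from the question can be extended with any() to test several substrings at once. The sketch below is one possible way to write it. The helper name columns_containing and the sample names are hypothetical, and a plain list stands in for df.columns so the snippet runs without pandas.

```python
def columns_containing(columns, words):
    """Return the names in `columns` that contain any of `words` as a substring."""
    # str(col) guards against non-string labels such as integers.
    return [col for col in columns if any(word in str(col) for word in words)]


# Stand-in for df.columns; these names are invented for the example.
columns = ["spike-2", "hey spike", "keyword_a", "no"]

matches = columns_containing(columns, ["spike", "keyword"])
print(matches)
```

With a real dataframe, columns_containing(df.columns, words) should give the list of matching names, and df[columns_containing(df.columns, words)] the corresponding columns.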