I have a large dataframe with 600 columns.
Approximately 40 of these have the word 'austria' in their names. If I want a new dataframe containing only the Austrian data, is there an easy way to create it based on the column headers?
Any help much appreciated, Thanks
CodePudding user response:
You can use DataFrame.filter:
df2 = df.filter(regex='(?i)austria') # (?i) makes the search case insensitive
Example:
import pandas as pd

# One empty row; only the column names matter here
df = pd.DataFrame(columns=['austria something', 'something austria',
                           'another austria', 'unrelated', 'Austria again'],
                  index=[0])
df.filter(regex='(?i)austria')
output:
   austria something  something austria  another austria  Austria again
0                NaN                NaN              NaN            NaN
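For comparison, here is a small sketch with made-up data (the column names and values below are hypothetical, not from the question). It contrasts the case-insensitive regex above with filter's like= argument, which does a plain, case-sensitive substring match:

```python
import pandas as pd

# Hypothetical sample data, only to illustrate column selection
df = pd.DataFrame({
    "austria_gdp": [1.0, 2.0],
    "Austria_pop": [8.9, 9.0],
    "germany_gdp": [3.0, 4.0],
})

# like= is a plain substring test and is case sensitive
case_sensitive = df.filter(like="austria")

# regex= with the (?i) flag ignores case
case_insensitive = df.filter(regex="(?i)austria")

print(list(case_sensitive.columns))
print(list(case_insensitive.columns))
```

If your headers mix 'austria' and 'Austria', the regex form is probably the safer choice.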
CodePudding user response:
Another way is to use .loc, which accepts a boolean mask along either axis, together with .str.contains on the column names:
df2 = df.loc[:, df.columns.str.contains('austria', case=False)]
output:
   austria something  something austria  another austria  Austria again
0                NaN                NaN              NaN            NaN
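A minimal sketch of the same mask approach on made-up data (the column names and values are hypothetical). Two optional extras here are my own assumptions about what you may want: regex=False treats 'austria' as a literal string, and .copy() makes the new dataframe independent of the original:

```python
import pandas as pd

# Hypothetical sample data, only to illustrate the boolean mask
df = pd.DataFrame({
    "austria_gdp": [1.0, 2.0],
    "Austria_pop": [8.9, 9.0],
    "germany_gdp": [3.0, 4.0],
})

# Boolean array with one entry per column name
mask = df.columns.str.contains("austria", case=False, regex=False)

# Keep all rows, only the masked columns; copy so edits do not touch df
austria_df = df.loc[:, mask].copy()

# Changing the new frame leaves the original untouched
austria_df["austria_gdp"] = 0.0
print(list(austria_df.columns))
```

Keeping the mask in its own variable also lets you reuse it, for example df.loc[:, ~mask] for everything that is not Austrian.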