Select column names in pandas based on multiple prefixes

Time:05-24

I have a large dataframe from which I want to select the columns that start with any of several different prefixes. My current solution is shown below:

import numpy as np
import pandas as pd

df = pd.DataFrame(columns=['flg_1', 'flg_2', 'ab_1', 'ab_2', 'aaa', 'bbb'], data=np.array([1,2,3,4,5,6]).reshape(1,-1))
flg_vars = df.filter(regex='^flg_')
ab_vars = df.filter(regex='^ab_')

result = pd.concat([flg_vars, ab_vars], axis=1)

Is there a more efficient way of doing this? I need to filter my original data based on 8 prefixes, which leads to excessive lines of code.

CodePudding user response:

Use | as the regex OR operator to match all the prefixes in a single pattern:

result = df.filter(regex='^flg_|^ab_')
print(result)
   flg_1  flg_2  ab_1  ab_2
0      1      2     3     4
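
With eight prefixes, writing the pattern by hand gets error-prone, so one option is to keep the prefixes in a list and build the selection from it. The following is a minimal sketch rather than the only way to do it: the `prefixes` list and the variable names are illustrative, and the sample frame is the one from the question. It shows two roughly equivalent approaches, a generated regex passed to `df.filter` and a plain list comprehension using Python's built-in `str.startswith`, which accepts a tuple of prefixes (this assumes all column names are strings).

```python
import re

import numpy as np
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame(
    columns=['flg_1', 'flg_2', 'ab_1', 'ab_2', 'aaa', 'bbb'],
    data=np.array([1, 2, 3, 4, 5, 6]).reshape(1, -1),
)

# Illustrative list; extend it with the remaining prefixes as needed.
prefixes = ['flg_', 'ab_']

# Option 1: build one anchored regex, escaping each prefix so that
# characters such as '.' or '+' are matched literally.
pattern = '^(?:' + '|'.join(re.escape(p) for p in prefixes) + ')'
result_regex = df.filter(regex=pattern)

# Option 2: no regex at all; str.startswith accepts a tuple of prefixes.
selected = [c for c in df.columns if c.startswith(tuple(prefixes))]
result_list = df[selected]

print(result_regex)
```

Both options keep the columns in the order they appear in `df`, which may differ from the order produced by concatenating one filtered frame per prefix.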