Perform an operation on a subset of columns where column name contains string?

Time:08-30

I have two DataFrames, df1 and df2, with the same columns. I want to take their difference, df_new = df1 - df2, but for a few of the columns I also want to apply extra operations.

Something like:

df_new = df1 - df2 # element-wise on all rows and cols
df_new[df_new[columns where column.name.contains("price_")] *= 100 # mult every row for these cols by 100
df_new[df_new[columns where column.name.contains("spread_")] /= df1 # element-wise for those cols

I'm not sure whether something like this is possible, or what the proper functions and syntax would be. Basically, after forming df_new, I just want to go in and conveniently edit some columns, which I identify by a substring of the column name. Thanks.

CodePudding user response:

Build a boolean mask over the column names with str.contains and pass it to loc to select those columns:

df_new.loc[:, df_new.columns.str.contains("price_")] *= 100 # mult every row for these cols by 100
df_new.loc[:, df_new.columns.str.contains("spread_")] /= df1
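
For anyone who wants to try this end to end, here is a minimal sketch. The sample data and the column names (price_a, spread_a, qty) are made up for illustration and are not from the question. It stores the matching column names first and divides only by the matching columns of df1, which keeps the column alignment explicit instead of relying on the in-place operator.

```python
import pandas as pd

# Hypothetical sample data; both frames share the same columns.
df1 = pd.DataFrame({"price_a": [10.0, 20.0], "spread_a": [2.0, 4.0], "qty": [5.0, 6.0]})
df2 = pd.DataFrame({"price_a": [9.0, 18.0], "spread_a": [1.0, 2.0], "qty": [1.0, 1.0]})

# Element-wise difference on all rows and columns.
df_new = df1 - df2

# Collect the column labels whose name contains each substring.
price_cols = df_new.columns[df_new.columns.str.contains("price_")]
spread_cols = df_new.columns[df_new.columns.str.contains("spread_")]

# Multiply the price columns by 100.
df_new[price_cols] = df_new[price_cols] * 100

# Divide the spread columns element-wise by the same columns of df1.
df_new[spread_cols] = df_new[spread_cols] / df1[spread_cols]

print(df_new)
```

Note that str.contains treats its pattern as a regular expression by default; pass regex=False if the substring contains characters such as "." or "(" that should be matched literally.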

CodePudding user response:

You can try:

df_new.loc[:, df_new.columns.str.contains('price_')] *= 100
df_new.loc[:, df_new.columns.str.contains('spread_')] /= df1

Note you may find .str.startswith and .str.endswith helpful as well.
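
A small sketch of why the difference can matter: str.contains matches the substring anywhere in the name, while startswith and endswith only match at one end. The column names below are hypothetical.

```python
import pandas as pd

# Hypothetical column names, chosen so the three masks disagree.
cols = pd.Index(["price_open", "avg_price_close", "spread_bid", "volume"])

# Matches "price_" anywhere in the name.
contains_cols = list(cols[cols.str.contains("price_")])

# Matches only names that begin with "price_".
prefix_cols = list(cols[cols.str.startswith("price_")])

# Matches only names that end with "_close".
suffix_cols = list(cols[cols.str.endswith("_close")])

print(contains_cols)  # ['price_open', 'avg_price_close']
print(prefix_cols)    # ['price_open']
print(suffix_cols)    # ['avg_price_close']
```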

CodePudding user response:

IIUC, you can use df.filter.

df.filter(like="price_")

With like, it returns a new DataFrame containing all columns whose name contains price_ as a substring. For more complex patterns, use the regex parameter instead.
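
Because filter returns a new DataFrame rather than a selection you can assign into, one way to use it for the original task is to take the column labels from its result and assign back through them. This is a sketch with made-up data and column names, not necessarily how the answerer intended it to be used.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"price_a": [1.0, 2.0], "price_b": [3.0, 4.0], "spread_a": [0.5, 0.25]})

# like= does a plain substring match on the column names.
price_part = df.filter(like="price_")

# regex= allows more specific patterns, here names that are exactly
# "price_a" or "spread_a".
a_part = df.filter(regex=r"^(price|spread)_a$")

# filter returned a new object, so modify the original through the labels.
price_cols = price_part.columns
df[price_cols] = df[price_cols] * 100

print(list(a_part.columns))  # ['price_a', 'spread_a']
print(df)
```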
