Is there a way to make this more concise and less repetitive? Or is hard-coding the best option here?

Time:08-17

df.loc[df['Campaign'].str.contains('Bathroom'), 'Product'] = 'Bathroom'
df.loc[df['Campaign'].str.contains('Roof'), 'Product'] = 'Roofing'
df.loc[df['Campaign'].str.contains('Siding'), 'Product'] = 'Siding'
df.loc[df['Campaign'].str.contains('Window'), 'Product'] = 'Window'
df.loc[df['Campaign'].str.contains('Water Treatment'), 'Product'] = 'Water Treatment'
df.loc[df['Campaign'].str.contains('Water Heater'), 'Product'] = 'Water Heater'
df.loc[df['Campaign'].str.contains('Pump'), 'Product'] = 'Pump'
df.loc[df['Campaign'].str.contains('Granulator'), 'Product'] = 'Granulator'
df.loc[df['Campaign'].str.contains('Plumbing'), 'Product'] = 'Plumbing'
df.loc[df['Campaign'].str.contains('Door'), 'Product'] = 'Door'

I feel that checking each keyword one by one like this is unprofessional. Is there a recommended alternative?

CodePudding user response:

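Use a list. This works when the search keyword and the product label are identical; a pair like 'Roof' -> 'Roofing' needs a dict of keyword: label pairs instead.

items = ['Bathroom', 'Siding', 'Window']  # etc, etc

# Later items overwrite earlier ones if a campaign contains several keywords
for item in items:
    df.loc[df['Campaign'].str.contains(item), 'Product'] = item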
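
For illustration, here is a minimal sketch of the dict variant on made-up data. The campaign strings and the name `keyword_to_product` are assumptions, not taken from the question. It also shows that when a campaign contains several keywords, the last matching entry in the dict wins.

```python
import pandas as pd

# Made-up sample data, for illustration only
df = pd.DataFrame({'Campaign': ['Spring Roof Promo',
                                'Bathroom Remodel 2021',
                                'Window and Door Sale',
                                'Brand Awareness']})

# Hypothetical mapping: search keyword -> product label
keyword_to_product = {'Bathroom': 'Bathroom', 'Roof': 'Roofing',
                      'Window': 'Window', 'Door': 'Door'}

df['Product'] = None  # rows that match no keyword stay empty
for keyword, product in keyword_to_product.items():
    # regex=False treats the keyword as plain text; na=False skips missing values
    mask = df['Campaign'].str.contains(keyword, regex=False, na=False)
    df.loc[mask, 'Product'] = product
```

If the order of precedence matters, put the most specific keywords last in the dict.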

CodePudding user response:

If you want it really concise (and want to avoid scanning the entire DataFrame once per item), you could use the Series.str.replace() method instead.

items = ['Bathroom', 'Roofing', 'Siding']  # etc, etc
# Join the items into one alternation, e.g. '.*(Bathroom|Roofing|Siding).*'
pattern = '.*(' + '|'.join(items) + ').*'
df['Product'] = df['Campaign'].str.replace(pattern, lambda match: match.group(1), regex=True)

I did not have an example to test this though ;)

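This uses a regex to check whether any item occurs in the campaign string and, if so, replaces the whole string with the matched item, which becomes the 'Product'. Rows that match none of the items keep their original 'Campaign' text.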
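
A related sketch of the same single-pass regex idea, swapping in `Series.str.extract` plus `Series.map` so that rows without any keyword end up as NaN rather than keeping the campaign text. The sample data and the `keyword_to_product` mapping are assumptions for illustration and have not been run against the asker's data.

```python
import re
import pandas as pd

# Made-up sample data, for illustration only
df = pd.DataFrame({'Campaign': ['Spring Roof Promo',
                                'Water Heater Install',
                                'Brand Awareness']})

# Hypothetical mapping: search keyword -> product label
keyword_to_product = {'Roof': 'Roofing',
                      'Water Heater': 'Water Heater',
                      'Siding': 'Siding'}

# One alternation pattern; re.escape guards against regex metacharacters
pattern = '(' + '|'.join(re.escape(k) for k in keyword_to_product) + ')'

# extract pulls the leftmost keyword in each string (NaN if none is found),
# and map translates that keyword into its product label
df['Product'] = df['Campaign'].str.extract(pattern, expand=False).map(keyword_to_product)
```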
