How to replace a substring in one column based on the string from another column?

Time:08-20

I'm working with a dataset of Magic: The Gathering cards. If a card references its own name in its rules text, I want the name to be replaced with "This_Card". Here is what I've tried:

card_text['text_unnamed'] = card_text[['name', 'oracle_text']].apply(lambda x: x.oracle_text.replace(x.name, 'This_Card') if x.name in x.oracle_text else x, axis = 1)

This is giving me the error "TypeError: 'in <string>' requires string as left operand, not int"

I've tried with axis = 1, 0 and no axis. Still getting errors.

In editing my code to output what x.name is, it has revealed that it is just the int 2. I'm not sure why this is happening. Everything in the name column is a string. What is causing this interaction and how can I prevent it?

Here is a sample of my data (image not available).

CodePudding user response:

x.name isn't always a string, so you can't evaluate <int> in <string>.

I can't say for sure without seeing the data, but I guess adding this line before your code will do it:

card_text[['name', 'oracle_text']] = card_text[['name', 'oracle_text']].astype(str)

This simply converts all data in both columns to strings.

CodePudding user response:

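Series.name is a built-in attribute, so x.name does not access the column. With axis=1, each row is passed to the lambda as a Series, and its .name is the row's index label, which is why you see the int 2. Use x['name'] to access the name column instead.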
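To see the difference, here is a minimal sketch on a small made-up frame (the two card rows are illustrative, not taken from the asker's data). It shows that row.name yields the index label while row['name'] reads the column:

```python
import pandas as pd

# Hypothetical sample rows; the real dataset is not shown in the question.
card_text = pd.DataFrame({
    "name": ["Llanowar Elves", "Shock"],
    "oracle_text": ["{T}: Add {G}.", "Shock deals 2 damage to any target."],
})

# row.name is the row's index label (0, 1, ...), not the 'name' column.
labels = card_text.apply(lambda row: row.name, axis=1).tolist()

# row['name'] reads the column, so the replacement works as intended.
card_text["text_unnamed"] = card_text.apply(
    lambda row: row["oracle_text"].replace(row["name"], "This_Card"),
    axis=1,
)
```

The "if ... in ..." check from the question is not needed here: str.replace returns the text unchanged when the name does not occur. Note also that the original lambda's else branch returned x (the whole row) rather than x['oracle_text'].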

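It is more efficient to replace conditionally with a mask rather than with apply. Note that str.contains takes a single pattern, not a Series, so the mask has to be built row by row, and the assignment needs .loc:

# True where the card's own name occurs in its rules text
m = [name in text for name, text in zip(card_text['name'], card_text['oracle_text'])]
# Plain (non-regex) replacement on the matching rows only; other rows are left as NaN
card_text.loc[m, 'text_unnamed'] = [text.replace(name, 'This_Card') for name, text in zip(card_text.loc[m, 'name'], card_text.loc[m, 'oracle_text'])]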
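For a version that runs end to end, the sketch below performs the row-wise replacement with a list comprehension instead of apply. The helper name unname_cards and the sample rows are hypothetical, and it assumes that some cards may have missing oracle_text, which is passed through unchanged:

```python
import pandas as pd


def unname_cards(df: pd.DataFrame) -> pd.Series:
    """Return oracle_text with each card's own name replaced by 'This_Card'."""
    return pd.Series(
        [
            # Only replace when both values are real strings (assumption:
            # missing values should be passed through untouched).
            text.replace(name, "This_Card")
            if isinstance(text, str) and isinstance(name, str)
            else text
            for name, text in zip(df["name"], df["oracle_text"])
        ],
        index=df.index,
        dtype=object,
    )


# Hypothetical sample data, including one card without rules text.
cards = pd.DataFrame({
    "name": ["Shock", "Grizzly Bears", "Llanowar Elves"],
    "oracle_text": ["Shock deals 2 damage to any target.", None, "{T}: Add {G}."],
})
cards["text_unnamed"] = unname_cards(cards)
```

Because this uses plain str.replace rather than a regex, card names containing characters such as "+" or "(" do not need escaping.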