How to iterate over rows from column1 and modify column2 depending on whether regex is matched in co

Time:01-16

I want to add a column to my document indicating whether my regex matched in another column. For example, to go from:

Column A
word regex word
word word word
word word word
word regex word

to

Column A         Column B
word regex word  True
word word word   False
word word word   False
word regex word  True

I double-checked my regex and it works fine, so the problem does not come from there.

I tried

  1. iterating over the rows and changing them depending on whether the regex is matched
for row in FILE.itertuples():
       if FILE.COLUMNTOSEARCH.contains(REGEX):
            FILE.at[row.Index, "NEWCOLUMN"] = "string1"
       else:
            FILE.at[row.Index, "NEWCOLUMN"] = "string2"

This returns the error: "AttributeError: 'Series' object has no attribute 'contains'"

  2. duplicating the first column and then using replace
FILE.replace(REGEX, regex=True, value="string1", inplace=True)
FILE.replace(REGEX, regex=False, value="string2", inplace=True)

With this, only "string1" appears, and it doesn't replace the whole entry, just the part where the regex is found, although I want "string1" to be the only string in the entry.

I've looked through the documentation and the Stack Overflow answers I could find without being able to figure it out. I feel that both of these solutions are highly inefficient, but I can't work out how to write something better. Thanks in advance for any help.

CodePudding user response:

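You can use .str.contains. String methods live on the Series' .str accessor, which is why calling .contains directly on the Series raises the AttributeError. The call is vectorized, so no loop over the rows is needed: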

df["Column B"] = df["Column A"].str.contains(r"\bregex\b")

This outputs:

          Column A  Column B
0  word regex word      True
1   word word word     False
2   word word word     False
3  word regex word      True
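
If you want the "string1" / "string2" labels from the question rather than True/False, one possible approach is to feed the boolean mask into numpy.where. This is only a sketch: the sample data, the `pattern` variable, and the column names are stand-ins for your FILE, REGEX, and columns, and `na=False` is an assumption that missing values should count as non-matches.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the question; column names are illustrative.
df = pd.DataFrame({
    "Column A": [
        "word regex word",
        "word word word",
        "word word word",
        "word regex word",
    ]
})

# Hypothetical pattern standing in for the asker's REGEX.
pattern = r"\bregex\b"

# Vectorized match: one boolean per row, no explicit loop.
# na=False treats missing values as non-matches (an assumption).
matched = df["Column A"].str.contains(pattern, regex=True, na=False)

# Boolean column, as in the answer above.
df["Column B"] = matched

# Map the booleans to the two labels from the question.
df["NEWCOLUMN"] = np.where(matched, "string1", "string2")

print(df)
```

Because the mask is computed once for the whole column, this avoids both the row-by-row loop and the two replace calls.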