I have a data frame and I want to delete the rows where the "PHRASE" column contains the pattern "___".
Index | PHRASE | Label |
---|---|---|
0 | proposed by the president of the | 1 |
1 | Living ___ | 1 |
2 | "Murder, ___ Wrote" | 0 |
But imagine that the data frame has 2,000,000 entries.
import re
df_clean = pd.DataFrame()
z = 0
y = 0
for i in df_original["PHRASE"]:
    x = re.search("___", i)
    if x:
        y = y + 1
    else:
        df_clean.append([i])
        z = z + 1
This is what I came up with so far. I know it's not right. Does anyone know the answer? (By the way, append takes a lot of time.)
CodePudding user response:
df[~df['PHRASE'].str.contains('___')]
The ~ operator negates the boolean mask returned by str.contains, so only the rows that do not contain "___" are kept.
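Here is a minimal sketch of the same idea applied to the sample rows from the question, in case it helps to see it end to end. The names `has_blank` and `n_removed` are made up for illustration, and `regex=False` and `na=False` are assumptions on my part: the first treats "___" as a plain substring, the second treats missing phrases as non-matching.

```python
import pandas as pd

# Sample rows copied from the table in the question
df_original = pd.DataFrame({
    "PHRASE": [
        "proposed by the president of the",
        "Living ___",
        '"Murder, ___ Wrote"',
    ],
    "Label": [1, 1, 0],
})

# Boolean mask: True where PHRASE contains the literal substring "___".
# regex=False -> plain substring match (assumption: no regex needed here)
# na=False    -> missing values count as "does not contain" (assumption)
has_blank = df_original["PHRASE"].str.contains("___", regex=False, na=False)

# ~ inverts the mask, so only rows without "___" are kept
df_clean = df_original[~has_blank]

# Number of rows that were dropped (the role of y in the original loop)
n_removed = int(has_blank.sum())
```

The mask is computed over the whole column at once, so there is no Python-level loop and no repeated append, which is what makes the loop version slow on millions of rows.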