Delete rows based on condition and keep e-mails

Time:01-19

I have this dataframe that has to contain only e-mails:

email
1   [email protected]                         #it is not an e-mail so delete it
2   [email protected]    #it is an e-mail so keep it
3   [email protected]                     #it is not an e-mail so delete it
4   [email protected]                       #...

How can I delete the rows that aren't e-mails? Maybe based on a condition: if the value after the dot (.) is a number or .png (or another image type), delete the row. How can I achieve this? Do you have a better solution?

Update:

This is the pattern I used to scrape them:

mail_list = re.findall(r'\w+@\w+\.{1}\w+', html_text)

CodePudding user response:

Only you know the specific selection condition, but assuming @ is followed by a non-digit, you could use:

df2 = df[df['email'].str.contains(r'@\D', regex = True)]
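
As a minimal sketch of how this filter behaves, the snippet below runs it on a small made-up dataframe. The sample values are hypothetical, since the real addresses in the question are not shown. Under that assumption, a name such as `banner@large.png` still passes, because the character after `@` is not a digit.

```python
import pandas as pd

# Hypothetical sample data; the real values in the question are not shown.
df = pd.DataFrame({"email": [
    "logo@2x.png",
    "john.doe@example.com",
    "pkg@1.2",
    "banner@large.png",
]})

# Keep rows where "@" is immediately followed by a non-digit character.
df2 = df[df["email"].str.contains(r"@\D", regex=True)]
print(df2["email"].tolist())
# ['john.doe@example.com', 'banner@large.png']
```

So the condition removes the entries where a digit follows `@`, but an extra check would be needed if image names can also have a letter there.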

CodePudding user response:

You could use a regex like:

df2 = df[df['email'].str.contains(r'^[a-z0-9]+[\._]?[a-z0-9]+[@]\w+[.]\w{2,3}$', regex = True)]
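
One possible way to combine this with the image-extension idea from the question is sketched below, using only the standard `re` module. It assumes the pattern is meant to be read with `+` quantifiers after each character class. The helper `is_email`, the extension list, and the sample values are all hypothetical and only for illustration.

```python
import re

# Anchored pattern from the answer, assumed to use "+" quantifiers.
EMAIL_RE = re.compile(r"^[a-z0-9]+[\._]?[a-z0-9]+[@]\w+[.]\w{2,3}$")
# Hypothetical list of image extensions to reject; extend as needed.
IMAGE_EXT_RE = re.compile(r"\.(?:png|jpe?g|gif|svg|webp)$", re.IGNORECASE)


def is_email(value: str) -> bool:
    """True if value matches EMAIL_RE and does not end in an image extension."""
    return bool(EMAIL_RE.match(value)) and not IMAGE_EXT_RE.search(value)


# Hypothetical sample values; the real data in the question is not shown.
samples = ["logo@2x.png", "john.doe@example.com", "pkg@1.2", "banner@large.png"]
kept = [s for s in samples if is_email(s)]
print(kept)
# ['john.doe@example.com']
```

On these samples the anchored pattern alone would still accept `logo@2x.png` and `banner@large.png`, since `png` fits `\w{2,3}`, which is why the extension check is added. With pandas, the same helper could be applied as `df[df['email'].apply(is_email)]`. Note that `\w{2,3}` also rejects addresses whose last part is longer than three characters, and the pattern only accepts lowercase letters before the `@`.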