How to remove the extra commas from the list of email addresses

Time:03-30

I am using Python and have a string of email addresses, as shown below.

email_addr = '[email protected], [email protected], [email protected]'

The string above looks fine, but sometimes I receive data that has blank email addresses in it.

For example:

email_addr = ' , , [email protected], [email protected], , , ,[email protected]'

I am using str.split(',') and a lot of error checking. Is there a better way to do this?

The final value I am expecting, going from:

email_addr = ' , , [email protected], [email protected], , , ,[email protected]'

to:

email_addr = '[email protected],[email protected],[email protected]'

CodePudding user response:

Try:

import re

email_addr = " , , [email protected], [email protected], , , ,[email protected]"

email_addr = email_addr.replace(" ", "").strip(",")
email_addr = re.sub(r",{2,}", ",", email_addr)
print(email_addr)

Prints:

[email protected],[email protected],[email protected]
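
As a rough sketch, the same three steps can be wrapped in a function. The name collapse_commas is hypothetical, and the example.com addresses are placeholders standing in for the real ones. Note that replace(" ", "") removes every space, including any inside a field, so this only suits input made of bare addresses.

```python
import re

def collapse_commas(raw):
    """Drop spaces, trim outer commas, collapse comma runs (hypothetical helper)."""
    no_spaces = raw.replace(" ", "")       # removes ALL spaces, even inside a field
    trimmed = no_spaces.strip(",")         # drop leading and trailing commas
    return re.sub(r",{2,}", ",", trimmed)  # squeeze runs like ",,," down to ","

# Placeholder addresses, not the ones from the question
sample = " , , alice@example.com, bob@example.com, , , ,carol@example.com"
print(collapse_commas(sample))
```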

CodePudding user response:

No need for regular expressions. Use .split(',') to split into a list of strings.

email_lst = email_addr.split(',')

Then join with commas, filtering out blank values:

email_addr2 = ",".join(e.strip() for e in email_lst if e.strip())
# '[email protected],[email protected],[email protected]'

In Python 3.8+, you can use the walrus operator to avoid calling .strip() twice:

email_addr2 = ",".join(e for ee in email_lst if (e := ee.strip()))
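
If the walrus operator is not available, one possible variant strips each field once in an inner generator and filters in the outer one. This is only a sketch: clean_email_list is a hypothetical name and the example.com addresses are placeholders.

```python
def clean_email_list(raw):
    """Split on commas, strip whitespace, drop blank fields (hypothetical helper)."""
    # Strip each field exactly once; str.strip() also removes tabs and newlines
    stripped = (field.strip() for field in raw.split(","))
    # Keep only the fields that still contain something
    return ",".join(field for field in stripped if field)

# Placeholder addresses, not the ones from the question
sample = " , , alice@example.com, bob@example.com, , , ,carol@example.com"
print(clean_email_list(sample))
```

Unlike removing every space up front, this leaves spaces inside a field untouched.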

CodePudding user response:

If we use regex, how about getting a list of matches with [^, ]+ and then joining all the items?

[^, ] means any character except a comma or a space, and + means "1 or more".

import re

email_addr = " , , [email protected], [email protected], , , ,[email protected]"

email_cleaned = ",".join(re.findall(r"[^, ]+", email_addr))

print(email_cleaned)
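
To see what the matches look like before joining, here is a small sketch that assumes the pattern is [^, ]+ (with the + quantifier) and uses placeholder example.com addresses. The last line shows one caveat: a space inside a field also acts as a separator.

```python
import re

# Placeholder addresses, not the ones from the question
sample = " , , alice@example.com, bob@example.com, , , ,carol@example.com"

# Each match is a maximal run of characters that are neither a comma nor a space
matches = re.findall(r"[^, ]+", sample)
print(matches)

email_cleaned = ",".join(matches)
print(email_cleaned)

# Caveat: a field containing a space is split into two matches
print(re.findall(r"[^, ]+", "Foo Bar,baz"))  # ['Foo', 'Bar', 'baz']
```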

CodePudding user response:

I'd be quite tempted to validate as you go and rely on email.utils.parseaddr, which goes some way toward ensuring that email clients will accept the addresses:

from email.utils import parseaddr as parse_email_addr

>>> parse_email_addr("Foo Bar <[email protected]>")
('Foo Bar', '[email protected]')

email_addr = ' , , [email protected], [email protected], , , ,[email protected]'
result = ",".join(filter(None, (parse_email_addr(email)[1] for email in email_addr.split(","))))
# '[email protected],[email protected],[email protected]'

I'd also be tempted to account for bad fields, which may indicate an input error (i.e. how did you get these, and should they be accepted as inputs to your program?):

>>> result
'[email protected],[email protected],[email protected]'
>>> email_addr.rstrip(",").count(",") - result.count(",")
5
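
The cleaning and the bad-field count can also be returned together, which makes it easier to log or reject suspicious input. This is just a sketch under a few assumptions: clean_and_count is a hypothetical name, the example.com addresses are placeholders, and a field counts as "dropped" when parseaddr returns an empty address for it.

```python
from email.utils import parseaddr

def clean_and_count(raw):
    """Return (cleaned string, number of dropped fields); hypothetical helper."""
    fields = raw.split(",")
    # parseaddr returns (display_name, address); keep only non-empty addresses
    addresses = [addr for _, addr in map(parseaddr, fields) if addr]
    return ",".join(addresses), len(fields) - len(addresses)

# Placeholder addresses, not the ones from the question
sample = " , , alice@example.com, bob@example.com, , , ,carol@example.com"
cleaned, dropped = clean_and_count(sample)
print(cleaned)
print(dropped)  # 5 blank fields in this sample
```

Counting fields directly, rather than comparing comma counts, also counts a trailing blank field.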