How to read HTML email - Python

Time:09-17

I would like to read emails from an IMAP mailbox and extract "From", "Subject" and "Body" (which is HTML) every time a new email comes in. The script should mark the unread email as read and eventually put the email in a dictionary. I have most of it working, except for changing unread emails to read, which doesn't seem possible with the 'imbox' module I used. I'm avoiding imaplib because it seems quite low-level and complex, and I think there should be an easier way; of course, if there's no other way, imaplib has to be used.

Here's the code:

from imbox import Imbox
import html2text

with Imbox('<IMAP SERVER>',
           username='<USER>',
           password='<PASS>',
           ssl=True,
           ssl_context=None,
           starttls=False) as imbox:

    unread_inbox_messages = imbox.messages(unread=True)
    for uid, message in unread_inbox_messages:
        mail_from = message.sent_from[0]['email']
        mail_subject = message.subject
        h = html2text.HTML2Text()
        h.ignore_links = True
        output = (h.handle(f'''{message.body['plain']}''').replace("\\r\\n", ""))
        output = output.replace("\n", "")
        mail_body = output[2:-2]
        mail_dict = {
            'email': {
                'From': mail_from,
                'Subject': mail_subject,
                'Body': mail_body
            }
        }
    print(mail_dict)

It returns a row like this:

{'email': {'From': '[email protected]', 'Subject': 'subject', 'Body': 'body message'}} 

but the email remains unread in the mailbox, so every run picks up the same unread emails again. Can my code be modified so that emails are changed from unread to read, perhaps with an additional module?

CodePudding user response:

According to the imbox documentation, you can mark an email as read by calling mark_seen with the message's uid.

Example code:

from imbox import Imbox

with Imbox('imap.gmail.com', username='username', password='password',
           ssl=True, ssl_context=None, starttls=False) as imbox:

    # fetch all messages from inbox
    all_inbox_messages = imbox.messages()

    for uid, message in all_inbox_messages:

        # mark the message as read
        imbox.mark_seen(uid)
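
Applied to the question's use case, this could look roughly like the sketch below. It is only an illustration, not a tested drop-in: the helper name collect_unread is made up, and it assumes that imbox exposes message.body as a dict with 'plain' and 'html' lists of strings. The helper takes an already opened Imbox object, so it can be tried out with a stand-in object as well.

```python
def collect_unread(client):
    """Return one dict per unread message and mark each message as read.

    `client` is assumed to be an open imbox.Imbox instance (or any object
    offering the same messages() and mark_seen() methods).
    """
    mails = []
    for uid, message in client.messages(unread=True):
        # Assumption: body parts are lists of strings; prefer HTML, else plain.
        parts = message.body['html'] or message.body['plain']
        mails.append({
            'email': {
                'From': message.sent_from[0]['email'],
                'Subject': message.subject,
                'Body': "".join(parts),
            }
        })
        # Mark as read only after the message has been captured.
        client.mark_seen(uid)
    return mails

# Possible usage (placeholders as in the question):
# from imbox import Imbox
# with Imbox('<IMAP SERVER>', username='<USER>', password='<PASS>', ssl=True) as imbox:
#     print(collect_unread(imbox))
```

Calling mark_seen after the dictionary is built means a message that fails while being read stays unread and is picked up again on the next run. The body can still be passed through html2text afterwards, as in the question.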

CodePudding user response:

Try the imap_tools library: https://github.com/ikvk/imap_tools

from imap_tools import MailBox

with MailBox('imap.mail.com').login('[email protected]', 'pwd') as mailbox:
    uids_for_move = []  # collect UIDs across all messages, not per message
    for msg in mailbox.fetch():  # all by default, mark_seen=True by default
        from_ = msg.from_
        subject = msg.subject
        body = msg.text or msg.html
        if 'cat' in body:
            uids_for_move.append(msg.uid)
    if uids_for_move:  # skip the move when nothing matched
        mailbox.move(uids_for_move, 'INBOX/cats')

Alternatively, if you fetch with mark_seen=False, you can call mailbox.flag to set the MailMessageFlags.SEEN flag yourself.

Regards, lib author.
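
The mark_seen=False route mentioned above might be sketched as follows. This is a hedged illustration rather than code from the answer: the function name fetch_then_mark_seen and the handle callback are hypothetical, and the literal '\\Seen' is used on the assumption that it is the string value of MailMessageFlags.SEEN. The function expects a logged-in imap_tools MailBox.

```python
SEEN_FLAG = '\\Seen'  # assumed string value of imap_tools.MailMessageFlags.SEEN

def fetch_then_mark_seen(mailbox, handle):
    """Process unseen messages and flag only the handled ones as seen.

    `mailbox` is assumed to be a logged-in imap_tools MailBox;
    `handle` is a hypothetical callback that returns True on success.
    """
    handled_uids = []
    # 'UNSEEN' is a plain IMAP search criterion; mark_seen=False leaves flags alone.
    for msg in mailbox.fetch('UNSEEN', mark_seen=False):
        if handle(msg):
            handled_uids.append(msg.uid)
    if handled_uids:
        # Set the seen flag in one call once processing is finished.
        mailbox.flag(handled_uids, SEEN_FLAG, True)
    return handled_uids
```

Compared with the default mark_seen=True, this keeps a message unread when handle fails, so it is retried on the next run.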
