I am trying to decode the text that Gmail sends, which should be IMAP-style UTF-7 (or, if I am not mistaken, UTF-8 encoded inside UTF-7?).
I have read https://en.wikipedia.org/wiki/UTF-7. I am using https://github.com/skeeto/utf-7 to parse the text, for example, and mimetic (https://github.com/tat/mimetic) to parse the raw email text.
The corresponding header (subject in this case) is:
Subject: =?UTF-8?B?15TXldeT16LXlCDXotecINeQ15kg15TXoteR16jXqiDXqtep15zXlQ==?=
=?UTF-8?B?150g16rXp9eV16TXqteZINeR16rXm9eg15nXqiDXnNep15vXmdeo15nXnQ==?=
The encoding mentioned in the comments applies only to the content (body). Headers should be ASCII only, although some email clients do send some kind of 8-bit encoding (ISO-8859-?). That is not the case for the message I describe.
I assume there is something else I am missing - where can I find documentation about this subject?
I am looking for solutions in C or C++ (the utf7 library I am using is C, and the mime parsing library is in C++). C is always a better alternative.
CodePudding user response:
UTF-7 is used to encode non-ASCII mailbox names in the IMAP protocol. This is not related to your example, which shows an RFC 2822 Subject field with a MIME-encoded value according to RFC 2047.
In your example (with the "=?UTF-8?B?" prefix), decoding is simple: the string that follows (up to "?=") is the Base64 representation of a UTF-8 encoded string. Your header contains two such encoded-words; decode each one separately and concatenate the results.
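To make that concrete, here is a minimal sketch in plain C of how such a header value could be decoded. The names `b64_value`, `b64_decode` and `decode_b_words` are hypothetical helpers written for this answer; they are not part of mimetic or the utf-7 library. The sketch assumes the prefix is spelled exactly "=?UTF-8?B?" (RFC 2047 allows other charsets, the "Q" encoding, and case-insensitive tokens, none of which are handled here), and it drops whitespace that sits between two adjacent encoded-words, as RFC 2047 requires.

```c
#include <stddef.h>
#include <string.h>

/* Map one Base64 alphabet character to its 6-bit value, or -1 if invalid. */
static int b64_value(unsigned char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

/* Decode n Base64 characters from src into dst (at most cap bytes).
   Stops at '=' padding or any other character outside the alphabet.
   Returns the number of bytes written. */
static size_t b64_decode(const char *src, size_t n, char *dst, size_t cap)
{
    unsigned int acc = 0;   /* bit accumulator */
    int bits = 0;           /* number of pending bits in acc */
    size_t out = 0;

    for (size_t i = 0; i < n; i++) {
        int v = b64_value((unsigned char)src[i]);
        if (v < 0) break;
        acc = (acc << 6) | (unsigned int)v;
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            if (out < cap) dst[out++] = (char)((acc >> bits) & 0xFF);
        }
    }
    return out;
}

/* Hypothetical helper: decode every "=?UTF-8?B?...?=" word found in `in`
   and write the resulting UTF-8 bytes, NUL-terminated, to `out`.
   Text outside encoded-words is copied unchanged.
   Assumption: only the exact prefix "=?UTF-8?B?" is recognized.
   Returns the number of bytes written, not counting the terminator. */
size_t decode_b_words(const char *in, char *out, size_t cap)
{
    static const char prefix[] = "=?UTF-8?B?";
    const size_t plen = sizeof prefix - 1;
    size_t len = 0;

    if (cap == 0) return 0;

    while (*in != '\0') {
        const char *end = NULL;
        if (strncmp(in, prefix, plen) == 0)
            end = strstr(in + plen, "?=");

        if (end != NULL) {
            const char *payload = in + plen;
            len += b64_decode(payload, (size_t)(end - payload),
                              out + len, cap - 1 - len);
            in = end + 2;

            /* Whitespace between two adjacent encoded-words is ignored. */
            size_t ws = strspn(in, " \t\r\n");
            if (ws > 0 && strncmp(in + ws, prefix, plen) == 0)
                in += ws;
        } else {
            if (len < cap - 1) out[len++] = *in;
            in++;
        }
    }
    out[len] = '\0';
    return len;
}
```

Calling `decode_b_words` on the value of your Subject header (everything after "Subject: ", with both lines joined) should leave the UTF-8 bytes of the Hebrew subject in the output buffer. What you do with those bytes afterwards, such as printing them to a UTF-8 terminal or converting to wide characters, is a separate step.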