Translate encoding of android mail in R

Time:04-08

Problem

I am using the R package mRpostman to access my mail account from R. Everything works fine when I fetch mails that were sent from my computer to the dedicated mail address via Thunderbird. But when I send the mail from my Android phone instead, the text is weirdly encoded and no longer legible. How do I fix that? I have tried base64enc::base64decode() but could not get it to work. Trying to change the encoding via Encoding() failed as well.

Reprex

I sent two mails. One was sent from my computer using Thunderbird, and its text is simply "Sent from Thunderbird on Computer". The other was sent from my Android phone using the default mail app and contains only the text "Sent from Android".

library(mRpostman) # for email communication

# Connect to mail server
imap_mail <- 'imaps://imap.gmail.com' # IMAP server URL
user_mail <- keyring::key_get('dataviz-mail')
password_mail <- keyring::key_get('dataviz-mail-password')
# Establish connection to imap server
con <- configure_imap(
  url = imap_mail,
  user = user_mail,
  password = password_mail
)

# Switch to Inbox
con$select_folder('Inbox') 

# Fetch Thunderbird mail
con$fetch_text(11)
#> $text11
#> [1] "Sent from thunderbird on computer\r\n\r\n"

# Fetch Android mail
con$fetch_text(12)
#> $text12
#> [1] "----_com.samsung.android.email_7640956728775490\r\nContent-Type: text/plain; charset=utf-8\r\nContent-Transfer-Encoding: base64\r\n\r\nVGhpcyBtYWlsIGlzIHNlbnQgZnJvbSBBbmRyb2lk\r\n\r\n----_com.samsung.android.email_7640956728775490\r\nContent-Type: text/html; charset=utf-8\r\nContent-Transfer-Encoding: base64\r\n\r\nPGh0bWw+PGhlYWQ+PG1ldGEgaHR0cC1lcXVpdj0iQ29udGVudC1UeXBlIiBjb250ZW50PSJ0ZXh0\r\nL2h0bWw7IGNoYXJzZXQ9VVRGLTgiPjwvaGVhZD48Ym9keSBkaXI9ImF1dG8iPlRoaXMgbWFpbCBp\r\ncyBzZW50IGZyb20gQW5kcm9pZDwvYm9keT48L2h0bWw+\r\n\r\n----_com.samsung.android.email_7640956728775490--\r\n\r\n"

Created on 2022-04-06 by the reprex package (v2.0.0)

Answer

The Android string does contain the message in base64 encoding, but it is embedded in other text that is not base64 (the MIME boundary lines and the part headers), so you have to extract it first.

If we take the string from your question:

text12 <- "----_com.samsung.android.email_7640956728775490\r\nContent-Type: text/plain; charset=utf-8\r\nContent-Transfer-Encoding: base64\r\n\r\nVGhpcyBtYWlsIGlzIHNlbnQgZnJvbSBBbmRyb2lk\r\n\r\n----_com.samsung.android.email_7640956728775490\r\nContent-Type: text/html; charset=utf-8\r\nContent-Transfer-Encoding: base64\r\n\r\nPGh0bWw+PGhlYWQ+PG1ldGEgaHR0cC1lcXVpdj0iQ29udGVudC1UeXBlIiBjb250ZW50PSJ0ZXh0\r\nL2h0bWw7IGNoYXJzZXQ9VVRGLTgiPjwvaGVhZD48Ym9keSBkaXI9ImF1dG8iPlRoaXMgbWFpbCBp\r\ncyBzZW50IGZyb20gQW5kcm9pZDwvYm9keT48L2h0bWw+\r\n\r\n----_com.samsung.android.email_7640956728775490--\r\n\r\n"

Then we can carve out the base64 string of the text/plain part, decode it to bytes, and convert the bytes to character like this:

library(dplyr)
library(purrr)
library(base64enc)

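text12 %>%
  # Everything after the first "base64" header and its blank line
  strsplit("base64\r\n\r\n") %>%
  pluck(1, 2) %>%
  # Cut off at the next boundary line
  strsplit("----") %>%
  pluck(1, 1) %>%
  # Remove the trailing line breaks
  gsub(pattern = "[\r\n]+", replacement = "", .) %>%
  # Decode to raw bytes, then convert to a character string
  base64decode() %>%
  rawToChar()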
#> [1] "This mail is sent from Android"

Created on 2022-04-06 by the reprex package (v2.0.1)
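
If you need this for more than one mail, the same steps can be wrapped in a small helper that decodes every base64 part instead of only the first one. The following is only a sketch under a few assumptions: decode_mime_parts() is a made-up name (it is not part of mRpostman), the body is assumed to start with the boundary line as in the Android mail above, every base64 part is assumed to carry a Content-Type header spelled exactly like that, and the charset is assumed to be UTF-8. The example message is built by hand so the block runs on its own.

```r
library(base64enc)

# Hypothetical helper, not an mRpostman function: decode all base64 parts
# of a multipart body and name each result by its content type.
decode_mime_parts <- function(body) {
  # Assumption: the first line of the body is the boundary line
  boundary <- strsplit(body, "\r\n", fixed = TRUE)[[1]][1]
  parts <- strsplit(body, boundary, fixed = TRUE)[[1]]
  # Keep only the parts that declare base64 transfer encoding
  parts <- parts[grepl("Content-Transfer-Encoding: base64", parts, fixed = TRUE)]

  decoded <- vapply(parts, function(part) {
    # Headers and payload are separated by the first blank line
    pos <- regexpr("\r\n\r\n", part, fixed = TRUE)
    payload <- substr(part, pos + 4, nchar(part))
    # Drop line breaks and anything else outside the base64 alphabet
    payload <- gsub("[^A-Za-z0-9+/=]", "", payload)
    txt <- rawToChar(base64decode(payload))
    Encoding(txt) <- "UTF-8" # assumption: charset=utf-8, as in the headers above
    txt
  }, character(1), USE.NAMES = FALSE)

  types <- vapply(parts, function(part) {
    # Assumption: each kept part has a "Content-Type: ..." header
    m <- regmatches(part, regexpr("Content-Type: [^;\r\n]+", part))
    sub("Content-Type: ", "", m, fixed = TRUE)
  }, character(1), USE.NAMES = FALSE)

  setNames(decoded, types)
}

# Hand-built example message with a made-up boundary
body <- paste0(
  "----_example\r\n",
  "Content-Type: text/plain; charset=utf-8\r\n",
  "Content-Transfer-Encoding: base64\r\n\r\n",
  base64encode(charToRaw("Sent from Android")), "\r\n\r\n",
  "----_example\r\n",
  "Content-Type: text/html; charset=utf-8\r\n",
  "Content-Transfer-Encoding: base64\r\n\r\n",
  base64encode(charToRaw("<p>Sent from Android</p>")), "\r\n\r\n",
  "----_example--\r\n\r\n"
)

res <- decode_mime_parts(body)
res[["text/plain"]]
#> [1] "Sent from Android"
```

With the connection from the question, something like decode_mime_parts(con$fetch_text(12)$text12) should then give both the plain-text and the HTML version of the Android mail, provided the fetched string looks like the one shown in the reprex.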
