How to decode base64 strings in a vectorized way within dplyr::mutate?

Time:03-15

I have a tibble which contains a column of base64-encoded strings like so:

mytib <- tibble(encoded_var = c("VGVzdGluZ3Rlc3Rpbmc=", "QW5vdGhlcnRlc3Q="))

When I try to decode it with base64::base64decode:

mytib %>%
     mutate(decoded_var = base64decode(encoded_var))

I receive an error:

Error in `mutate()`:
! Problem while computing `decoded_var = base64decode(encoded_var)`.
x `decoded_var` must be size 2 or 1, not 25.

I'm looking to have a tibble with a column of decoded, human-readable base64 strings. I'd also like to do that using the mutate tidyverse syntax. How can I achieve that?

Update: The tibble should look like this

# A tibble: 2 × 2
  encoded_var              decoded_var
  <chr>                    <chr>
1 VGVzdGluZ3Rlc3Rpbmc=     Testingtesting
2 QW5vdGhlcnRlc3Q=         Anothertest

CodePudding user response:

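base64enc::base64decode returns a raw vector, and when it is given several strings it concatenates all the decoded bytes into one vector (here 14 + 11 = 25 bytes, which matches the size 25 in the error). So you need to carry out the conversion rowwise and wrap the result with rawToChar: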

mytib %>% 
  rowwise() %>% 
  mutate(decoded_var = rawToChar(base64decode(encoded_var)))
#> # A tibble: 2 x 2
#> # Rowwise: 
#>   encoded_var          decoded_var   
#>   <chr>                <chr>         
#> 1 VGVzdGluZ3Rlc3Rpbmc= Testingtesting
#> 2 QW5vdGhlcnRlc3Q=     Anothertest   
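
If you would rather not leave the tibble in a rowwise state, one possible alternative is to wrap the per-string decoding in a small helper and call it inside a plain mutate. This is only a sketch: decode_b64 is a hypothetical helper name, not part of any package, and it assumes every element is valid base64 that decodes to text (no NA values, no embedded nul bytes).

```r
library(dplyr)
library(base64enc)

mytib <- tibble(encoded_var = c("VGVzdGluZ3Rlc3Rpbmc=", "QW5vdGhlcnRlc3Q="))

# Hypothetical helper (not from any package): decode each string on its own,
# so the result has one element per input instead of one long raw vector.
decode_b64 <- function(x) {
  vapply(
    x,
    function(s) rawToChar(base64enc::base64decode(s)),
    character(1),
    USE.NAMES = FALSE
  )
}

# No rowwise() needed: the helper already returns a vector of the same length.
result <- mytib %>%
  mutate(decoded_var = decode_b64(encoded_var))
```

Because vapply checks that each result is a single string, a malformed element should fail loudly rather than silently produce a column of the wrong length.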

CodePudding user response:

The problem is that the caTools::base64decode function only works on one string at a time, because a single string could contain several values. If each element of your variable always encodes a single character value, then you can vectorize it:

library(tidyverse)
mytib <- tibble(encoded_var = c("VGVzdGluZ3Rlc3Rpbmc=", "QW5vdGhlcnRlc3Q="))
mytib %>%
     mutate(decoded_var = Vectorize(caTools::base64decode)(encoded_var, "character"))
#> # A tibble: 2 × 2
#>   encoded_var          decoded_var   
#>   <chr>                <chr>         
#> 1 VGVzdGluZ3Rlc3Rpbmc= Testingtesting
#> 2 QW5vdGhlcnRlc3Q=     Anothertest

Created on 2022-03-14 by the reprex package (v2.0.1)

EDITED TO ADD: Actually, there are (at least) four different packages that provide base64decode functions. I used caTools. There are also versions in the processx, xfun and base64enc packages. (The one in xfun is actually named base64_decode.) This is why it's important to show reproducible code here on Stack Overflow. The reprex package makes this very easy.
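
To check which package a bare base64decode call resolves to in a given session, something like the following might help. It is a small sketch that assumes base64enc is installed and attached; source_pkg is just an illustrative variable name.

```r
library(base64enc)

# Ask R which namespace the currently visible base64decode comes from.
# If another package that exports base64decode is attached later,
# this reports that package instead.
source_pkg <- environmentName(environment(base64decode))
print(source_pkg)

# Using the pkg::fun form removes the ambiguity regardless of attach order.
decoded <- rawToChar(base64enc::base64decode("QW5vdGhlcnRlc3Q="))
```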
