How to relabel data in R given a dataframe and json metadata

Time:04-26

I know how to do this in Python with pandas, but I am not sure how to go about it in R.

In pandas it would be:

In [11]: df = pd.DataFrame({"a" : [1, 2, 1], "b" : [0, 1, 0]})

In [12]: df
Out[12]:
   a  b
0  1  0
1  2  1
2  1  0

In [13]: meta = {"a" : {1 : "one", 2 : "two"}, "b"  : {1 : "Yes", 0 : "No"}}

In [14]: df.replace(meta)
Out[14]:
     a    b
0  one   No
1  two  Yes
2  one   No

How to do the same thing in R?

CodePudding user response:

One possible solution is to use the column values as indices into lookup vectors:

library(dplyr)

df <- data.frame(
  a = c(1L, 2L, 1L),
  b = c(0L, 1L, 0L)
)

# Lookup vectors; R indexes from 1, so position 1 of y labels the value 0
x <- c("one", "two")
y <- c("No", "Yes")

df %>%
  mutate(a = x[a]) %>%
  mutate(b = y[b + 1])

#>     a   b
#> 1 one  No
#> 2 two Yes
#> 3 one  No

Or, more generally, with named vectors keyed by the original values:

library(dplyr)

x <- c("one", "two")
y <- c("No", "Yes")

# The names are the original values; they are stored as character strings
names(x) <- 1:2
names(y) <- 0:1

df %>%
  mutate(a = x[as.character(a)]) %>%
  mutate(b = y[as.character(b)])

#>     a   b
#> 1 one  No
#> 2 two Yes
#> 3 one  No
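
If the metadata is a single nested list keyed by column name, as in the pandas example, the same named-vector lookup can be applied to every listed column in a loop. This is only a minimal base-R sketch: the helper name relabel and the shape of meta are assumptions made for illustration, not part of any package.

```r
df <- data.frame(
  a = c(1L, 2L, 1L),
  b = c(0L, 1L, 0L)
)

# Hypothetical metadata: one named lookup vector per column
meta <- list(
  a = c(`1` = "one", `2` = "two"),
  b = c(`0` = "No", `1` = "Yes")
)

# Hypothetical helper: relabel every column that has an entry in meta
relabel <- function(data, meta) {
  for (col in intersect(names(meta), names(data))) {
    lookup <- unlist(meta[[col]])
    # Index by name, then drop the names left over from the lookup
    data[[col]] <- unname(lookup[as.character(data[[col]])])
  }
  data
}

relabelled <- relabel(df, meta)
```

Note one difference from pandas: values that have no entry in the lookup become NA here, whereas df.replace leaves them unchanged.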

CodePudding user response:

The function dplyr::recode does this, and the !!! operator splices the elements of a list into its arguments. A data frame column in R is normally an atomic vector of a single type, so there is no direct equivalent of NumPy's object dtype. Replacing numbers with labels therefore changes the column type, which is why the columns are converted to character first.

library(tidyverse)

df <- data.frame(a = c(1,2,1), b=c(0,1,0))
df
#>   a b
#> 1 1 0
#> 2 2 1
#> 3 1 0

meta <- list(
  a = list(`1`="one", `2` = "two"),
  b = list(`1`="Yes", `0` = "No")
)
meta
#> $a
#> $a$`1`
#> [1] "one"
#> 
#> $a$`2`
#> [1] "two"
#> 
#> 
#> $b
#> $b$`1`
#> [1] "Yes"
#> 
#> $b$`0`
#> [1] "No"

df %>%
  mutate(across(everything(), as.character)) %>%
  mutate(
    a = a %>% recode(!!!meta$a),
    b = b %>% recode(!!!meta$b)
  ) %>%
  type_convert()
#> 
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#>   a = col_character(),
#>   b = col_character()
#> )
#>     a   b
#> 1 one  No
#> 2 two Yes
#> 3 one  No

However, it can be very confusing if the type of a column changes but its name does not. This is why I would write the labels to a new column instead:

df %>% mutate(a_char = a %>% recode(!!!meta$a))
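
Since the question mentions JSON metadata, here is a sketch of how the list could come from JSON rather than being typed by hand. It assumes the jsonlite package is installed, and the inline string meta_json is a made-up stand-in for the real metadata (fromJSON also accepts a file path). Each column is recoded in a plain loop, converting to character first as above.

```r
library(jsonlite)
library(dplyr)

df <- data.frame(a = c(1, 2, 1), b = c(0, 1, 0))

# Hypothetical metadata, inlined as a string for the example
meta_json <- '{"a": {"1": "one", "2": "two"}, "b": {"1": "Yes", "0": "No"}}'

# JSON objects are parsed into nested named lists
meta <- fromJSON(meta_json)

labelled <- df
for (col in intersect(names(meta), names(df))) {
  # Splice this column's key/value pairs into recode()
  labelled[[col]] <- recode(as.character(df[[col]]), !!!meta[[col]])
}

labelled
```

With character input, recode keeps any value that has no match in the metadata, which is close to what pandas' replace does.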