How to label values in a column based on a dataframe "dictionary"

Time:03-11

I have data like this

col1
33
924
33
12
924

and a dataframe like this

col1   col2
12     "London"
33     "Paris"
924    "Singapore"

How do I label the values in the first dataframe using the value/label pairs in the second dataframe, with R's labelled package?

I know that using val_labels() I can apply a value label to values in a column using:

val_labels(df$col1) <- c(London = 12, Paris = 33, Singapore = 924)

But I have 1000 different values and need an approach that allows me to use a dataframe to do it.

CodePudding user response:

You can try this, though there are likely more elegant solutions:

df <- data.frame(col1 = c(33, 924, 33, 12, 924))
dic <- data.frame(col1 = c(12, 33, 924),
                  col2 = c("London","Paris","Singapore"))

library(labelled)
ct <- 1
for(i in dic$col1){
  val_label(df, i) <- as.character(dic[ct,2])
  ct <- ct + 1
}
str(df)
# > str(df)
# 'data.frame': 5 obs. of  1 variable:
#   $ col1: dbl lbl [1:5]  33, 924,  33,  12, 924
# ..@ labels: Named num  12 33 924
# .. ..- attr(*, "names")= chr [1:3] "London" "Paris" "Singapore"
# > 

CodePudding user response:

With val_labels:

library(labelled)

col1=c(33,924,33,12,924)

ref <- read.table(text='
col1   col2
12     "London"
33     "Paris"
924    "Singapore"',header=T)

val_labels(col1)<-setNames(ref$col1,ref$col2)

col1
#> <labelled<double>[5]>
#> [1]  33 924  33  12 924
#> 
#> Labels:
#>  value     label
#>     12    London
#>     33     Paris
#>    924 Singapore
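
Since the question asks about a column inside a dataframe, the same idea can be applied directly to df$col1. This is only a minimal sketch: the names df and dic are borrowed from the first answer, and it assumes every value in dic$col1 is unique and that the labelled package is installed.

```r
library(labelled)

# Data to label and the lookup "dictionary" (names taken from the first answer)
df  <- data.frame(col1 = c(33, 924, 33, 12, 924))
dic <- data.frame(col1 = c(12, 33, 924),
                  col2 = c("London", "Paris", "Singapore"))

# Build a named vector (names = labels, values = codes) and assign it in one step
val_labels(df$col1) <- setNames(dic$col1, dic$col2)

# Optional check: convert the labelled column to a factor to see the labels
as.character(to_factor(df$col1))
# "Paris" "Singapore" "Paris" "London" "Singapore"
```

Because the labels are built from the dictionary in a single vectorised call, this should scale to 1000 values without a loop.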