I have data like the following, along with a problem that I unfortunately cannot reproduce with this example:
dat <- structure(
  c(1, NA_real_),
  format.stata = "%8.0g",
  labels = c(female = 1, male = 2),
  class = c("haven_labelled", "vctrs_vctr", "double")
)
dat <- data.frame(dat)
lapply(dat, class)
$dat
[1] "haven_labelled" "vctrs_vctr" "double"
I would like to remove the custom labels, and I tried the following:
clear.labels <- function(x) {
  if (is.list(x)) {
    for (i in seq_along(x)) {
      class(x[[i]]) <- setdiff(class(x[[i]]), "labelled")
      attr(x[[i]], "label") <- NULL
    }
  } else {
    class(x) <- setdiff(class(x), "labelled")
    attr(x, "label") <- NULL
  }
  return(x)
}
dat <- clear.labels(dat)
However, this does not work because the class is haven_labelled, not labelled. Obviously I could change that, but I would rather have something that works independently of the class name.
lapply(dat, class)
$dat
[1] "haven_labelled" "vctrs_vctr" "double"
I also tried:
dat <- data.frame(lapply(dat, unclass))
lapply(dat, class)
$dat
[1] "numeric"
For my actual data, however, it does not seem to work, even though it has exactly the same structure.
Are there any other options I could try?
EDIT: Would it not be a possibility to simply make the last class the only class?
Answer:
Use haven's zap_*() functions:
library(haven)
zapped <- dat |>
zap_labels() |>
zap_formats()
zapped
# dat
# 1 1
# 2 NA
class(zapped$dat)
# "numeric"
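If you would rather not depend on the class name at all, a base R approach is also possible. The following is only a minimal sketch, not part of haven: strip_labelled is a hypothetical helper, and the id/sex data frame is made up for illustration. It assumes that a labelled column can be recognised by its "labels" attribute, and it drops every attribute (class included) from such columns. Restricting it to columns with "labels" matters, because stripping all attributes from a factor or Date column would turn it into bare numbers.

```r
# Hypothetical helper (not from haven): remove all attributes, including the
# class, from every column that carries a "labels" attribute.
strip_labelled <- function(df) {
  df[] <- lapply(df, function(col) {
    if (!is.null(attr(col, "labels", exact = TRUE))) {
      attributes(col) <- NULL  # drops class, labels, format.stata, ...
    }
    col
  })
  df
}

# Made-up example data with the same attributes as in the question
sex <- structure(
  c(1, NA_real_),
  format.stata = "%8.0g",
  labels = c(female = 1, male = 2),
  class = c("haven_labelled", "vctrs_vctr", "double")
)
df <- data.frame(id = 1:2)
df$sex <- sex

cleaned <- strip_labelled(df)
class(cleaned$sex)
# [1] "numeric"
```

Columns without a "labels" attribute (id here) are returned unchanged. If your real data still behaves differently, comparing lapply(dat, attributes) between the real data and the example may show which attribute or class differs.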