I have a data frame in which every column has the class character. I want to automatically convert each column to the class that fits its data "best".
Consider the following example data:
data <- data.frame(x1 = letters[1:5],
                   x2 = as.character(1:5),
                   x3 = as.character(seq(0.2, 1, 0.2)))
data
#   x1 x2  x3
# 1  a  1 0.2
# 2  b  2 0.4
# 3  c  3 0.6
# 4  d  4 0.8
# 5  e  5   1
All columns in our example data have the character class:
sapply(data, class)
# x1 x2 x3
# "character" "character" "character"
I could convert each column to the desired class manually. However, for large data sets this might not be efficient.
Is there a way to automatically scan the values in each column and convert the corresponding column to a better class?
In this example, column x2 contains integers and column x3 contains non-integer numbers. The desired classes would hence look like this:
sapply(data, class)
# x1 x2 x3
# "character" "integer" "numeric"
CodePudding user response:
Use type.convert(). Setting as.is=TRUE prevents character columns from being coerced to factors.
data <- data |> type.convert(as.is=TRUE)
str(data)
# 'data.frame': 5 obs. of 3 variables:
# $ x1: chr "a" "b" "c" "d" ...
# $ x2: int 1 2 3 4 5
# $ x3: num 0.2 0.4 0.6 0.8 1
For R < 4.1, where the native pipe |> is not available:
data <- type.convert(data, as.is=TRUE)
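To check the result against the classes asked for in the question, and to see how type.convert() treats less tidy columns, here is a minimal sketch. The vectors `mixed` and `with_na` are hypothetical additions for illustration and are not part of the original data; `stringsAsFactors = FALSE` is added only so the example behaves the same on R < 4.0.

```r
# Same example data as in the question: every column is character.
data <- data.frame(x1 = letters[1:5],
                   x2 = as.character(1:5),
                   x3 = as.character(seq(0.2, 1, 0.2)),
                   stringsAsFactors = FALSE)

# Convert each column to the most specific type that fits all of its values.
converted <- type.convert(data, as.is = TRUE)
classes <- sapply(converted, class)
print(classes)
#          x1          x2          x3
# "character"   "integer"   "numeric"

# Hypothetical vector: a single non-numeric value keeps the result character.
mixed <- type.convert(c("1", "2", "three"), as.is = TRUE)
print(class(mixed))
# "character"

# Hypothetical vector: the string "NA" is read as missing (default na.strings = "NA"),
# so the remaining values still decide the type.
with_na <- type.convert(c("1", "NA", "3"), as.is = TRUE)
print(class(with_na))
# "integer"
```

Note that the conversion is all-or-nothing per column: one stray text value is enough to leave the whole column as character, so it can be worth inspecting such columns before relying on the automatic result.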