Checking data type for each variable in an R data frame

I am wondering what causes the difference between these two commands:

apply(DF,2,is.numeric)
c(ID = FALSE, diet = FALSE, height = FALSE, weight = FALSE, gender = FALSE, 
wohn = FALSE, social = FALSE, alter = FALSE, d13C = FALSE, d15N = FALSE, 
ferr = FALSE, VitB = FALSE)

sapply(DF,is.numeric)
c(ID = FALSE, diet = FALSE, height = TRUE, weight = TRUE, gender = FALSE, 
wohn = FALSE, social = FALSE, alter = TRUE, d13C = TRUE, d15N = TRUE, 
ferr = TRUE, VitB = TRUE)

I thought I could use the first one for data frames too? Many thanks

CodePudding user response:

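apply first converts its input to a matrix. A matrix can hold data of only one type, so if the data frame has columns of mixed class (numeric and character), the numeric columns are converted to character values, and is.numeric returns FALSE for every column. sapply, by contrast, treats the data frame as a list and passes each column to the function with its original type.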

Here's an example to demonstrate what you are observing.

DF <- data.frame(a = 1:5, b = letters[1:5])
apply(DF, 2, is.numeric)

#    a     b 
#FALSE FALSE 

sapply(DF, is.numeric)

#    a     b 
# TRUE FALSE
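
The coercion can also be inspected directly. The following is a minimal sketch that assumes the conversion apply performs on a data frame is equivalent to calling as.matrix on it; the variable name m is only for illustration.

```r
# Same mixed-type data frame as in the example above
DF <- data.frame(a = 1:5, b = letters[1:5])

# Convert to a matrix by hand (assumed to mirror what apply does internally)
m <- as.matrix(DF)

# The whole matrix now has a single storage type
typeof(m)
# "character"

# Column a is no longer numeric once it lives inside the character matrix
is.numeric(m[, "a"])
# FALSE
```

This is why every column fails the is.numeric test under apply, even though column a is an integer vector in the original data frame.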

In contrast, if all the columns of the data frame are numeric, the matrix stays numeric and apply returns TRUE for every column.

DF <- data.frame(a = 1:5, b = 1:5)
apply(DF, 2, is.numeric)

#   a    b 
#TRUE TRUE
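
For checking the type of each column in general, it is usually safer to stay with functions that iterate over the columns as a list. The sketch below is one possible way to do that, not the only one; the column c and the names num_cols and col_classes are made up for illustration, and stringsAsFactors = FALSE is set explicitly so the result does not depend on the R version.

```r
# Example data frame with integer, character and double columns
DF <- data.frame(a = 1:5,
                 b = letters[1:5],
                 c = c(1.5, 2.5, 3.5, 4.5, 5.5),
                 stringsAsFactors = FALSE)

# vapply works column by column and checks that each result is one logical value
num_cols <- vapply(DF, is.numeric, logical(1))
num_cols
#     a     b     c 
#  TRUE FALSE  TRUE 

# First class of each column, as a named character vector
col_classes <- vapply(DF, function(x) class(x)[1], character(1))
col_classes
#           a           b           c 
#   "integer" "character"   "numeric" 

# Keep only the numeric columns
DF_numeric <- DF[num_cols]
```

str(DF) prints a similar per-column summary if only a quick look at the console is needed.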