Indexing a tibble makes data type undetectable

Time:02-03

I have the following tibble:

test <- tibble(
  id = c("John","Jacob","Jingleheimer","Schmidt"),
  score = c(2,4,6,8)
)

The variable "score" is numeric. When I run the command is.numeric, I get the following:

> is.numeric(test$score)
[1] TRUE

But when I try to do the same thing, only this time referencing the column by its index, I get a different output:

> is.numeric(test[,2])
[1] FALSE

I'm confused as to why I'm getting different output from two versions of what looks like the same command. Why can't is.numeric detect the data type when I use indexing?

CodePudding user response:

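Single-bracket indexing works differently for tibbles and data frames: test[, 2] on a tibble returns a one-column tibble rather than a vector, and a tibble itself is not numeric.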

library(dplyr)
test <- tibble(
  id = c("John","Jacob","Jingleheimer","Schmidt"),
  score = c(2,4,6,8)
)
 
class(test[, 2])
## [1] "tbl_df"     "tbl"        "data.frame"

class(as.data.frame(test)[, 2])
## [1] "numeric"

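The basic difference is the default of the drop argument: when a single column is selected with [, 2], a data frame defaults to drop = TRUE and returns a vector, whereas a tibble defaults to drop = FALSE and returns a one-column tibble.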

class(test[, 2])
## [1] "tbl_df"     "tbl"        "data.frame"

class(test[, 2, drop = FALSE]) # same
## [1] "tbl_df"     "tbl"        "data.frame"

class(test[, 2, drop = TRUE])
## [1] "numeric"


class(as.data.frame(test)[, 2])
## [1] "numeric"

class(as.data.frame(test)[, 2, drop = TRUE]) # same
## [1] "numeric"

class(as.data.frame(test)[, 2, drop = FALSE])
## [1] "data.frame"
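
To tie this back to the original question, here is a minimal sketch of how is.numeric behaves on each form. It assumes the tibble package is installed (it comes with dplyr); the res_* variable names are only for illustration.

```r
library(tibble)

test <- tibble(
  id = c("John", "Jacob", "Jingleheimer", "Schmidt"),
  score = c(2, 4, 6, 8)
)

# test[, 2] is still a tibble (a list of columns), so it is not numeric itself
res_tibble <- is.numeric(test[, 2])

# Asking for drop = TRUE returns the underlying vector
res_drop <- is.numeric(test[, 2, drop = TRUE])

# [[ ]] extracts the column as a vector for both tibbles and data frames
res_double <- is.numeric(test[[2]])
res_df <- is.numeric(as.data.frame(test)[[2]])

print(c(res_tibble, res_drop, res_double, res_df))
## [1] FALSE  TRUE  TRUE  TRUE
```

Because [[ ]] behaves the same for tibbles and data frames, it is probably the safest choice when the code may receive either.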

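Also note these other ways to extract the column as a plain vector, or to check the class of every column at once: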

class(pull(test, 2))
## [1] "numeric"

class(test[[2]])
## [1] "numeric"

class(unlist(test[, 2]))
## [1] "numeric"

sapply(test, class)
##          id       score 
## "character"   "numeric" 
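
If the goal is to test every column for numeric type rather than a single one, a possible sketch is to apply is.numeric column by column. This again assumes the tibble package is available; numeric_cols is a hypothetical name used only here.

```r
library(tibble)

test <- tibble(
  id = c("John", "Jacob", "Jingleheimer", "Schmidt"),
  score = c(2, 4, 6, 8)
)

# A tibble is a list of columns, so vapply visits each column as a vector
numeric_cols <- vapply(test, is.numeric, logical(1))
print(numeric_cols)
##    id score 
## FALSE  TRUE 

# Names of the numeric columns
print(names(test)[numeric_cols])
## [1] "score"
```

vapply is used instead of sapply here so that the result is guaranteed to be a logical vector with one entry per column.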