Essentially, what I want to do is use the summary() function in R on only specific columns of my df.
Basically doing this (using the cars df as an example):
cars_summary <- summary(cars)
speed_summary <- cars_summary$speed
When I try to do this I get an error saying:
$ operator is invalid for atomic vectors
What does that mean, and is there a way to do this without sapply()?
Thanks!
CodePudding user response:
Up front, I think you can just do summary(cars$speed) to get what you want.
summary(cars$speed)
#    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
#     4.0    12.0    15.0    15.4    19.0    25.0
If you want this for multiple columns, and speed is just one example, then try this:
cars_summary <- lapply(cars, summary)
cars_summary$speed
#    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
#     4.0    12.0    15.0    15.4    19.0    25.0
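If only a few columns are wanted, one option is to subset the data frame before summarising. A minimal sketch, where `wanted` and `wanted_summary` are names made up for this example:

```r
# Columns to summarise; `wanted` is a hypothetical name used only in this sketch.
wanted <- c("speed", "dist")

# Single-bracket indexing with a character vector keeps a data frame,
# so lapply() still iterates over columns and returns a named list.
wanted_summary <- lapply(cars[wanted], summary)

# `$` works here because the result is a list, not a matrix.
wanted_summary$speed

# Pull one statistic out of one column's summary by name.
wanted_summary[["dist"]][["Mean"]]
```

Because the result is a plain list of per-column summaries, the numbers stay numeric rather than being turned into formatted strings.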
However, there are some other things going on.
The column names of the summary matrix are buffered/padded so that the names appear to be centered over the stats. This is purely aesthetic, but it does make the names a little unpredictable (well, not easy) to capture.
cars_summary <- summary(cars)
dimnames(cars_summary)
# [[1]]
# [1] "" "" "" "" "" ""
# [[2]]
# [1] "    speed" "     dist"
It's a matrix, so you cannot use $-indexing on it. One would instead need to use [,"speed"] or whatever.
cars_summary[,"    speed"]
#
# "Min.   : 4.0  " "1st Qu.:12.0  " "Median :15.0  " "Mean   :15.4  " "3rd Qu.:19.0  " "Max.   :25.0  "
### or perhaps
colnames(cars_summary) <- names(cars)
cars_summary[,"speed"]
#
# "Min.   : 4.0  " "1st Qu.:12.0  " "Median :15.0  " "Mean   :15.4  " "3rd Qu.:19.0  " "Max.   :25.0  "
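To tie this back to the error message: $ looks elements up by name in a list (and a data frame is a list of columns), whereas a matrix is an atomic vector with dimensions. A short sketch of the difference, plus one way to find a padded column without typing the spaces by hand; trimws() is base R, and `speed_col` is just an illustrative name:

```r
cars_summary <- summary(cars)

is.list(cars)            # TRUE: a data frame is a list, so cars$speed works
is.atomic(cars_summary)  # TRUE: a character matrix, so cars_summary$speed errors
is.matrix(cars_summary)

# Match on the trimmed column names instead of hard-coding the padding.
speed_col <- which(trimws(colnames(cars_summary)) == "speed")
cars_summary[, speed_col]
```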
Hrrmmm, it's a matrix, but it's a matrix of strings, as you can see above and here:
### back to the original cars_summary
str(cars_summary)
#  'table' chr [1:6, 1:2] "Min.   : 4.0  " "1st Qu.:12.0  " "Median :15.0  " "Mean   :15.4  " "3rd Qu.:19.0  " "Max.   :25.0  " ...
#  - attr(*, "dimnames")=List of 2
#   ..$ : chr [1:6] "" "" "" "" ...
#   ..$ : chr [1:2] "    speed" "     dist"
While one could certainly use regular expressions or such to extract those numbers, the strings hold rounded, formatted values, so there will be loss of precision/accuracy.
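If the numbers themselves are what you need, it is probably simpler to skip the formatted table and compute the statistics per column. A rough sketch in base R; the helper `num_summary` and its row labels are invented for this example and are not part of summary():

```r
# Hypothetical helper: the same six statistics that summary() reports, kept numeric.
num_summary <- function(x) {
  q <- quantile(x, names = FALSE)  # min, 1st quartile, median, 3rd quartile, max
  c(Min = q[1], Q1 = q[2], Median = q[3], Mean = mean(x), Q3 = q[4], Max = q[5])
}

# vapply() checks that each column yields six numbers and returns a numeric matrix
# with one row per statistic and one column per variable.
stats <- vapply(cars[c("speed", "dist")], num_summary, numeric(6))
stats["Mean", "dist"]
```

The result is an ordinary numeric matrix, so it can be indexed by row and column name and used in further arithmetic without any string parsing.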