I have a large data frame in which I calculated the mean of each column, grouped by a given tag. I used these means for a scatterplot, but I also need to add error bars given by the standard deviation. Is there a way to compute the standard deviation via setDT as well, since this is where I calculated my means?
The code I'm talking about is:
setDT(df)[, lapply(.SD, mean, na.rm=TRUE), keyby = tag]
Answer:
From your question, I understand you intend to use the data.table package. setDT just converts a standard data.frame to a data.table.
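As a minimal sketch of what that conversion looks like (the toy data frame and its column names below are made up for illustration, they are not from your data): setDT modifies the object in place, so no reassignment is needed.

```r
library(data.table)

# Hypothetical toy data, used only for illustration
df <- data.frame(tag = c("a", "a", "b"), x = c(1, 2, 3))
class(df)           # a plain data.frame at this point

setDT(df)           # converts df in place; no "df <- ..." needed
class(df)           # now a data.table (which is also a data.frame)
is.data.table(df)
```

The grouping and the mean or sd calculation happen in the [ ... ] call that follows setDT, not in setDT itself.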
A possible solution, with the iris dataset as an example:
library(data.table)
# take a copy of iris
df <- copy(iris)
# rename "Species" to "tag"
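setnames(df, "Species", "tag")
# compute the mean and the sd of every non-tag column within each tag
result <- setDT(df)[, c(lapply(.SD, mean, na.rm = TRUE),
                        lapply(.SD, sd, na.rm = TRUE)), keyby = tag]
# Rename columns: the means come first, then the sds, in the same column order
old <- setdiff(colnames(df), "tag")
new <- c("tag", paste0(old, ".mean"), paste0(old, ".sd"))
setnames(result, new)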
result
#> Key: <tag>
#> tag Sepal.Length.mean Sepal.Width.mean Petal.Length.mean
#> <fctr> <num> <num> <num>
#> 1: setosa 5.006 3.428 1.462
#> 2: versicolor 5.936 2.770 4.260
#> 3: virginica 6.588 2.974 5.552
#> Petal.Width.mean Sepal.Length.sd Sepal.Width.sd Petal.Length.sd
#> <num> <num> <num> <num>
#> 1: 0.246 0.3524897 0.3790644 0.1736640
#> 2: 1.326 0.5161711 0.3137983 0.4699110
#> 3: 2.026 0.6358796 0.3224966 0.5518947
#> Petal.Width.sd
#> <num>
#> 1: 0.1053856
#> 2: 0.1977527
#> 3: 0.2746501
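For the error bars themselves, here is a rough sketch using base graphics. It assumes you plot the mean of one column against the mean of another; the choice of Sepal.Length and Petal.Length and the names x.mean, y.mean and y.sd are arbitrary picks for this example. The bars are drawn with arrows(), using flat heads at both ends.

```r
library(data.table)

# Same iris-based example, with "Species" renamed to "tag"
df <- as.data.table(iris)
setnames(df, "Species", "tag")

# Per-tag mean and sd of the plotted columns (names chosen for this sketch)
stats <- df[, .(x.mean = mean(Sepal.Length, na.rm = TRUE),
                y.mean = mean(Petal.Length, na.rm = TRUE),
                y.sd   = sd(Petal.Length, na.rm = TRUE)),
            keyby = tag]

# Scatterplot of the means, with y limits wide enough to fit the bars
plot(stats$x.mean, stats$y.mean, pch = 19,
     ylim = range(stats$y.mean - stats$y.sd, stats$y.mean + stats$y.sd),
     xlab = "Sepal.Length (mean)", ylab = "Petal.Length (mean +/- sd)")

# Vertical error bars from mean - sd to mean + sd
arrows(stats$x.mean, stats$y.mean - stats$y.sd,
       stats$x.mean, stats$y.mean + stats$y.sd,
       angle = 90, code = 3, length = 0.05)
```

If you already use ggplot2, geom_errorbar with ymin and ymax set to mean - sd and mean + sd should give the same kind of plot from the same table.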