I have example data as follows:
library(data.table)
sample <- fread("
A,0,L,2,2
B,4,K,3,0
")
names(sample) <- c("type", "value1", "cat", "value2", "value3")
I would like to add a random value to every numeric value in the dataset. The random value should have a mean of 0 and a standard deviation of 0.1 times the mean of its column.
I made a start as follows, but I am confused about how to store the values and how to use the apply function correctly here.
fetch_cols <- which(sapply(sample, is.numeric))
lapply(sample[, fetch_cols], rnorm(1, 0, 0.1*mean(sample[, fetch_cols])))
Desired output would be something like:
library(data.table)
sample <- fread("
A,-0.2,L,1.9,2.15
B,4.2,K,3.1,-.05
")
CodePudding user response:
With dplyr you can use across:
library(dplyr)
sample %>%
  mutate(across(where(is.numeric), \(x) x + rnorm(n(), mean = 0, sd = 0.1 * mean(x))))
Or with data.table:
library(data.table)
num_cols <- names(sample)[sapply(sample, is.numeric)]
sample[,
  (num_cols) := lapply(.SD, \(x) x + rnorm(.N, mean = 0, sd = 0.1 * mean(x))),
.SDcols = num_cols
]
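Two things to keep in mind with the data.table version: := modifies sample by reference, and rnorm() returns NaN (with a warning) when sd is negative, which would happen for any column whose mean is below zero. Below is a minimal sketch that works on a copy and scales by the absolute value of the mean. The name noisy, the abs() guard, and the seed are my own assumptions, not part of the question, and the data is built with data.table() directly so the block is self-contained.

```r
library(data.table)

# Same example data as in the question, built without fread
sample <- data.table(
  type = c("A", "B"), value1 = c(0, 4), cat = c("L", "K"),
  value2 = c(2, 3), value3 = c(2, 0)
)

set.seed(42)  # assumed seed, only so the run is repeatable

num_cols <- names(sample)[sapply(sample, is.numeric)]

# copy() keeps the original table untouched, since := updates in place
noisy <- copy(sample)
noisy[,
  (num_cols) := lapply(.SD, function(x) {
    # abs() is an added guard: rnorm() needs a non-negative sd
    x + rnorm(.N, mean = 0, sd = 0.1 * abs(mean(x)))
  }),
  .SDcols = num_cols
]
```

If overwriting sample is what you want, drop the copy() and run the := call on sample itself.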
CodePudding user response:
You could use lapply to conditionally add the values you want:
sample[] <- lapply(sample, function(x) {
  if(!is.numeric(x)) x else x + rnorm(length(x), 0, 0.1 * mean(x))
})
#> type value1 cat value2 value3
#> 1: A 0.147823 L 2.324099 1.8397374
#> 2: B 4.077322 K 2.799110 0.0933251
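Because the noise is random, the numbers above will differ on every run. If you need repeatable results, one option is to wrap the same lapply logic in a small function and call set.seed() first. This is only a sketch: add_noise, its scale argument, and the seed value are hypothetical names chosen for illustration, and it uses a plain data.frame so it does not depend on data.table.

```r
# Hypothetical helper: adds N(0, scale * column mean) noise to numeric columns
add_noise <- function(df, scale = 0.1) {
  df[] <- lapply(df, function(x) {
    if (!is.numeric(x)) return(x)  # leave character columns as they are
    x + rnorm(length(x), mean = 0, sd = scale * mean(x))
  })
  df
}

sample <- data.frame(
  type = c("A", "B"), value1 = c(0, 4), cat = c("L", "K"),
  value2 = c(2, 3), value3 = c(2, 0)
)

# Resetting the same seed before each call gives the same noise
set.seed(123)
first <- add_noise(sample)
set.seed(123)
second <- add_noise(sample)

identical(first, second)
#> [1] TRUE
```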