Select 90% of the sample from the birthwt data randomly

Time:01-25

library(MASS)
bwt11<-birthwt

How can I randomly select 90% of the rows from the bwt11 data using this R command?

y1<-sample(bwt11,0.9,replace=TRUE)

CodePudding user response:

Let's check the number of rows that this data frame has:

library(MASS)

nrow(birthwt)
#> [1] 189

Obviously we can't get exactly 90% of this because 90% of 189 isn't an integer. The closest we can get is:

round(nrow(birthwt) * 0.9)
#> [1] 170
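The arithmetic behind that figure can be checked directly. The lines below are only an illustration of the rounding choices; which one to use depends on whether the sample may be slightly under or slightly over 90%.

```r
n <- 189          # number of rows in birthwt
n * 0.9           # 170.1, not a whole number
round(n * 0.9)    # 170, the nearest whole number of rows
floor(n * 0.9)    # 170, never exceeds 90% of the rows
ceiling(n * 0.9)  # 171, never falls below 90% of the rows
```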

So, we want to take 170 rows at random from this data frame. To do this, we can use sample to draw 170 distinct random integers between 1 and 189 (that is, sampling without replacement) and subset the data frame by this vector:

birthwt[sample(189, 170, replace = FALSE),]

If you would rather not hard-code 189 and 170, you can compute both values from the data frame itself:

birthwt[sample(nrow(birthwt), round(nrow(birthwt) * 0.9), replace = FALSE),]
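The same idea can be wrapped in a small helper. This is only a sketch: the function name sample_frac_rows and its arguments are hypothetical (not part of MASS or base R), and the set.seed() call is an assumption added so that the draw is repeatable.

```r
library(MASS)  # provides the birthwt data frame

# Hypothetical helper: return a random fraction of the rows of a data frame.
sample_frac_rows <- function(df, frac = 0.9) {
  n <- nrow(df)
  size <- round(n * frac)                  # nearest whole number of rows
  idx <- sample(n, size, replace = FALSE)  # distinct row indices in 1..n
  df[idx, , drop = FALSE]
}

set.seed(1)  # arbitrary seed, used here only for reproducibility
bwt_sub <- sample_frac_rows(birthwt, 0.9)
nrow(bwt_sub)
#> [1] 170
```

Without set.seed(), each run returns a different set of 170 rows.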

CodePudding user response:

If you want to select 90% of your sample, then setting replace = TRUE does not make much sense, because you would most probably select some observations multiple times.
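To see the effect, one can compare the two settings on plain row indices. This is a minimal sketch; the seed is arbitrary and the variable names are made up for illustration.

```r
set.seed(42)  # arbitrary seed, assumed only so the draw is repeatable
n <- 189      # number of rows in birthwt

with_repl    <- sample(n, 170, replace = TRUE)
without_repl <- sample(n, 170, replace = FALSE)

# With replacement, some indices almost certainly repeat,
# so fewer than 170 distinct rows are selected.
length(unique(with_repl))

# Without replacement, every index appears at most once.
length(unique(without_repl))
```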

Since bwt11 is a data frame, you can get its number of rows and sample 90% of them:

library(MASS)
bwt11 <- birthwt
n_bwt11 <- nrow(bwt11)
samp_percentile <- 0.9
samp_size <- round(samp_percentile * n_bwt11)
y1 <- sample(n_bwt11, samp_size, replace = FALSE)
bwt11_samp <- bwt11[y1, ]
dim(bwt11)
#> [1] 189  10
dim(bwt11_samp)
#> [1] 170  10