Rounding numbers to the nearest decimal places in R

Time:02-05

Consider the following data set:

df <- data.frame(id = 1:10,
                 v1 = c(2.35456185, 1.44501001, 2.98712312, 0.12345123, 0.96781234,
                        1.23934551, 5.00212233, 4.34120000, 1.23443213, 0.00112233),
                 v2 = c(0.22222222, 0.00123456, 2.19024869, 0.00012000, 0.00029848,
                        0.12348888, 0.46236577, 0.85757000, 0.05479729, 0.00001202))

My intention is to round the values in v1 and v2 to one decimal place (10% of observations), two decimal places (40% of observations), or three decimal places (50% of observations), with the number of decimals assigned at random. I can use the round() function to round all numbers to the same number of decimal places. In my case, however, the number of decimals is not uniform. Thank you in advance!

Example of output needed (of course mine is not random):

id   v1    v2
 1   2.3   0.2
 2   1.45  0
 3   2.99  2.19
 4   0.12  0
 5   0.97  0
 6   1.239 0.123
 7   5.002 0.462
 8   4.341 0.858
 9   1.234 0.055
10   0.001 0

CodePudding user response:

Update: Addressing the probabilities:

library(dplyr)
    
df %>%
  rowwise() %>%
  # store v1 rounded to 1, 2 or 3 decimals (probabilities 0.1, 0.4, 0.5) in v2
  mutate(v2 = round(v1, sample(1:3, 1, prob = c(0.1, 0.4, 0.5))))

-output

      id      v1    v2
   <int>   <dbl> <dbl>
 1     1 2.35     2.35
 2     2 1.45     1.44
 3     3 2.99     2.99
 4     4 0.123    0.12
 5     5 0.968    1   
 6     6 1.24     1.24
 7     7 5.00     5.00
 8     8 4.34     4.34
 9     9 1.23     1.2 
10    10 0.00112  0   

Here we round row-wise to a randomly chosen number of decimal places between 1 and 3 (each equally likely):

library(dplyr)

df %>%
  rowwise() %>%
  # V2 holds v1 rounded to a random number of decimals (1 to 3)
  mutate(V2 = round(v1, sample(1:3, 1)))

-output

      id      v1    V2
   <int>   <dbl> <dbl>
 1     1 2.35    2.36 
 2     2 1.45    1.44 
 3     3 2.99    2.99 
 4     4 0.123   0.123
 5     5 0.968   0.968
 6     6 1.24    1.24 
 7     7 5.00    5.00 
 8     8 4.34    4.34 
 9     9 1.23    1.23 
10    10 0.00112 0.001
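
Because sample() draws new digits on every call, the printed values above will differ from run to run. A minimal base R sketch of making the result reproducible with set.seed(); the seed value 42 and the variable names are arbitrary choices for illustration, not part of the original answer:

```r
# v1 from the question
v1 <- c(2.35456185, 1.44501001, 2.98712312, 0.12345123, 0.96781234,
        1.23934551, 5.00212233, 4.34120000, 1.23443213, 0.00112233)

set.seed(42)  # arbitrary seed, chosen only for illustration
first_run <- round(v1, sample(1:3, length(v1), replace = TRUE))

set.seed(42)  # resetting the seed replays the same random draw
second_run <- round(v1, sample(1:3, length(v1), replace = TRUE))

identical(first_run, second_run)  # TRUE: same seed, same digits, same result
```

Without the set.seed() calls, the two runs would generally round some rows differently.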

CodePudding user response:

We may create a grouping with sample() based on the probabilities, and then round the v1 column based on the value of the group:

library(dplyr)
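df %>%
  # assign each row a number of decimals (1, 2 or 3) with the given probabilities
  group_by(grp = sample(1:3, size = n(), replace = TRUE,
                        prob = c(0.10, 0.4, 0.5))) %>%
  # within each group, round v1 to that group's number of decimals
  mutate(v1 = round(v1, first(grp))) %>%
  ungroup() %>%
  select(-grp)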

-output

# A tibble: 10 × 2
      id    v1
   <int> <dbl>
 1     1 2.36 
 2     2 1.44 
 3     3 2.99 
 4     4 0.123
 5     5 0.97 
 6     6 1.24 
 7     7 5.00 
 8     8 4.3  
 9     9 1.23 
10    10 0    

For multiple columns, use across() to loop over them:

df %>%
   mutate(across(v1:v2, ~ round(.x, sample(1:3, size = n(),
    replace = TRUE, prob = c(0.10, 0.40, 0.50)))))
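
In the across() call above, the lambda is evaluated once per column, so v1 and v2 each get their own independent draw of digits. If both columns in a row should share the same number of decimals, one option is to sample once and reuse the result. This is a sketch under that assumption; the name digits_per_row and the seed are made up for illustration:

```r
library(dplyr)

df <- data.frame(id = 1:10,
                 v1 = c(2.35456185, 1.44501001, 2.98712312, 0.12345123, 0.96781234,
                        1.23934551, 5.00212233, 4.34120000, 1.23443213, 0.00112233),
                 v2 = c(0.22222222, 0.00123456, 2.19024869, 0.00012000, 0.00029848,
                        0.12348888, 0.46236577, 0.85757000, 0.05479729, 0.00001202))

set.seed(1)  # arbitrary seed, only to make the sketch reproducible

# one draw per row, shared by every column that gets rounded
digits_per_row <- sample(1:3, size = nrow(df), replace = TRUE,
                         prob = c(0.10, 0.40, 0.50))

df_rounded <- df %>%
  mutate(across(v1:v2, ~ round(.x, digits_per_row)))
```

Here row i of v1 and row i of v2 are both rounded to digits_per_row[i] decimals.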

Or pass the sampled values directly to the digits argument of round():

df$v1 <- with(df, round(v1, sample(1:3, size = nrow(df), 
    replace = TRUE, prob = c(0.10, 0.4, 0.5))))
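
Note that sample() with prob only gives the 10% / 40% / 50% split on average; in a single run with 10 rows the actual counts can differ. If exact counts are wanted, one possible approach is to build the digits vector with the required counts and shuffle it. A minimal base R sketch, assuming exactly 10 rows as in the question (the counts 1, 4 and 5 are hard-coded for that case, and the variable names are illustrative):

```r
# v1 from the question
v1 <- c(2.35456185, 1.44501001, 2.98712312, 0.12345123, 0.96781234,
        1.23934551, 5.00212233, 4.34120000, 1.23443213, 0.00112233)

set.seed(1)  # arbitrary seed, only to make the sketch reproducible

# one 1, four 2s and five 3s, shuffled into a random order
digits_exact <- sample(rep(1:3, times = c(1, 4, 5)))

v1_rounded <- round(v1, digits_exact)

table(digits_exact)  # counts are always 1, 4 and 5; only the order is random
```

For a different number of rows the counts passed to rep() would need to be recomputed so that they sum to the number of rows.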