I'm relatively new to R, and my weak spot is writing concise code. I keep running into the same problem and can't seem to solve it with a loop or a function, so I'd love some help.
Let's say my df looks like this:
a = c(12, 9, 11, 17, 22)
b = c(8, 1, 9, 4, 15)
c = c(2, 4, 1, 8, 4)
d = c(2, 4, 1, 5, 3)
df = data.frame(a, b, c, d)
I want to calculate b, c and d as a percentage of a, with a new column for each result. My code without functions etc. looks like this:
df$b_p = round((df$b / df$a)*100, digits = 2)
df$c_p = round((df$c / df$a)*100, digits = 2)
df$d_p = round((df$d / df$a)*100, digits = 2)
What's the easiest way to get the same output without copy-pasting the code over and over again? My real data frame is much bigger, and it's time for me to learn how to do this more efficiently.
Thank you!
CodePudding user response:
You can take advantage of R's vectorization.
cols <- names(df)[-1]
#OR
#cols <- c('b', 'c', 'd')
df[paste0(cols, '_p')] <- round(df[cols]/df$a * 100, 2)
df
#   a  b c d   b_p   c_p   d_p
#1 12  8 2 2 66.67 16.67 16.67
#2  9  1 4 4 11.11 44.44 44.44
#3 11  9 1 1 81.82  9.09  9.09
#4 17  4 8 5 23.53 47.06 29.41
#5 22 15 4 3 68.18 18.18 13.64
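Since the question asks about a function, the same vectorized step can be wrapped in one. This is only a sketch: `add_pct_cols()` and its arguments (`cols`, `denom`, `digits`, `suffix`) are hypothetical names, not part of base R or of the answer above.

```r
# Hypothetical helper (not from base R): adds one percentage column per
# entry in `cols`, each computed relative to the `denom` column.
add_pct_cols <- function(data, cols, denom, digits = 2, suffix = "_p") {
  # Dividing a data frame by a vector of length nrow(data) divides
  # every selected column element-wise by that vector.
  data[paste0(cols, suffix)] <- round(data[cols] / data[[denom]] * 100, digits)
  data
}

# Example data from the question
df <- data.frame(
  a = c(12, 9, 11, 17, 22),
  b = c(8, 1, 9, 4, 15),
  c = c(2, 4, 1, 8, 4),
  d = c(2, 4, 1, 5, 3)
)

out <- add_pct_cols(df, cols = c("b", "c", "d"), denom = "a")
out
```

For a bigger data frame you would probably pass something like `setdiff(names(df), "a")` as `cols` instead of typing the names out.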
CodePudding user response:
An alternative (and elegant) solution is based on dplyr:
library(dplyr)
df %>%
  # the formula goes inside across(); `.` is the current column
  mutate(across(b:d, ~ . / a * 100)) %>%
  select(-a)
b c d
1 66.66667 16.666667 16.666667
2 11.11111 44.444444 44.444444
3 81.81818 9.090909 9.090909
4 23.52941 47.058824 29.411765
5 68.18182 18.181818 13.636364
Or, with rounding:
df %>%
  mutate(across(b:d, ~ round(. / a * 100, 2))) %>%
  select(-a)
EDIT:
To keep the original columns, use cbind:
df %>%
  mutate(across(b:d, ~ round(. / a * 100, 2))) %>%
  rename(b_p = b, c_p = c, d_p = d) %>%
  select(-a) %>%
  # bind the renamed percentage columns onto the original data frame
  cbind(df, .)
a b c d b_p c_p d_p
1 12 8 2 2 66.67 16.67 16.67
2 9 1 4 4 11.11 44.44 44.44
3 11 9 1 1 81.82 9.09 9.09
4 17 4 8 5 23.53 47.06 29.41
5 22 15 4 3 68.18 18.18 13.64
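As a possible shortcut for the rename-and-cbind step, `across()` has a `.names` argument (assuming dplyr 1.0.0 or later) that writes the results to new columns and leaves the originals in place. A minimal sketch with the question's data; `res` is just an illustrative name:

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  a = c(12, 9, 11, 17, 22),
  b = c(8, 1, 9, 4, 15),
  c = c(2, 4, 1, 8, 4),
  d = c(2, 4, 1, 5, 3)
)

# "{.col}_p" names each new column after its source column plus "_p",
# so a, b, c and d are kept and b_p, c_p, d_p are appended.
res <- df %>%
  mutate(across(b:d, ~ round(.x / a * 100, 2), .names = "{.col}_p"))

res
```

With many columns, the `b:d` selection could be swapped for a tidyselect expression such as `-a`, which applies the formula to every column except a.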