I have a sample dataframe as below:
self  race1  race2  race3  race4
1     1      2      2      1
2     1      1      1      1
3     1      3      1      1
4     2      1      3      1
I would like to add a new column holding the proportion of 1s in the race columns. That is, for each row, I would count the number of 1s across race1 to race4 and divide that count by 4. The desired output dataframe would look like this:
self  race1  race2  race3  race4  prop_race_as1
1     1      2      2      1      2/4
2     1      1      1      1      4/4
3     1      3      1      1      3/4
4     2      1      3      1      2/4
How do I write a function that incorporates rowwise() to get the desired output?
CodePudding user response:
Assuming your data is in df, you can get the ratios with:
ratios <- apply(data.matrix(df)[,-1], 1, function(x) length(which(x == 1)) / (ncol(df)-1))
and then attach them with cbind(df, ratios).
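For reference, the same idea can be written as a self-contained sketch. It assumes the sample data from the question and selects the race columns by name rather than by dropping the first column; the helper variable race_cols and the output column name are illustrative choices, not part of the answer above.

```r
# Sample data, copied from the question
df <- data.frame(
  self  = 1:4,
  race1 = c(1L, 1L, 1L, 2L),
  race2 = c(2L, 1L, 3L, 1L),
  race3 = c(2L, 1L, 1L, 3L),
  race4 = c(1L, 1L, 1L, 1L)
)

# Pick the race columns by name (assumption: they all start with "race")
race_cols <- grep("^race", names(df), value = TRUE)

# mean() of a logical vector is the share of TRUEs, i.e. (count of 1s) / (number of columns)
ratios <- apply(df[race_cols], 1, function(x) mean(x == 1))

result <- cbind(df, prop_race_as1 = ratios)
print(result)
```

Using mean(x == 1) avoids hard-coding the denominator, so the sketch keeps working if more race columns are added.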
CodePudding user response:
Please find below two possibilities.
Reprex
1. With dplyr (and rowwise())
- Code
library(dplyr)
df %>%
  dplyr::rowwise() %>%
  # count the race columns equal to 1 in each row, then divide by the 4 race columns
  dplyr::mutate(prop_race_as1 = sum(c_across(starts_with("race")) == 1) / 4)
- Output
#> # A tibble: 4 x 6
#> # Rowwise:
#>    self race1 race2 race3 race4 prop_race_as1
#>   <int> <int> <int> <int> <int>         <dbl>
#> 1     1     1     2     2     1          0.5
#> 2     2     1     1     1     1          1
#> 3     3     1     3     1     1          0.75
#> 4     4     2     1     3     1          0.5
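Since the question asks for a function, the pipeline above could be wrapped along these lines. This is only a sketch: the function name add_prop_equal and its arguments prefix and target are hypothetical, and it assumes the data has no column that is itself called target (inside mutate() such a column would take precedence over the argument).

```r
library(dplyr)

# Sample data, copied from the question
df <- data.frame(
  self  = 1:4,
  race1 = c(1L, 1L, 1L, 2L),
  race2 = c(2L, 1L, 3L, 1L),
  race3 = c(2L, 1L, 1L, 3L),
  race4 = c(1L, 1L, 1L, 1L)
)

# Hypothetical helper: per-row share of the `prefix` columns that equal `target`
add_prop_equal <- function(data, prefix = "race", target = 1) {
  data %>%
    rowwise() %>%
    mutate(prop_race_as1 = mean(c_across(starts_with(prefix)) == target)) %>%
    ungroup()  # drop the rowwise grouping so later verbs behave as usual
}

result <- add_prop_equal(df)
print(result)
```

Here mean() replaces sum(...) / 4 so the denominator follows the number of matched columns, and ungroup() returns an ordinary tibble instead of a rowwise one.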
2. Using only base R
- Code
# count the race columns equal to 1 in each row, then divide by the 4 race columns
df$prop_race_as1 <- rowSums(df[startsWith(names(df), "race")] == 1) / 4
- Output
df
#>   self race1 race2 race3 race4 prop_race_as1
#> 1    1     1     2     2     1          0.50
#> 2    2     1     1     1     1          1.00
#> 3    3     1     3     1     1          0.75
#> 4    4     2     1     3     1          0.50
Data
df <- structure(list(self = 1:4, race1 = c(1L, 1L, 1L, 2L), race2 = c(2L,
1L, 3L, 1L), race3 = c(2L, 1L, 1L, 3L), race4 = c(1L, 1L, 1L,
1L)), class = "data.frame", row.names = c(NA, -4L))
Created on 2022-02-16 by the reprex package (v2.0.1)