Is there an R function that directly gives the same output as SAS proc tabulate?
var1<-c(rep("A",4),rep("B",4))
var2<-c(rep("C",4),rep("D",4))
var3<-c(rep("E",2),rep("F",4),rep("G",2))
dataset<-data.frame(var1,var2,var3)
proc tabulate data=dataset;
class var1 var2 var3;
table var1*var2 ,var3 all (n rowpctn);
run;
The output that I want is like this:
CodePudding user response:
Here is one way in R:
- Create a column of 1s - n
- Expand the data to fill the missing combinations - complete
- Reshape to 'wide' format - pivot_wider
- Create the 'Total' column by getting the row-wise sum - rowSums
- Add the percentage by looping across the 'var3' columns
library(dplyr)
library(tidyr)
library(stringr)
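dataset %>%
  # one count per row; prefix the var3 values so the wide columns are easy to select
  mutate(n = 1, var3 = str_c('var3_', var3)) %>%
  # add the missing var1/var2/var3 combinations with a count of 0
  complete(var1, var2, var3, fill = list(n = 0)) %>%
  pivot_wider(names_from = var3, values_from = n, values_fn = sum) %>%
  mutate(Total = rowSums(across(where(is.numeric)))) %>%
  group_by(var1) %>%
  # note: 100 * mean(. != 0) is the share of non-zero cells within each var1 group;
  # it equals the row percentage (count / Total) for this example data, but not in general
  mutate(across(starts_with('var3'),
                ~ case_when(. == 0 ~ '0(0%)',
                            TRUE ~ sprintf('%d(%d%%)', ., 100 * mean(. != 0))))) %>%
  ungroup()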
-output
# A tibble: 4 × 6
  var1  var2  var3_E var3_F var3_G Total
  <chr> <chr> <chr>  <chr>  <chr>  <dbl>
1 A     C     2(50%) 2(50%) 0(0%)      4
2 A     D     0(0%)  0(0%)  0(0%)      0
3 B     C     0(0%)  0(0%)  0(0%)      0
4 B     D     0(0%)  2(50%) 2(50%)     4
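As a cross-check, the same table can be sketched in base R, computing the row percentage directly as count / row total, which is what rowpctn asks for. This is only a minimal sketch, not a general proc tabulate replacement: the names rows, counts, row_pct, cells and result are illustrative choices, and the 0/0 case for empty rows is assumed to display as 0%.

```r
# Rebuild the example data from the question
var1 <- c(rep("A", 4), rep("B", 4))
var2 <- c(rep("C", 4), rep("D", 4))
var3 <- c(rep("E", 2), rep("F", 4), rep("G", 2))
dataset <- data.frame(var1, var2, var3)

# Row key for every var1*var2 combination (unobserved combinations are kept as levels)
rows <- interaction(dataset$var1, dataset$var2, sep = "*", lex.order = TRUE)

# Counts: one row per var1*var2 combination, one column per var3 value
counts <- table(row = rows, var3 = dataset$var3)

# Row percentages; rows with a total of 0 give 0/0 = NaN, shown here as 0 (assumption)
row_pct <- 100 * prop.table(counts, margin = 1)
row_pct[is.nan(row_pct)] <- 0

# Format each cell as "n(pct%)" and append the row totals
cells <- matrix(sprintf("%d(%.0f%%)", as.vector(counts), as.vector(row_pct)),
                nrow = nrow(counts), dimnames = dimnames(counts))
result <- cbind(cells, Total = rowSums(counts))
print(result, quote = FALSE)
```

Because result is a character matrix, the Total column is stored as text; keep counts and row_pct around if the numbers are needed for further computation.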