how to get the output of proc tabulate (SAS) in R

Time:12-06

Is there an R function that directly produces the same output as SAS's proc tabulate? Here is my data in R:

var1<-c(rep("A",4),rep("B",4))
var2<-c(rep("C",4),rep("D",4))
var3<-c(rep("E",2),rep("F",4),rep("G",2))
dataset<-data.frame(var1,var2,var3)

And the SAS code:

proc tabulate data=dataset;
class var1 var2 var3;
table var1*var2 ,var3 all (n rowpctn);
run;

The output that I want is like this:

[image: desired proc tabulate output, not available]

CodePudding user response:

Here is one way to do it in R with dplyr and tidyr:

  1. Create a column of 1s - n
  2. Expand the data to fill the missing combinations - complete
  3. Reshape to 'wide' format - pivot_wider
  4. Create the 'Total' column by getting the row wise sum - rowSums
  5. Add the percentage by looping across the 'var3' columns - across

library(dplyr)
library(tidyr)
library(stringr)
dataset %>%
    mutate(n = 1, var3 = str_c('var3_', var3)) %>% 
    complete(var1, var2, var3, fill = list(n = 0)) %>% 
    pivot_wider(names_from = var3, values_from = n, values_fn = sum) %>% 
    mutate(Total = rowSums(across(where(is.numeric)))) %>% 
    # row percentage = cell count / row total; empty rows are shown as 0(0%)
    mutate(across(starts_with('var3'), 
     ~ case_when(.x == 0 ~ '0(0%)', 
       TRUE ~ sprintf('%d(%.0f%%)', as.integer(.x), 100 * .x / Total))))

-output

# A tibble: 4 × 6
  var1  var2  var3_E var3_F var3_G Total
  <chr> <chr> <chr>  <chr>  <chr>  <dbl>
1 A     C     2(50%) 2(50%) 0(0%)      4
2 A     D     0(0%)  0(0%)  0(0%)      0
3 B     C     0(0%)  0(0%)  0(0%)      0
4 B     D     0(0%)  2(50%) 2(50%)     4
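
For comparison, the same counts and row percentages can probably be obtained with base R alone. The following is only a minimal sketch, not an exact reproduction of the proc tabulate layout: it assumes the example data from the question, and the names row_key, counts, row_pct and counts_all are made up for illustration. It relies on table() for the counts and prop.table() with margin = 1 for the row percentages.

```r
# Rebuild the example data from the question.
var1 <- c(rep("A", 4), rep("B", 4))
var2 <- c(rep("C", 4), rep("D", 4))
var3 <- c(rep("E", 2), rep("F", 4), rep("G", 2))
dataset <- data.frame(var1, var2, var3)

# Combine the two row variables into one key, similar to var1*var2 in SAS.
# drop = TRUE keeps only the combinations that occur in the data.
# (row_key is a hypothetical helper name.)
row_key <- interaction(dataset$var1, dataset$var2, sep = "*", drop = TRUE)

# Cell counts (N): rows are the var1*var2 combinations, columns are var3.
counts <- table(row_key, dataset$var3)

# Row percentages (ROWPCTN): each count divided by its row total.
row_pct <- 100 * prop.table(counts, margin = 1)

# Append an "All" column holding the row totals.
counts_all <- cbind(counts, All = rowSums(counts))

print(counts_all)
print(round(row_pct, 1))
```

Unlike the complete() step above, this sketch leaves out the empty A*D and B*C rows; set drop = FALSE in interaction() to keep them, in which case their row percentages come out as NaN because the row total is 0.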