Wondering how I can create a table with the results of different calculations in R. Here's an example using mtcars df
# Load tidyverse
library(tidyverse)
# Select only cyl vs and am from mtcars for simplicity
mt_select <- select(mtcars, c(cyl, vs, am))
# Here is the table I wish to make using some type of looping function
# (I have like 40 variables in actual dataset)
freq_table <- mt1 %>% group_by(cyl) %>%
summarise(n = n(),
vs_sum = sum(vs),
vs_perc = sum(vs)/n*100,
am_sum = sum(am),
am_perc = sum(am)/n*100)
print(freq_table)
Here is my attempt. Main issues I can't figure out:
- Includes sums for "0" answers,
- I don't know how to add a percent column
- Don't know how to merge these all into one table
# make a vector to loop through
mt_vars <- names(mt_select)
# Loop to make tables
for (i in mt_vars){
mt_select %>%
group_by(cyl) %>%
count_(i) %>%
print()
}
Been trying to figure out how to make this for months and always just decide I don't need the table, or something else. Any help is greatly appreciated!!
CodePudding user response:
You haven't provided an expected output, but I think the key to solving your problem might be to pivot your data into long format. This will probably allow you to do what you want without any need for loops. For example, taking mtcars
as input:
library(tidyverse)
mtcars %>%
pivot_longer(everything()) %>%
group_by(name) %>%
summarise(
n=n(),
valueSum=sum(value),
valuePct=valueSum/(n*100)
)
# A tibble: 11 × 4
name n valueSum valuePct
<chr> <int> <dbl> <dbl>
1 am 32 13 0.00406
2 carb 32 90 0.0281
3 cyl 32 198 0.0619
4 disp 32 7383. 2.31
5 drat 32 115. 0.0360
6 gear 32 118 0.0369
7 hp 32 4694 1.47
8 mpg 32 643. 0.201
9 qsec 32 571. 0.178
10 vs 32 14 0.00438
11 wt 32 103. 0.0322
Does that come close to what you want? If you want to work on only a subset of columns, either substitute what you need in place of everything()
, filter
after pivoting or select
before pivoting.
Also, I'm not sure what sort of a percentage you're trying to calculate.
value
and name
are the default values of names_to
and values_to
in pivot_longer()
.