Home > Mobile >  Pivoting CreateTableOne in R to show levels as column headers?
Pivoting CreateTableOne in R to show levels as column headers?

Time:03-16

I'm trying to generate some descriptive summary tables in R using the CreateTableOne function. I have a series of variables that all have the same response options/levels (Yes or No), and want to generate a wide table where the levels are column headings, like this:

Variable Yes No
Var1 1 7
Var2 5 2

But CreateTableOne generates nested long tables, with one column for Level where Yes and No are values in rows, like this:

Variable Level Value
Var1 Yes 1
Var1 No 7

Is there a way to pivot the table to get what I want while still using this function, or is there a different function I should be using instead?

Here is my current code:

    vars <- c('var1', 'var2')
    Table <- CreateTableOne(vars=vars, data=dataframe, factorVars=vars)
    Table_exp <- print(Table, showAllLevels = T, varLabels = T, format="f", test=FALSE, noSpaces = TRUE, printToggle = FALSE)
    write.csv(Table_exp, file = "Table.csv")

Thanks!

CodePudding user response:

You could use only the pivot_wider to make that table. Here is your data:

library(tidyverse)
  dataframe = data.frame(Variable = c("Var1", "Var1", "Var2", "Var2"),
                         Level = c("Yes", "No", "Yes", "No"),
                         Value = c(1, 7, 5, 2))

Your data:

  Variable Level Value
1     Var1   Yes     1
2     Var1    No     7
3     Var2   Yes     5
4     Var2    No     2

You can use this code to make the wider table:

dataframe %>% 
    pivot_wider(names_from = "Level", values_from = "Value")

Output:

# A tibble: 2 × 3
  Variable   Yes    No
  <chr>    <dbl> <dbl>
1 Var1         1     7
2 Var2         5     2

CodePudding user response:

So I got the answer to this question from a coworker, and it's very similar to what Quinten suggested but with some additional steps to account for the structure of my raw data.

The example tables I provided in my question were my desired outputs, not examples of my raw data. The number values weren't values in my dataset, but rather calculated counts of records, and the solution below includes steps for doing that calculation.

This is what my raw data looks like, and it's actually structured wide:

Participant_ID Var1 Var2 Age
1 Yes No 20
2 No No 30

We started by creating a subset with just the relevant variables:

subset <- data |> select(Participant_Id, Var1, Var2)

Then pivoted the data longer first, in order to calculate the counts I wanted in my output table. In this code, we specify that we don't want to pivot Participant_ID and create columns called Vars and Response.

subsetlong <- subset |> pivot_longer(-c("Participant_Id"), names_to = "Vars", values_to="Response")

This is what subsetlong looks like:

Participant_ID Vars Response
1 Var1 Yes
1 Var2 No
2 Var1 No
2 Var2 No

Then we calculated the counts by Vars, putting that into a new dataframe called counts:

counts <- subsetlong |> group_by(Vars) |> count(Response)

And this is what counts looks like:

Vars Response n
Var1 Yes 1
Var1 No 7
Var2 Yes 5
Var2 No 2

Now that the calculation was done, we pivoted this back to wide again, specifying that any NAs should appear as 0s:

counts_wide <- counts |> pivot_wider(names_from="Response", values_from="n", values_fill = 0)

And finally got the desired structure:

Vars Yes No
Var1 1 7
Var2 5 2
  • Related