Convert tibble into a different shape

Time:10-11

I am using the data found here: https://www.kaggle.com/cdc/behavioral-risk-factor-surveillance-system. In RStudio, I have loaded the CSV file into a data frame named BRFSS2015. Below is the code I am trying to execute.

BRFSS2015 %>%
  filter(!is.na(WTKG3), CHCKIDNY %in% 1:2) %>%
  mutate(have_kidney_disease = CHCKIDNY == 1,
         no_kidney_disease = CHCKIDNY == 2,
         weight = WTKG3) %>%
  group_by(have_kidney_disease, no_kidney_disease) %>%
  summarize(mean_weight = mean(weight),
            sd_weight = sd(weight),
            .groups = "drop") %>%
  mutate_if(is.numeric, round, digits = 2)

This gives me a 2x4 tibble with the columns have_kidney_disease, no_kidney_disease, mean_weight, and sd_weight. I want to convert this into a 4x4 format, where have_kidney_disease and no_kidney_disease are rows. How can I do this?

Answer:

This code will get you there. If you want to rename a column, don't do it inside mutate(), which adds a copy and keeps the original column; use rename(weight = WTKG3) as a separate step after select or filter. But since you only use this column inside summarize, there is no need to rename it at all.
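To illustrate the difference between rename() and mutate() described above, here is a minimal sketch. The toy tibble and its values are made up for illustration and only stand in for BRFSS2015.

```r
library(dplyr)

# Hypothetical toy data standing in for BRFSS2015 (values are made up)
toy <- tibble(WTKG3 = c(7000, 8200, NA, 9100),
              CHCKIDNY = c(1, 2, 2, 1))

# rename() changes the column name in place; WTKG3 no longer exists afterwards
renamed <- toy %>%
  filter(!is.na(WTKG3)) %>%
  rename(weight = WTKG3)

# mutate() adds a copy instead, so both WTKG3 and weight are present
copied <- toy %>%
  filter(!is.na(WTKG3)) %>%
  mutate(weight = WTKG3)

names(renamed)
names(copied)
```

Comparing the two names() results shows that only rename() leaves a single weight column.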

I used a case_when statement so you can see how it works. For just 2 values it is simpler to use if_else. Below are 2 solutions: the first filters as in your code, the second drops the filter on CHCKIDNY to show how case_when deals with other values.

BRFSS2015%>%
  filter(!is.na(WTKG3), CHCKIDNY %in% 1:2) %>%
  mutate(kidney_disease = case_when(CHCKIDNY == 1 ~ "yes",
                                    CHCKIDNY == 2 ~ "no")) %>% 
  group_by(kidney_disease) %>% 
  summarize(mean_weight = mean(WTKG3),
            sd_weight = sd(WTKG3))



# A tibble: 2 × 3
  kidney_disease mean_weight sd_weight
  <chr>                <dbl>     <dbl>
1 no                   8081.     2152.
2 yes                  8392.     2363.
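For the two-value case, the if_else variant mentioned earlier could look roughly like this. It is a sketch on a made-up toy tibble (hypothetical values, not the real BRFSS2015 data), and it relies on the filter having already restricted CHCKIDNY to 1 and 2.

```r
library(dplyr)

# Hypothetical toy data standing in for BRFSS2015 (values are made up)
toy <- tibble(WTKG3 = c(7000, 8200, 7600, 9100),
              CHCKIDNY = c(1, 2, 2, 1))

result <- toy %>%
  filter(!is.na(WTKG3), CHCKIDNY %in% 1:2) %>%
  # After the filter only 1 and 2 remain, so anything that is not 1 must be 2
  mutate(kidney_disease = if_else(CHCKIDNY == 1, "yes", "no")) %>%
  group_by(kidney_disease) %>%
  summarize(mean_weight = mean(WTKG3),
            sd_weight = sd(WTKG3))

result
```

Without the filter on CHCKIDNY, if_else would label every other code as "no", which is why case_when with an explicit fallback is the safer choice there.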

Without the filter on CHCKIDNY:

BRFSS2015%>%
  filter(!is.na(WTKG3)) %>%
  mutate(kidney_disease = case_when(CHCKIDNY == 1 ~ "yes",
                                    CHCKIDNY == 2 ~ "no",
                                    TRUE ~ "unknown")) %>% 
  group_by(kidney_disease) %>% 
  summarize(mean_weight = mean(WTKG3),
            sd_weight = sd(WTKG3))

# A tibble: 3 × 3
  kidney_disease mean_weight sd_weight
  <chr>                <dbl>     <dbl>
1 no                   8081.     2152.
2 unknown              8263.     2252.
3 yes                  8392.     2363.
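The solutions above leave out the rounding step from the question. One way to add it back, assuming dplyr 1.0 or later, is across() with where(), which has superseded mutate_if. The summary values below are made up purely to show the rounding.

```r
library(dplyr)

# Hypothetical summary table (values are made up for illustration)
summary_tbl <- tibble(kidney_disease = c("no", "yes"),
                      mean_weight = c(80.814, 83.926),
                      sd_weight = c(21.5249, 23.631))

# Round every numeric column to 2 digits; the character column is left untouched
rounded <- summary_tbl %>%
  mutate(across(where(is.numeric), ~ round(.x, digits = 2)))

rounded
```

The same mutate(across(...)) line can be appended to the end of either pipeline above.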