I have a dataset with the following structure (output of dput(head(df))):
structure(list(type_de_sejour = c("Amb", "Hosp",
"Hosp", "Amb", "Hosp", "Sea"),
specialite = c("ANES", "ANES",
"Autres", "CARD", "CARD", "CARD"
), CA_annee_N = c(2712L, 122180L, 0L, 822615L, 6905494L,
0L), nb_sejours_N = c(8L, 32L, 0L, 1052L, 2776L, 0L), nb_doc_N = c(5L,
8L, 0L, 12L, 15L, 0L), CA_annee_N1 = c(4231L, 78858L, 6587L,
327441L, 6413083L, 0L), nb_sejours_N1 = c(13L, 29L, 2L, 532L,
2819L, 0L), nb_doc_N1 = c(6L, 9L, 1L, 12L, 12L, 0L
), CA_annee_N2 = c(4551L, 27432L, 0L, 208326L, 7465440L,
575L), nb_sejours_N2 = c(15L, 8L, 0L, 463L, 3393L, 1L), nb_doc_N2 = c(6L,
4L, 0L, 11L, 13L, 1L), site = c("FR", "FR", "FR", "FR",
"FR", "FR")), row.names = c(NA, 6L), class = "data.frame")
I am trying to plot the percentage that each "specialite" represents in the total "nb_sejours_N", after filtering on type_de_sejour == "Amb". The plot should distinguish each "site", ideally by faceting or by drawing two plots, one per site.
I have tried the following code:
df %>%
mutate(volume_N == nb_sejours_N,
volume_N1 == nb_sejours_N1,
volume_N2 == nb_sejours_N2)%>%
filter(type_de_sejour == "Amb")%>%
group_by(site) %>%
mutate(proportion_N = volume_N/sum(volume_N, na.rm = TRUE),
proportion_N1 = volume_N1/sum(volume_N1, na.rm = TRUE),
proportion_N2 = volume_N2/sum(volume_N2, na.rm = TRUE))
Unfortunately, it doesn't work, so I can't go any further. Does anyone also know an efficient way to plot what I'm trying to represent?
Answer:
I believe the following works:
# packages needed for the pipeline and the plot
library(dplyr)
library(tidyr)
library(ggplot2)
# creating plot: one stacked bar per year column, "Amb" stays only
p = df %>% filter(type_de_sejour == "Amb") %>%
pivot_longer(cols = c("nb_sejours_N","nb_sejours_N1","nb_sejours_N2"), values_to = "visit") %>%
ggplot(aes(fill=name, y=visit, x=name)) + geom_bar(position="stack", stat="identity")
# creating summary of totals for each column
totals = df %>% filter(type_de_sejour == "Amb") %>%
pivot_longer(cols = c("nb_sejours_N","nb_sejours_N1","nb_sejours_N2"), values_to = "visit") %>%
group_by(name) %>% summarise(total = sum(visit))
# adding totals on top of bars to plot (layers are combined with +)
p + geom_text(aes(name, total, label = total, fill = NULL), data = totals)
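The plot above shows raw totals per year column. For the percentage per "specialite" and per "site" that the question asks about, something like the following might work. Note that the attempt in the question uses == inside mutate(), which compares instead of assigning, so volume_N is never created; = is needed there. This is only a sketch: the small df below is hypothetical (the posted sample contains a single site, "FR"), and the names shares and p are made up for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical data with two sites; values are illustrative only
df <- data.frame(
  type_de_sejour = c("Amb", "Amb", "Hosp", "Amb", "Amb"),
  specialite     = c("ANES", "CARD", "CARD", "ANES", "CARD"),
  nb_sejours_N   = c(8L, 1052L, 2776L, 20L, 60L),
  site           = c("FR", "FR", "FR", "BE", "BE")
)

shares <- df %>%
  filter(type_de_sejour == "Amb") %>%
  # total stays per site and specialite
  group_by(site, specialite) %>%
  summarise(volume_N = sum(nb_sejours_N, na.rm = TRUE), .groups = "drop") %>%
  # share of each specialite within its site
  group_by(site) %>%
  mutate(proportion_N = volume_N / sum(volume_N)) %>%
  ungroup()

# One panel per site, bars labelled with the percentage
p <- ggplot(shares, aes(x = specialite, y = proportion_N, fill = specialite)) +
  geom_col() +
  geom_text(aes(label = scales::percent(proportion_N, accuracy = 0.1)),
            vjust = -0.3) +
  scale_y_continuous(labels = scales::percent) +
  facet_wrap(~ site)
```

Calling print(p) draws the faceted chart. The same pattern should extend to nb_sejours_N1 and nb_sejours_N2, for instance by pivoting those columns to long format first, as in the code above.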