I want to plot raw birth counts by month, stratified by year: month number on the x axis, raw counts on the y axis, and one line per year.
Here is the structure of my data:
> dput(births_monthly_cross_022822_clean)
structure(list(month = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
), `2010` = c(323816, 302551, 339219, 325582, 328960, 335180,
345875, 350473, 351439, 337477, 326868, 339665), `2011` = c(321104,
298537, 330752, 313875, 327242, 337890, 346192, 360079, 346254,
329174, 322135, 327986), `2012` = c(316959, 305060, 324944, 307262,
330779, 327811, 348157, 361902, 340901, 346251, 325810, 324960
), `2013` = c(324314, 292360, 321166, 312274, 330193, 320475,
349920, 354293, 338863, 341353, 319233, 336320), `2014` = c(327154,
299087, 324317, 319788, 335444, 326757, 356446, 355201, 349403,
344265, 318819, 341494), `2015` = c(326747, 298815, 329714, 321618,
328709, 331400, 354384, 352782, 348479, 339904, 319605, 336576
> head(births_monthly_cross_022822_clean)
# A tibble: 6 × 12
month `2010` `2011` `2012` `2013` `2014` `2015` `2016` `2017` `2018` `2019` `2020`
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 323816 321104 316959 324314 327154 326747 317445 314597 315593 311678 305536
2 2 302551 298537 305060 292360 299087 298815 306750 289694 284940 280679 283385
3 3 339219 330752 324944 321166 324317 329714 329341 320327 316824 304999 302331
4 4 325582 313875 307262 312274 319788 321618 314312 300801 299125 299755 290940
5 5 328960 327242 330779 330193 335444 328709 328434 323169 321448 317160 301902
6 6 335180 337890 327811 320475 326757 331400 333166 324633 315585 304843 302574
Thank you!
CodePudding user response:
Currently your data frame is in wide format, where the variable you wish to plot on the y axis (counts of births) is spread across several columns, one per year. It is much easier to plot if you reshape your data to long format, where you have one column for the counts and one column that labels each count with its year. You can achieve this with pivot_longer from the tidyr package.
It also makes for an easier-to-understand plot if you convert month numbers into month names.
Here's how you can do all this within the tidyverse framework:
library(ggplot2)
library(dplyr)
library(tidyr)
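births_monthly_cross_022822_clean %>%
  pivot_longer(-1) %>%
  # the column names are already the years ("2010", "2011", ...), so use them as-is;
  # substr(name, 2, 6) would drop the first digit and give "010", "011", ...
  mutate(year = name,
         month = factor(month.abb[month], month.abb)) %>%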
  ggplot(aes(month, value, color = year, group = year)) +
  geom_line() +
  theme_bw()
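To see what the reshaping step does on its own, here is a minimal sketch on a tiny subset of the question's data (the first three months of 2010 and 2011). The object names `wide` and `long` and the column name `births` are my own choices for illustration, not anything from the original data frame.

```r
library(dplyr)
library(tidyr)

# Stand-in for the real data: first three months of 2010 and 2011 from the question
wide <- tibble(
  month  = c(1, 2, 3),
  `2010` = c(323816, 302551, 339219),
  `2011` = c(321104, 298537, 330752)
)

long <- wide %>%
  # every column except month is stacked into a year/births pair
  pivot_longer(-month, names_to = "year", values_to = "births") %>%
  # setting the levels explicitly keeps the months in calendar order
  # rather than alphabetical order (Apr, Aug, Dec, ...)
  mutate(month = factor(month.abb[month], levels = month.abb))

print(long)
levels(long$month)
```

Each row of `long` is now one month-year combination, which is the shape ggplot needs to draw one line per year.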
CodePudding user response:
You need to pivot your data to long format, like this:
library(tidyverse)
births_monthly_cross_022822_clean %>%
  pivot_longer(!month, names_to = 'year', values_to = 'q_births')
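The pivot above only reshapes the data. A possible way to carry it through to the plot is sketched below, keeping the month as a number on the x axis as the question asks. To keep the example self-contained it uses a small stand-in data frame (`births_subset`, a hypothetical name) holding the first four months of 2010 to 2012 from the question; with the real data you would pipe `births_monthly_cross_022822_clean` in instead.

```r
library(tidyverse)

# Stand-in for births_monthly_cross_022822_clean (values copied from the question)
births_subset <- tibble(
  month  = c(1, 2, 3, 4),
  `2010` = c(323816, 302551, 339219, 325582),
  `2011` = c(321104, 298537, 330752, 313875),
  `2012` = c(316959, 305060, 324944, 307262)
)

births_long <- births_subset %>%
  pivot_longer(!month, names_to = "year", values_to = "q_births")

# year is a character column after pivoting, so each year gets its own discrete color
p <- ggplot(births_long, aes(x = month, y = q_births, color = year, group = year)) +
  geom_line() +
  scale_x_continuous(breaks = 1:12) +  # one tick per month number
  labs(x = "Month", y = "Births", color = "Year")

print(p)
```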