Converting daily data to monthly data

Time:12-16

My data is formatted as follows, with the date stored as a character type, not a date type:

X   date
1   19460530
0   19460601
1   19460602
1   19460603
.   ...
.   ...
.   ...

What I would like to get is the ratio of 1s in X on a monthly basis. For example, if I have 20 1s and 30 0s for July 1946, and 40 1s and 40 0s for August 1946, I would like the following output:

194607  0.4
194608  0.5

From this output, I would like to draw a line graph with ggplot2 (date on the x-axis, ratio of X on the y-axis). geom_line needs a continuous variable on the x-axis, and if I used a numeric format like 194607 or 194608, there would be a huge gap between December and January (for example, 194612 to 194701). How can I make a line graph using monthly data?

CodePudding user response:

ggplot is flexible enough to handle date objects on the x-axis without the jump/gap you are worried about.

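library(tibble)   # tribble()
library(dplyr)    # %>%, group_by(), mutate()
library(ggplot2)  # ggplot(), geom_line()

tribble(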
  ~X, ~date,
  1,   19460530,
  0,   19460601,
  1,   19460602,
  1,   19460603
) -> df

df$date <- lubridate::ymd(df$date)

df %>%
  group_by(date) %>%
  mutate(proportion = X / sum(X)) -> df

ggplot(df, aes(x = date, y = proportion)) +
  geom_line()

CodePudding user response:

hachiko, thanks for your prompt answer. I have two questions, though.

  1. You only grouped by date, so how does this aggregate by month? Don't you have to specify the month somewhere?

  2. Is "proportion = X / sum(X)) -> df" right? Summing X counts the number of 1s, so shouldn't sum(X) be in the numerator?
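
Both points could be addressed along the following lines. This is only a sketch, not the original answerer's code: it assumes X holds only 0s and 1s (so that mean(X) equals the share of 1s, i.e. sum(X) / n()), and the toy rows and the names monthly, month, ratio and p are made up for illustration. lubridate::floor_date() maps every date to the first day of its month, which gives a real Date column to group by and to put on the x-axis.

```r
library(dplyr)
library(lubridate)
library(ggplot2)

# Toy data in the question's layout: X is 0/1, date is a "YYYYMMDD" string.
# (Illustrative values only, not the asker's real data.)
df <- data.frame(
  X    = c(1, 0, 1, 1, 0, 1),
  date = c("19460530", "19460601", "19460602",
           "19460603", "19460701", "19460702")
)

monthly <- df %>%
  # Parse the string, then snap each date to the first day of its month
  mutate(month = floor_date(ymd(date), unit = "month")) %>%
  group_by(month) %>%
  # For 0/1 data, mean(X) is (number of 1s) / (number of rows)
  summarise(ratio = mean(X), .groups = "drop")

# month is a Date, so December -> January is one step like any other month
p <- ggplot(monthly, aes(x = month, y = ratio)) +
  geom_line() +
  scale_x_date(date_labels = "%Y%m")
```

Calling print(p) draws the plot. Because the x-axis is a Date rather than a number like 194612, consecutive months are evenly spaced, and date_labels = "%Y%m" only changes how the tick labels are displayed.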
