I have a dataframe, df:
df=data.frame("temp"=c(60.80,46.04,26.96,24.98),"humid"=c(93.79,53.33,50.34,54.65),"wind_speed"=c(40.27,39.12,14.96, 13.81), "date" =c("2013-01-01","2013-01-03","2013-02-01", "2013-02-02"))
df$date <- as.Date(df$date, "%Y-%m-%d")
temp humid wind_speed date
1 60.80 93.79 40.27 2013-01-01
2 46.04 53.33 39.12 2013-01-03
3 26.96 50.34 14.96 2013-02-01
4 24.98 54.65 13.81 2013-02-02
I have transformed it to look like this using this line:
df_mod <- cbind(df[4], stack(df[1:3]))
metric values date
temp 60.80 2013-01-01
temp 46.04 2013-01-03
temp 26.96 2013-02-01
temp 24.98 2013-02-02
humid 93.79 2013-01-01
humid 53.33 2013-01-03
humid 50.34 2013-02-01
humid 54.65 2013-02-02
wind_speed 40.27 2013-01-01
wind_speed 39.12 2013-01-03
wind_speed 14.96 2013-02-01
wind_speed 13.81 2013-02-02
then I have extracted the month with:
df_mod <- transform(df_mod, month = month(date, label = TRUE))
metric values month
temp 60.80 Jan
temp 46.04 Jan
temp 26.96 Feb
temp 24.98 Feb
...
now I am trying to build a chart similar to this.
I want to have the mean values for the height of each bar. So I want to group by month and by variable, and then take the average value within each month.
I'm trying this code but it gives me errors.
df_mod %>%
  group_by(metric) %>%
  summarize(mean = mean(values)) %>%
  ggplot(aes(fill = metric, y = mean, x = month)) +
  geom_bar(position = "dodge", stat = "identity") +
  theme_bw() +
  labs(title = "Weather metrics",
       x = "", y = "values")
Don't know how to automatically pick scale for object of type function. Defaulting to continuous.
Error in `f()`:
! Aesthetics must be valid data columns. Problematic aesthetic(s): x = month.
Did you mistype the name of a data column or forget to add after_stat()?
Run `rlang::last_error()` to see where the error occurred.
I have also tried group_by(month, metric) and it says
`summarise()` has grouped output by '.groups'. You can override using the `.groups` argument.
Can someone help me with this?
CodePudding user response:
The issue is that you have to group_by both metric and month. Otherwise you are only computing the mean per metric, and as a result the month column gets dropped from the summarized data. That's the reason for the error you get: since there is no longer a month column, ggplot2 thinks you want to map the month() function (from lubridate) on x.
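To see the dropped column for yourself, here is a minimal sketch using only the temp rows from the question. The names long, by_metric and by_both are just illustrative, and `.groups = "drop"` is added only to keep the second call quiet.

```r
library(dplyr)
library(lubridate)

# The temp rows of the long data from the question (other metrics left out for brevity)
long <- data.frame(
  metric = "temp",
  values = c(60.80, 46.04, 26.96, 24.98),
  date   = as.Date(c("2013-01-01", "2013-01-03", "2013-02-01", "2013-02-02"))
)
long$month <- month(long$date, label = TRUE)

# Grouping by metric only: summarize() keeps the grouping column and the new summary
by_metric <- long %>%
  group_by(metric) %>%
  summarize(mean = mean(values))
names(by_metric)   # "metric" "mean"  (no month column left)

# Grouping by month and metric keeps month in the result
by_both <- long %>%
  group_by(month, metric) %>%
  summarize(mean = mean(values), .groups = "drop")
names(by_both)     # "month" "metric" "mean"

# With no month column in the data, the name month resolves to lubridate's function
is.function(month) # TRUE
```

So with by_metric, aes(x = month) has nothing to pick up except the function, which matches the "object of type function" warning in the question.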
Note: I also switched to the tidyverse way to reshape your data.
library(tidyr)
library(ggplot2)
library(lubridate)
library(dplyr)

# Reshape to long format (one row per date and metric) and add the month label
df_mod <- df %>%
  tidyr::pivot_longer(-date, names_to = "metric", values_to = "values") %>%
  mutate(month = month(date, label = TRUE))

# Average per month and metric, then draw dodged bars
df_mod %>%
  group_by(month, metric) %>%
  summarize(mean = mean(values)) %>%
  ggplot(aes(fill = metric, y = mean, x = month)) +
  geom_col(position = "dodge") +
  theme_bw() +
  labs(
    title = "Weather metrics",
    x = "", y = "values"
  )
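As for the `summarise()` line you saw when you tried group_by(month, metric): that is an informational message, not an error. It only tells you that the result is still grouped by one of your grouping variables. If you want to silence it, one option is to set the `.groups` argument explicitly. A small sketch of the summary step on your sample data (monthly_means is just an illustrative name, and "drop" is only one of the possible values for `.groups`):

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Sample data from the question
df <- data.frame(
  temp       = c(60.80, 46.04, 26.96, 24.98),
  humid      = c(93.79, 53.33, 50.34, 54.65),
  wind_speed = c(40.27, 39.12, 14.96, 13.81),
  date       = as.Date(c("2013-01-01", "2013-01-03", "2013-02-01", "2013-02-02"))
)

monthly_means <- df %>%
  pivot_longer(-date, names_to = "metric", values_to = "values") %>%
  mutate(month = month(date, label = TRUE)) %>%
  group_by(month, metric) %>%
  # .groups = "drop" returns an ungrouped result and suppresses the message
  summarize(mean = mean(values), .groups = "drop")

monthly_means  # one row per month and metric, six rows in total
```

The plotting code stays the same. You can pipe monthly_means straight into ggplot() as above.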