How to produce a time series with geom_bar COUNTING INSTANCES of a categorical variable?

Time:02-11

My data frame (s1) has this structure:

PERIOD : Date, format: "2019-02-01" "2019-02-01" "2019-02-01" "2019-02-01" ...

OPERATION : chr "SALE" "SALE" "SALE" "SALE"…

My goal: create a plot showing the number of units sold each month between 2019 and 2021.

When I try to produce it with ggplot:

ggplot(s1, aes(x = PERIOD, y = factor(OPERATION))) +
  geom_bar(stat = "count", width = 0.7, fill = "steelblue") +
  theme_minimal()

I get this message:

Error: stat_count() can only have an x or y aesthetic

My questions:

  1. How can I transform “SALE”, which is a categorical variable, into a numeric one in order to plot the time series?
  2. Can you envisage a better solution to produce the desired plot?

Thanks in advance for your advice.

CodePudding user response:

How can I transform “SALE”, which is a categorical variable, into a numeric one in order to plot the time series?

In this case, you can either change the stat argument or map only one aesthetic. If you map only x, there is no need to supply anything for y: geom_bar counts the rows for each value of x (the absolute frequency) and uses that count as the bar height.

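library(dplyr)    # provides %>%
library(ggplot2)
# df stands for your data frame (s1 in the question)
df %>% ggplot(aes(x = PERIOD)) +
  geom_bar(stat = "count", width = 0.7, fill = "steelblue") +
  theme_minimal()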
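
As a cross-check on what geom_bar computes here, the same counts can be built explicitly and drawn with geom_col. This is only a sketch on made-up data: the toy s1 below and the names monthly and n are assumptions for illustration, not part of the question.

```r
library(dplyr)
library(ggplot2)

# Toy data (hypothetical) with the same structure as s1 in the question
s1 <- data.frame(
  PERIOD = as.Date(c("2019-02-01", "2019-02-01", "2019-03-01", "2020-02-01")),
  OPERATION = c("SALE", "SALE", "SALE", "SALE")
)

# One row per PERIOD value, with the number of rows stored in column n
monthly <- s1 %>% count(PERIOD, name = "n")

# geom_col() uses the precomputed counts directly as bar heights
p <- ggplot(monthly, aes(x = PERIOD, y = n)) +
  geom_col(fill = "steelblue") +
  theme_minimal()
p
```

Having the counts in a data frame also makes it easy to inspect them before plotting.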

Can you envisage a better solution to produce the desired plot?

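I believe you are aiming for a time series plot, so here is some code for one. Assuming OPERATION is a character variable that can contain both letters and numbers, we first convert it to a factor and then to numeric. After that, it can be mapped to the y axis. Note that as.numeric(as.factor(...)) returns the integer level codes of the factor, not a count of rows, so with a single category such as "SALE" every point gets y = 1. Edit: I forgot to add the filter for the requested dates!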

# PERIOD is a Date, so compare it against Date bounds rather than strings
df <- df %>% filter(between(PERIOD, as.Date("2019-01-01"), as.Date("2021-01-01")))

df %>% ggplot(aes(x = PERIOD, y = as.numeric(as.factor(OPERATION)))) +
  geom_line() +
  theme_minimal() +
  labs(y = "Operations Number")
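
If the goal is the number of sales per month rather than a numeric encoding of OPERATION, one possible approach is to count rows per month first and then draw the line. The following is a minimal sketch under assumptions: the toy data, the column names MONTH and n_sales, and the 2019-01-01 to 2021-12-31 window are all illustrative choices, not taken from the question.

```r
library(dplyr)
library(ggplot2)

# Toy data (hypothetical); the 2022 row falls outside the requested window
s1 <- data.frame(
  PERIOD = as.Date(c("2019-02-01", "2019-02-15", "2019-03-01",
                     "2020-02-01", "2022-05-01")),
  OPERATION = c("SALE", "SALE", "SALE", "SALE", "SALE")
)

sales_per_month <- s1 %>%
  filter(OPERATION == "SALE",
         PERIOD >= as.Date("2019-01-01"),
         PERIOD <= as.Date("2021-12-31")) %>%
  # Collapse every date to the first day of its month
  mutate(MONTH = as.Date(format(PERIOD, "%Y-%m-01"))) %>%
  # One row per month, with the number of sales in n_sales
  count(MONTH, name = "n_sales")

p <- ggplot(sales_per_month, aes(x = MONTH, y = n_sales)) +
  geom_line() +
  geom_point() +
  scale_x_date(date_labels = "%b %Y") +
  theme_minimal() +
  labs(x = "Month", y = "Number of sales")
p
```

Months with no sales do not appear as rows in sales_per_month, so the line connects the neighbouring months directly instead of dropping to zero.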

CodePudding user response:

I made some toy data and created a month variable that distinguishes only month and year but not the day:

s1 <- data.frame(PERIOD = as.Date(c("2019-02-01", "2019-02-01", "2019-03-01", "2019-02-01", 
                                    "2020-02-01", "2020-03-01", "2020-05-01", "2020-02-01")),
                 OPERATION = as.character(c("SALE", "SALE", "SALE", "SALE", "SALE", "SALE", "SALE", "SALE")))


s1$month <- paste(months(s1$PERIOD), format(s1$PERIOD, "%Y"), sep = "_")

library(ggplot2)

ggplot(s1, aes(x = month)) +
  geom_bar(stat = "count", width = 0.7, fill = "steelblue") +
  theme_minimal()

Here is the output (sorry that the month labels are in German; that is my system language): [image]
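
One caveat with labels built from month names: a character x variable is placed on a discrete axis in alphabetical order, so the bars are not necessarily in chronological order. A possible variation, sketched here as an assumption rather than as part of the answer above, is to build the label as year-month so that alphabetical order and chronological order coincide:

```r
library(ggplot2)

# Same toy data as above
s1 <- data.frame(
  PERIOD = as.Date(c("2019-02-01", "2019-02-01", "2019-03-01", "2019-02-01",
                     "2020-02-01", "2020-03-01", "2020-05-01", "2020-02-01")),
  OPERATION = "SALE"
)

# "2019-02", "2019-03", ... sort chronologically even as plain strings
s1$month <- format(s1$PERIOD, "%Y-%m")

p <- ggplot(s1, aes(x = month)) +
  geom_bar(width = 0.7, fill = "steelblue") +
  theme_minimal()
p
```

These labels are also independent of the system language, since they contain no month names.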
