how to use ggplot to visualise grouped data

Time:04-10

I have the following code, which generates the following results:

years <- seq(1930, 2020, by = 10)
length(years)
years
labels <- paste(1 + years[-length(years)], years[-1], sep = "-")
length(labels)
labels
SP500 %>% mutate(decade = cut(SP500$Year,seq(1930,2020,by=10), labels = labels)) %>% 
  group_by(decade) %>% summarise(return = mean(`Annual\n% Change`))
# A tibble: 10 × 2
   decade    return
   <fct>      <dbl>
 1 1931-1940  0.014
 2 1941-1950  0.077
 3 1951-1960  0.124
 4 1961-1970  0.056
 5 1971-1980  0.058
 6 1981-1990  0.098
 7 1991-2000  0.157
 8 2001-2010  0.018
 9 2011-2020  0.121
10 NA         0.04 

My question is: how can I put this result into a bar or line chart? I have been trying for the last few hours but I keep getting errors. I feel the answer is simple, but I just can't see it.

years <- seq(1930, 2020, by = 10)
length(years)
labels <- paste(1 + years[-length(years)], years[-1], sep = "-")
length(labels)
SP500 %>% mutate(decade = cut(SP500$Year,seq(1930,2020,by=10), labels = labels)) %>% 
  group_by(decade) %>% summarise(return = mean(`Annual\n% Change`)) %>%
  ggplot(SP500, aes(x = decade, y = return)) +
  geom_col()

Error in `ggplot()`:
! Mapping should be created with `aes()` or `aes_()`.

SP500 %>% ggplot(
  ss <- SP500 %>% mutate(decade = cut(SP500$Year,seq(1930,2020,by=10))) %>% 
  group_by(decade) %>% summarise(return = mean(`Annual\n% Change`)), aes_(x=ss[,1], y= ss[,2])) +
  geom_line()

Many thanks in advance.

CodePudding user response:

You are piping the result of your data manipulation through to ggplot, but also passing the name of the data frame as the first argument to ggplot.

Remember that doing

data_frame %>% ggplot(aes(x, y))

is the same as doing

ggplot(data = data_frame, mapping = aes(x, y))

But doing

data_frame %>% ggplot(data_frame, aes(x, y))

is the same as doing

ggplot(data = data_frame, mapping = data_frame, aes(x, y))

That is why you get the error: the mapping argument has to be created with aes(), so you can't pass a data frame to it.
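To see this in practice, here is a minimal sketch using a made-up data frame. The name `toy` and its columns are assumptions for illustration only, not part of your data, and the sketch assumes the ggplot2 and magrittr packages are installed.

```r
library(ggplot2)
library(magrittr)  # provides the %>% pipe

# Hypothetical toy data, not taken from the question
toy <- data.frame(x = c("a", "b", "c"), y = c(1, 2, 3))

# These two calls build the same plot: the pipe supplies `toy` as `data`
p_piped  <- toy %>% ggplot(aes(x, y)) + geom_col()
p_direct <- ggplot(data = toy, mapping = aes(x, y)) + geom_col()

# Piping AND naming the data frame pushes `toy` into the `mapping` slot,
# so ggplot() raises an error, which is caught here instead of stopping
res <- tryCatch(
  {
    toy %>% ggplot(toy, aes(x, y))
    "no error"
  },
  error = function(e) "mapping error"
)
res
```

The last call should fail in the same way as the code in your question, because the data frame ends up where the aes() mapping is expected.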

So you can do

SP500 %>% 
  mutate(decade = cut(SP500$Year,seq(1930,2020,by=10), labels = labels)) %>% 
  group_by(decade) %>% 
  summarise(return = mean(`Annual\n% Change`)) %>%
  ggplot(aes(x = decade, y = return)) +
  geom_col()

or

SP500 <- SP500 %>% 
  mutate(decade = cut(SP500$Year,seq(1930,2020,by=10), labels = labels)) %>% 
  group_by(decade) %>% 
  summarise(return = mean(`Annual\n% Change`))

ggplot(SP500, aes(x = decade, y = return)) +
  geom_col()

Both of which result in:

[Image: bar chart of mean return by decade]

The above plot was made with the following code which includes the data taken from your question. If you copy and paste this code block into your R console, it will produce the same plot:

SP500 <- structure(list(decade = structure(1:9, .Label = c("1931-1940", 
"1941-1950", "1951-1960", "1961-1970", "1971-1980", "1981-1990", 
"1991-2000", "2001-2010", "2011-2020"), class = "factor"), return = c(0.014, 
0.077, 0.124, 0.056, 0.058, 0.098, 0.157, 0.018, 0.121)), row.names = c(NA, 
-9L), class = c("tbl_df", "tbl", "data.frame"))

library(ggplot2)
ggplot(SP500, aes(decade, return)) + geom_col()
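Since you also asked about a line chart, here is a rough sketch of one way to do it. It rebuilds the summary from the output printed in your question, including the NA row, under the assumed name `decade_returns`. Because `decade` is a factor, geom_line() needs `group = 1` to join the points into a single line. The NA row presumably comes from years that fall outside the cut() breaks, so the sketch drops it before plotting.

```r
library(ggplot2)

# Summary values copied from the question's printed output (NA row included)
decade_returns <- data.frame(
  decade = factor(c("1931-1940", "1941-1950", "1951-1960", "1961-1970",
                    "1971-1980", "1981-1990", "1991-2000", "2001-2010",
                    "2011-2020", NA)),
  return = c(0.014, 0.077, 0.124, 0.056, 0.058, 0.098, 0.157, 0.018,
             0.121, 0.04)
)

# Drop the row with no decade label so it is not drawn on the x axis
plot_data <- decade_returns[!is.na(decade_returns$decade), ]

# group = 1 puts every decade in one group, so a single line connects them
line_plot <- ggplot(plot_data, aes(x = decade, y = return, group = 1)) +
  geom_line() +
  geom_point()

print(line_plot)
```

Without `group = 1`, each factor level is treated as its own group of one observation and no line is drawn.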
