Home > Mobile >  R - Plotting data from all dataframes in one graph
R - Plotting data from all dataframes in one graph

Time:11-22

I'm reading multiple files into a dataframe each. E.g. files for 'apples', 'oranges' and 'lemons'. Each of these files (=dataframes) contains data like this:

       Def      Q1-2021  Q2-2021  Q3-2021  ID
1      price    15       17       9        apples
2      stock    1270     3575     175      apples
3      revenue  57866    66287    21755    apples
4      profit   22156    34187    9253     apples

There are several more quarters of data, but you get the idea.

Now I'd like to make a graph that displays the price as a line chart for all dataframes (ie for apples, oranges, lemons, etc.).

I was considering rbind-ing all dataframes together, but since not all products have data for each quarter, that won't work.

Now I'm stuck and don't know how to get any further. Can someone help me out please?

CodePudding user response:

If you convert your data into "long" format, then it will be easy to rbind and plot as desired. I just made up some data for oranges to help illustrate.

First I would put each data.frame into a list and then you can apply()/purrr::map() the manipulations on all at the same time and then combine.

It will probably help to reformat your date information also to make it plot better so you'll see I put year before quarter, but there are many other good options here depending on the structure of your actual data and ultimate goals.

Finally, you mentioned only plotting price, but I'm showing here how to plot all the different value types in case that was of interest. If not, you can just un-comment the line that filter()s just to price.

library(tidyverse)

apples <- structure(list(Def = c("price", "stock", "revenue", "profit"), `Q1-2021` = c(15L, 1270L, 57866L, 22156L), `Q2-2021` = c(17L, 3575L, 66287L, 34187L), `Q3-2021` = c(9L, 175L, 21755L, 9253L), ID = c("apples", "apples", "apples", "apples")), class = "data.frame", row.names = c(NA, -4L))
oranges <- structure(list(Def = c("price", "stock", "revenue", "profit"), `Q2-2021` = c(14, 2860, 53030, 27350), `Q3-2021` = c(5, 105, 13053, 5552), ID = c("oranges", "oranges", "oranges", "oranges")), class = "data.frame", row.names = c(NA, -4L))
  
l <- list(apples = apples,
          oranges = oranges)

merged_data <- l %>% 
  map_dfr(~.x %>% 
            pivot_longer(starts_with("Q"), names_to = "quarter") %>% 
            separate(quarter, into = c("quarter", "year")) %>% 
            unite("date", year, quarter))

merged_data
#> # A tibble: 20 × 4
#>    Def     ID      date    value
#>    <chr>   <chr>   <chr>   <dbl>
#>  1 price   apples  2021_Q1    15
#>  2 price   apples  2021_Q2    17
#>  3 price   apples  2021_Q3     9
#>  4 stock   apples  2021_Q1  1270
#>  5 stock   apples  2021_Q2  3575
#>  6 stock   apples  2021_Q3   175
#>  7 revenue apples  2021_Q1 57866
#>  8 revenue apples  2021_Q2 66287
#>  9 revenue apples  2021_Q3 21755
#> 10 profit  apples  2021_Q1 22156
#> 11 profit  apples  2021_Q2 34187
#> 12 profit  apples  2021_Q3  9253
#> 13 price   oranges 2021_Q2    14
#> 14 price   oranges 2021_Q3     5
#> 15 stock   oranges 2021_Q2  2860
#> 16 stock   oranges 2021_Q3   105
#> 17 revenue oranges 2021_Q2 53030
#> 18 revenue oranges 2021_Q3 13053
#> 19 profit  oranges 2021_Q2 27350
#> 20 profit  oranges 2021_Q3  5552

merged_data %>% 
  # filter(Def == "price") %>% # this will just plot price
  ggplot(aes(date, value, color = ID))  
  geom_point()  
  geom_line(aes(group = ID))  
  facet_wrap(~Def, scales = "free_y")

Created on 2022-11-21 with reprex v2.0.2

  • Related