Trying to automate graphing columns from a dataframe in R

Time:09-28

I found this question and tried to apply it to my own situation, but I'm still pretty new to R. I'm trying to take a subset of columns from a dataframe, plot them against every other column from that subset across the time column, then save those plots.

library(dplyr)
library(ggplot2)
library(ggpubr)
auto_graph <- function(.data, x, y){
    myplot <- ggplot(.data,
                     mapping = aes(x = .data[[x]], y = .data[[y]])) +
        geom_point(color = "cornflowerblue",
                   alpha = .5,
                   size = 3) +
        geom_smooth(method = "lm") +
        stat_regline_equation(label.y = 25, aes(label = ..eq.label..)) +
        stat_regline_equation(label.y = 20, aes(label = ..rr.label..)) +
        facet_wrap(~time)
    myplot <- myplot + labs(title = paste(names(x), names(y)))

        #ggsave(filename=paste(names(x), names(y),".png"), myplot, path="C:/Users/...")
        return myplot
}

foo <- lapply(temp, function(x) auto_graph(x, across(3:9), across(3:9)))
foo[[1]]

The across() calls didn't work because lapply is not a dplyr function, but I left them in to give an idea of what I'm trying to do. I also tried a for-loop first, but however I assigned x and y in the loop, they ended up as "tbl_df/tbl/data.frame" objects and the plots came up empty.

CodePudding user response:

You haven't provided a test dataset and I don't have some of the packages you are using, so I will use the mtcars data frame and a slightly modified auto_graph function to illustrate the general technique. You should be able to modify for your actual use case easily.

First, the modified auto_graph function.

auto_graph <- function(.data, x, y) {
  # x and y are column names, passed as strings
  myplot <- ggplot(.data,
                   mapping = aes(x = .data[[x]], y = .data[[y]])) +
    geom_point(color = "cornflowerblue",
               alpha = .5,
               size = 3) +
    geom_smooth(method = "lm")
    # Left out here (these need ggpubr and a time column):
    # stat_regline_equation(label.y = 25, aes(label = ..eq.label..)) +
    # stat_regline_equation(label.y = 20, aes(label = ..rr.label..)) +
    # facet_wrap(~time)
  # x and y are already strings, so paste them directly; names(x) would be NULL
  myplot <- myplot + labs(title = paste(x, y))
  
  return(myplot)
}

[Note the correction of a typo in the return statement on the last line of the function body: return(myplot) needs its parentheses.]

The first part of the problem is to generate all pairs of columns to plot. Passing your column names to combn() makes this a breeze. I use t() to transpose the output from combn() to avoid a call to pivot_longer() and messy warning messages about unnamed columns.

colsToPlot <- as_tibble(t(combn(names(mtcars), 2)))
colsToPlot
# A tibble: 55 × 2
   V1    V2   
   <chr> <chr>
 1 mpg   cyl  
 2 mpg   disp 
 3 mpg   hp   
 4 mpg   drat 
 5 mpg   wt   
 6 mpg   qsec 
 7 mpg   vs   
 8 mpg   am   
 9 mpg   gear 
10 mpg   carb 
# … with 45 more rows
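
For the original question, where only columns 3 to 9 are of interest, the same idea can be restricted to a subset of names. The following is a minimal sketch using base R only, with mtcars standing in for the real data; the names cols, pairs and both are illustrative. combn() gives each unordered pair once; if both orderings are wanted (x against y and y against x), expand.grid() is one possible alternative.

```r
# Assumption: mtcars stands in for the real data frame, and
# columns 3 to 9 are the ones to be plotted against each other.
cols <- names(mtcars)[3:9]

# Each unordered pair once: one row per pair, two columns.
pairs <- t(combn(cols, 2))
colnames(pairs) <- c("x", "y")

# Both orderings of every pair, dropping a column paired with itself.
both <- subset(
  expand.grid(x = cols, y = cols, stringsAsFactors = FALSE),
  x != y
)

head(pairs, 3)
nrow(both)
```

Seven columns give 21 unordered pairs, or 42 rows when both orderings are kept.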

Now use our tibble of column pairs as the start of a pipe to rowwise() and group_map()...

colsToPlot %>% 
  rowwise() %>% 
  group_map(
    function(.x, .y) {
      auto_graph(mtcars, .x$V1, .x$V2)
    }
  )

group_map returns a list containing the results of applying the function given as its argument to each group of the input data frame. I've grouped the tibble of column pairs by row, so the function is applied to every pair of columns. The function follows the convention used in the online documentation: .x is a data frame containing the data of the current group, which in this case holds the names of the columns used to define the x and y axes. The .y argument is a single-row data frame containing the definition of the current group. It's irrelevant here.

I won't post the resulting list of 55 graphs here, but it works.
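
The question also wanted each plot saved to disk, which the commented-out ggsave() line hints at. One way this might look is sketched below. It is not part of the solution above: plot_file_name() is a hypothetical helper, the output directory is a temporary one chosen for illustration, and the plot is reduced to a bare scatter plot to keep the example short.

```r
library(ggplot2)

# Hypothetical helper: builds a file name such as "mpg_vs_cyl.png".
plot_file_name <- function(x, y) paste0(x, "_vs_", y, ".png")

# A small subset of column pairs keeps the example quick.
pairs <- t(combn(names(mtcars)[1:3], 2))

# Assumption: plots go to a temporary directory; substitute your own path.
out_dir <- tempdir()

# Build and save one plot per row of pairs; collect the saved paths.
saved <- vapply(seq_len(nrow(pairs)), function(i) {
  x <- pairs[i, 1]
  y <- pairs[i, 2]
  p <- ggplot(mtcars, aes(x = .data[[x]], y = .data[[y]])) +
    geom_point() +
    labs(title = paste(x, "vs", y), x = x, y = y)
  file <- plot_file_name(x, y)
  ggsave(filename = file, plot = p, path = out_dir, width = 6, height = 4)
  file.path(out_dir, file)
}, character(1))

saved
```

Using paste0() with a separator such as "_vs_" avoids the spaces that paste() would otherwise put into the file names.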

All the functions used in this solution reside in the tidyverse or base R, so library(tidyverse) should resolve any could not find function... errors.
