How to create regression curves using dplyr in R

Time:11-25

I have a data file like this (my original file has 4 categories of organisms):

organism    length   intersize
org1           201         38 
org1           334         4221    
org2           428         575  
org2           573         639  
org3           356         700
org3           2414        978 

I created a dplyr object and made plots of length against intersize. I would like to calculate and plot a regression curve for each organism, and also include a global regression line. How can I do this in R?

CodePudding user response:

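geom_smooth(mapping = aes(color = organism)) fits a separate regression for each group defined in the organism column, and method = "lm" makes each fit a straight line. A second geom_smooth() layer that maps color to a constant instead of organism adds the pooled (global) line: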

library(tidyverse)

data <- tribble(
  ~organism, ~Gene_length, ~intersize,
  "org1",     201,           38,
  "org1",     334,         4221,
  "org2",     428,          575,
  "org2",     573,          639,
  "org3",     356,          700,
  "org3",    2414,          978
)

data |>
  ggplot(aes(Gene_length, intersize)) +
    geom_point(aes(color = organism)) +
    # one line fitted to all rows, labelled "pooled" in the legend
    geom_smooth(aes(color = "pooled"), method = "lm", se = FALSE) +
    # one line per organism
    geom_smooth(aes(color = organism), method = "lm", se = FALSE)
#> `geom_smooth()` using formula 'y ~ x'
#> `geom_smooth()` using formula 'y ~ x'

Created on 2022-11-24 by the reprex package (v2.0.1)
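
The question also asks how to calculate the regressions, not only draw them. As a rough sketch (not part of the original answer), the intercept and slope of each line could be collected with dplyr's group_modify(). The helper name fit_coefs is made up for illustration, and a straight-line lm() fit is assumed, matching method = "lm" in the plot above:

```r
library(dplyr)

data <- tibble::tribble(
  ~organism, ~Gene_length, ~intersize,
  "org1",     201,           38,
  "org1",     334,         4221,
  "org2",     428,          575,
  "org2",     573,          639,
  "org3",     356,          700,
  "org3",    2414,          978
)

# Hypothetical helper: fit intersize ~ Gene_length on a data frame
# and return a one-row tibble with the two coefficients.
fit_coefs <- function(d) {
  fit <- lm(intersize ~ Gene_length, data = d)
  tibble::tibble(intercept = coef(fit)[[1]], slope = coef(fit)[[2]])
}

# One fit per organism; group_modify() passes each group's rows as .x
per_organism <- data |>
  group_by(organism) |>
  group_modify(~ fit_coefs(.x)) |>
  ungroup()

# One fit over all rows, labelled "pooled" like the legend entry in the plot
pooled <- bind_cols(tibble::tibble(organism = "pooled"), fit_coefs(data))

coefs <- bind_rows(per_organism, pooled)
coefs
```

With the sample data each organism has only two rows, so every per-organism line passes exactly through its two points. The real data would need more observations per group for the fits to be informative.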

CodePudding user response:

Here is a basic plot in base R.

df <- read.table(text=
"
  organism    length   intersize
  org1           201         38 
  org1           334         4221    
  org2           428         575  
  org2           573         639  
  org3           356         700
  org3           2414        978 
", header = TRUE)


with(df, plot(length, intersize, main = "Base R Plot", type = "n"))
with(subset(df, organism == "org1"), points(length, intersize, col = "blue"))
with(subset(df, organism == "org1"), abline(lm(intersize~length), col = "blue"))
with(subset(df, organism == "org2"), points(length, intersize, col = "red"))
with(subset(df, organism == "org2"), abline(lm(intersize~length), col = "red"))
with(subset(df, organism == "org3"), points(length, intersize, col = "green"))
with(subset(df, organism == "org3"), abline(lm(intersize~length), col = "green"))
with(df, abline(lm(intersize~length), col = "black"))
# the pooled entry is a line only, so it gets no point symbol (pch = NA)
legend("topright", pch = c(1, 1, 1, NA), lty = 1, col = c("blue", "red", "green", "black"), legend = c("org1", "org2", "org3", "pooled"))
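
Since the original file has 4 categories, repeating the points/abline pair by hand gets tedious. One possible way to generalise the code above is to loop over the groups. This is only a sketch: the rainbow() palette and the variable names groups, cols, fits and pooled are arbitrary choices, not something from the original answer.

```r
df <- data.frame(
  organism  = c("org1", "org1", "org2", "org2", "org3", "org3"),
  length    = c(201, 334, 428, 573, 356, 2414),
  intersize = c(38, 4221, 575, 639, 700, 978)
)

groups <- unique(df$organism)
# one colour per organism, named so it can be looked up by group label
cols <- setNames(rainbow(length(groups)), groups)

# scatter plot of all points, coloured by organism
plot(df$length, df$intersize, col = cols[df$organism],
     xlab = "length", ylab = "intersize", main = "Base R Plot")

# fit one linear model per organism and draw each line in its colour
fits <- lapply(split(df, df$organism),
               function(d) lm(intersize ~ length, data = d))
for (g in groups) {
  abline(fits[[g]], col = cols[[g]])
}

# pooled fit over all rows
pooled <- lm(intersize ~ length, data = df)
abline(pooled, col = "black")

legend("topright", lty = 1, col = c(cols, "black"),
       legend = c(groups, "pooled"))
```

The fitted models stay available in fits and pooled, so coef(fits[["org1"]]) or summary(pooled) can be used to inspect the numbers behind each line.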
