Extend line length with geom_line

Time:01-05

I want to draw three lines on a graph, overlaid on the data points that I used in a discriminant function analysis. From my analysis I have two points that fall on each line. The lines represent probability contours of the classification scheme; exactly how I got the points on each line is not relevant to my question here. However, I want the lines to extend further than the points that define them.

df <- 
  data.frame(Prob = rep(c("5", "50", "95"), each=2),
             Wing = rep(c(107,116), 3),
             Bill = c(36.92055, 36.12167, 31.66012, 30.86124, 26.39968, 25.6008))

library(ggplot2)

ggplot() +
  geom_line(data = df, aes(x = Bill, y = Wing, group = Prob, color = Prob))

The above df is a dataframe for my points from which the three lines are constructed. I want the lines to extend from y=105 to y=125. Thanks!

CodePudding user response:

There are probably more idiomatic ways of doing it, but this is one way to get it done.

In short, you calculate the linear formula for each line from its two points, i.e. y = mx + c, where m is the slope and c is the intercept.

library(dplyr)
library(ggplot2)

df_withFormula <- df |>
  group_by(Prob) |>
  # This mutate call creates the slope and intercept that geom_abline needs in the plotting stage.
  # lag() returns NA for the first row of each group, so only the second row of each group gets values.
  mutate(increaseBill = Bill - lag(Bill),
         increaseWing = Wing - lag(Wing),
         slope = increaseWing / increaseBill,
         intercept = Wing - slope * Bill)
# increaseBill, increaseWing and slope could all be combined into one calculation, but I thought it was easier to understand this way.

ggplot(df_withFormula, aes(Bill, Wing, color = Prob)) +
  # Added just so there is something to plot on top of. You could remove this and instead define all the limits manually (expand_limits would work).
  geom_point() +
  # This plots the three lines. Rows with NA are dropped automatically (ggplot2 may emit a warning). More explicit handling of the NAs could be done in the data prep stage.
  geom_abline(aes(slope = slope, intercept = intercept, color = Prob)) +
  # This is the crucial part: it lets you define the range of the plot window. As ablines are infinite, you can define whatever limits you want.
  expand_limits(y = c(105, 125))

Hope this helps you get the graph you want.

This is very much dependent on the structure of your data (two points per line), though it could be changed to fit different shapes.
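As a quick check of the y = mx + c arithmetic described above, here is a minimal base-R sketch. The helper name line_through is hypothetical (it does not appear in either answer), and the sketch assumes exactly two points per Prob group, as in the question's data.

```r
# Data frame from the question.
df <- data.frame(
  Prob = rep(c("5", "50", "95"), each = 2),
  Wing = rep(c(107, 116), 3),
  Bill = c(36.92055, 36.12167, 31.66012, 30.86124, 26.39968, 25.6008)
)

# Hypothetical helper: slope (m) and intercept (c) of the line through
# the two points of one group, with Bill as x and Wing as y.
line_through <- function(d) {
  slope <- diff(d$Wing) / diff(d$Bill)
  intercept <- d$Wing[1] - slope * d$Bill[1]
  data.frame(Prob = d$Prob[1], slope = slope, intercept = intercept)
}

# One row of coefficients per Prob group.
coefs <- do.call(rbind, lapply(split(df, df$Prob), line_through))

# Sanity check: each line should reproduce both of its defining points.
idx <- match(df$Prob, coefs$Prob)
fitted <- coefs$slope[idx] * df$Bill + coefs$intercept[idx]
stopifnot(isTRUE(all.equal(fitted, df$Wing)))
```

If the check passes, the slope and intercept columns of coefs can be passed to geom_abline in the same way as in the plotting code above.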

CodePudding user response:

This is similar to the approach by @James in that I compute the slopes and intercepts from the given data and use geom_abline to plot the lines, but it uses

  • summarise instead of mutate, to get rid of the NA values
  • geom_blank instead of geom_point, so that only the lines are displayed and not the points (Note: having another geom is crucial to set the scale, i.e. the range of the data, so that the lines show up).
library(dplyr)
library(ggplot2)

df_line <- df |> 
  group_by(Prob) |> 
  summarise(slope = diff(Wing) / diff(Bill),
            intercept = first(Wing) - slope * first(Bill))

ggplot(df, aes(x = Bill, y = Wing)) +
  geom_blank() +
  geom_abline(data = df_line, aes(slope = slope, intercept = intercept, color = Prob)) +
  scale_y_continuous(limits = c(105, 125))
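If the lines should stop exactly at y = 105 and y = 125 rather than run across the whole panel, one possible variation (not taken from either answer) is to invert y = mx + c and compute the Bill value at each target Wing value, then draw the result with geom_line as in the question. This is only a sketch: the names extend_group, y_range and df_ext are hypothetical, and it again assumes two points per group.

```r
# Data frame from the question.
df <- data.frame(
  Prob = rep(c("5", "50", "95"), each = 2),
  Wing = rep(c(107, 116), 3),
  Bill = c(36.92055, 36.12167, 31.66012, 30.86124, 26.39968, 25.6008)
)

# Target range for the y axis (Wing), as requested in the question.
y_range <- c(105, 125)

# Hypothetical helper: for one group, solve Wing = slope * Bill + intercept
# for Bill at the two target Wing values.
extend_group <- function(d) {
  slope <- diff(d$Wing) / diff(d$Bill)
  intercept <- d$Wing[1] - slope * d$Bill[1]
  data.frame(Prob = d$Prob[1],
             Wing = y_range,
             Bill = (y_range - intercept) / slope)
}

# Two extended endpoints per Prob group.
df_ext <- do.call(rbind, lapply(split(df, df$Prob), extend_group))

# Plot only if ggplot2 is installed; the endpoint calculation above is base R.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df_ext, aes(x = Bill, y = Wing, group = Prob, color = Prob)) +
    geom_line()
  print(p)
}
```

Compared with geom_abline, this keeps each line as ordinary data, so the axis limits follow from the endpoints and no extra geom is needed to set the scale.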
