Connected scatter plot by 2 groups

Time:11-23

I have the following data set:

df2.summ <- structure(list(Species = c("Large meso-predator", "Large meso-predator", 
"Large meso-predator", "Large meso-predator", "Large meso-predator", 
"Large meso-predator", "Large meso-predator", "Large meso-predator", 
"Large meso-predator", "Large meso-predator", "Large meso-predator", 
"Large meso-predator", "Large meso-predator", "Large meso-predator", 
"Large meso-predator", "Large meso-predator", "Large meso-predator", 
"Large meso-predator", "Apex predator", "Apex predator", "Apex predator", 
"Apex predator", "Apex predator", "Apex predator", "Apex predator", 
"Apex predator", "Apex predator", "Apex predator", "Apex predator", 
"Apex predator"), Habitat = c("lagoon", "lagoon", "lagoon", "lagoon", 
"lagoon", "lagoon", "bank", "bank", "bank", "bank", "bank", "bank", 
"pelagic", "pelagic", "pelagic", "pelagic", "pelagic", "pelagic", 
"bank", "bank", "bank", "bank", "bank", "bank", "pelagic", "pelagic", 
"pelagic", "pelagic", "pelagic", "pelagic"), detect_dist = c(1.5, 
2, 2.5, 3, 3.5, 4, 1.5, 2, 2.5, 3, 3.5, 4, 1.5, 2, 2.5, 3, 3.5, 
4, 1.5, 2, 2.5, 3, 3.5, 4, 1.5, 2, 2.5, 3, 3.5, 4), avg = c(37.4509803921569, 
37.1960784313725, 37.8235294117647, 37.3725490196078, 37.0980392156863, 
38.2156862745098, 27.3137254901961, 29.6862745098039, 28.6274509803922, 
28.5098039215686, 28.2941176470588, 27.3921568627451, 35.2352941176471, 
33.078431372549, 33.5490196078431, 34.0392156862745, 34.5686274509804, 
34.3725490196078, 7.19607843137255, 7.05882352941176, 7.33333333333333, 
7.23529411764706, 7.37254901960784, 7.35294117647059, 2.47058823529412, 
2.6078431372549, 2.49019607843137, 2.45098039215686, 2.2156862745098, 
2.37254901960784), sd = c(4.49583685420277, 6.23544579911697, 
4.68062338734037, 4.74957170411702, 4.5618193824867, 4.90841614164975, 
4.81036462683831, 5.84975280188294, 3.64944261121462, 5.04131946624932, 
4.02638357659604, 5.1926040918697, 4.47923312764191, 5.50579017854804, 
4.8303777305308, 4.84545471267135, 4.36923289359029, 4.64310578950653, 
1.45629128738913, 1.50215531428521, 1.25962957517941, 1.49115036524313, 
1.45548320929821, 1.29342227306885, 1.34689184683063, 1.45709891733607, 
1.22266183418978, 1.3610837665654, 1.4044746418529, 1.2800122548433
)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-30L))

which looks like this:

# A tibble: 30 × 5
   Species             Habitat detect_dist   avg    sd
   <chr>               <chr>         <dbl> <dbl> <dbl>
 1 Large meso-predator lagoon          1.5  37.5  4.50
 2 Large meso-predator lagoon          2    37.2  6.24
 3 Large meso-predator lagoon          2.5  37.8  4.68
 4 Large meso-predator lagoon          3    37.4  4.75
 5 Large meso-predator lagoon          3.5  37.1  4.56
 6 Large meso-predator lagoon          4    38.2  4.91
 7 Large meso-predator bank            1.5  27.3  4.81
 8 Large meso-predator bank            2    29.7  5.85
 9 Large meso-predator bank            2.5  28.6  3.65
10 Large meso-predator bank            3    28.5  5.04
# … with 20 more rows

I would like to draw a connected scatter plot in which the lines connect data points within each habitat-species combination only, but I cannot get this to work. Whatever parameters I choose for geom_line() or even geom_path(), the lines end up connecting points across different species-habitat combinations.

Here is the code of a grouped scatter plot:

ggplot(df2.summ, aes(x = detect_dist, y = avg, group = Habitat)) +
  geom_point(aes(color = Habitat, shape = Species), size = 2) +
  geom_errorbar(aes(ymin = avg - sd, ymax = avg + sd, color = Habitat), size = .5, width = .01) +
  scale_color_manual(values = hab.col2) +
  xlab('Threat detection distance') + ylab('Abundance') +
  theme_bw() +
  theme(text = element_text(size = 12, family = "Times"))

Here is one of the attempts that doesn't yield the desired result:

ggplot(df2.summ, aes(x = detect_dist, y = avg, group = Habitat)) +
  geom_point(aes(color = Habitat, shape = Species), size = 2) +
  geom_path(aes(color = Habitat, group = Habitat)) +
  geom_errorbar(aes(ymin = avg - sd, ymax = avg + sd, color = Habitat), size = .5, width = .01) +
  scale_color_manual(values = hab.col2) +
  xlab('Threat detection distance') + ylab('Abundance') +
  theme_bw() +
  theme(text = element_text(size = 12, family = "Times"))
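
A quick way to see why group = Habitat cannot work is to count the rows per group. The sketch below rebuilds only the key columns of the data set above (avg and sd are left out because only the grouping matters here); the names keys, rows_per_habitat and rows_per_combo are made up for illustration.

```r
# Key columns of the data set: Species, Habitat and detection distance.
# The layout mirrors the dput() output above; avg and sd are omitted.
dist <- seq(1.5, 4, by = 0.5)
keys <- data.frame(
  Species = rep(c("Large meso-predator", "Apex predator"), times = c(18, 12)),
  Habitat = rep(c("lagoon", "bank", "pelagic", "bank", "pelagic"), each = 6),
  detect_dist = rep(dist, times = 5)
)

# Rows per Habitat: "bank" and "pelagic" each contain two species,
# so a single path per habitat has to run through both of them.
rows_per_habitat <- table(keys$Habitat)

# Rows per Species-Habitat combination: one group per line that should be drawn.
rows_per_combo <- table(interaction(keys$Species, keys$Habitat, drop = TRUE))

print(rows_per_habitat)
print(rows_per_combo)
```

Grouping by Habitat alone gives three groups, two of which mix both species, whereas the desired plot needs one group per Species-Habitat combination.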

Any help is greatly appreciated.

CodePudding user response:

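With a few small adaptations to your code, it can be done this way. Mapping color = Habitat and shape = Species in the top-level aes() lets ggplot2 group by the combination of both discrete variables, so geom_line() connects points only within each habitat-species combination: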

library(ggplot2)

ggplot(df2.summ, aes(x = detect_dist, y = avg, color = Habitat, shape = Species)) +
  geom_point(size = 2) +
  geom_errorbar(aes(ymin = avg - sd, ymax = avg + sd, color = Habitat), size = .5, width = .01) +
  geom_line() +
  # scale_color_manual(values = hab.col2) +
  xlab('Threat detection distance') + ylab('Abundance') +
  theme_bw() +
  theme(text = element_text(size = 12, family = "Times"))
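
If you would rather not rely on the implicit grouping, the group can also be spelled out with interaction(). The following is only a sketch: plot_connected() is a hypothetical helper name, and toy is a small stand-in built from four rows of the question's data (rounded), used here so the snippet runs on its own. With the full data you would call plot_connected(df2.summ).

```r
library(ggplot2)

# Hypothetical helper: draws one line per Species-Habitat combination
# by setting the group aesthetic explicitly on the line layer.
plot_connected <- function(df) {
  ggplot(df, aes(x = detect_dist, y = avg, color = Habitat)) +
    geom_line(aes(group = interaction(Species, Habitat))) +
    geom_errorbar(aes(ymin = avg - sd, ymax = avg + sd), width = 0.01) +
    geom_point(aes(shape = Species), size = 2) +
    labs(x = "Threat detection distance", y = "Abundance") +
    theme_bw()
}

# Stand-in data: the "bank" rows at distances 1.5 and 2 for both species,
# with avg and sd rounded from the question's data set.
toy <- data.frame(
  Species = rep(c("Large meso-predator", "Apex predator"), each = 2),
  Habitat = "bank",
  detect_dist = c(1.5, 2, 1.5, 2),
  avg = c(27.3, 29.7, 7.2, 7.1),
  sd = c(4.81, 5.85, 1.46, 1.50)
)

p <- plot_connected(toy)

# Inspect the groups ggplot2 computed for the line layer (layer 1):
# there should be one group per species within the single habitat.
line_groups <- unique(layer_data(p, 1)$group)
print(line_groups)
```

Setting group only inside geom_line() keeps the points and error bars unaffected, and the lines stay separated per combination even if the color or shape mappings are changed later.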
