How to add a regression line for only part of the dataset with ggplot2?

Time:12-30

I have a data set as shown below:

ALL<- structure(list(GI = c(38.448275862069, 40.2659574468085, 85.3378378378378, 
56.4606741573034, 26.5714285714286, 16.8944099378882), GI_D = c(31.5275862068966, 
37.0446808510638, 64.0033783783784, 45.7331460674157, 20.7257142857143, 
14.1913043478261), GI_W = c(34.84375, 39.4270833333333, 83.0921052631579, 
54.6195652173913, 25.5963302752294, 16.4848484848485), NEE_D_mean = c(9.9644036070938, 
-5.49181483024952, -29.5841687938457, -10.950117466455, -9.76133775037159, 
-1.17370950853892), NEE_D_se = c(24.4055666454516, 8.31286897717958, 
43.0803839446216, 42.0054504158082, 28.7765100449838, 8.86774764999355
), NEE_W_mean = c(-10.6866769282934, 20.9456806199394, -24.0380682586804, 
52.3723812566745, -62.2858574112861, 56.3557615426375), NEE_W_se = c(15.2426118086142, 
17.8227858145903, 22.7452815581715, 38.4251278858896, 19.1950340008666, 
25.59062272811), GPP_D_mean = c(2.76586256588453, -14.0740484535984, 
22.0551675189495, 38.2196758481854, -22.2452106112792, 2.92247497333855
), GPP_D_se = c(10.0301104827162, 4.76830515667558, 10.1200654792974, 
13.6220945562145, 12.5521089272372, 4.02070599220442), GPP_W_mean = c(-13.3583364224079, 
5.5457128851295, 6.96224944388818, 30.9347346550519, -24.0637392356731, 
31.1919112040759), GPP_W_se = c(7.79177565854901, 7.68225824264646, 
7.53759987843893, 9.21062180693269, 11.5998936888688, 4.91032534186175
), RE_D_mean = c(-6.92656657644594, -20.2249090077204, -1.55891573291113, 
15.3619823271736, -59.6169736724781, 0.0398744940922411), RE_D_se = c(8.81296607135718, 
3.17951327169943, 7.26103092218914, 9.79375075847273, 33.89046634443, 
3.15632251128507), RE_W_mean = c(-11.2826765406364, -5.50930629197934, 
-7.35527862198859, -3.3802491396303, -5.7039196948544, 15.5927675710877
), RE_W_se = c(7.82782177993256, 3.28089787167971, 5.27000717925753, 
5.7667863399033, 10.1830962186111, 3.17699751136105), site = c("DK_M", 
"DK_B", "UK", "NL", "HU", "IT")), row.names = c(NA, -6L), class = "data.frame")

I want to add a regression line for only the points where GI > 39, but I have no idea why the line doesn't show up with my code.

  ggplot(ALL, mapping = aes(x=GI, y=NEE_D_mean, color= site)) +
  geom_point() +
  geom_hline(yintercept = 0) +  #add a horizontal line= 0
  geom_errorbar(aes(ymin=NEE_D_mean-NEE_D_se, ymax=NEE_D_mean+NEE_D_se), width=0.5, size=1) +
  labs(y='Drought change of NEE from control % ', x= 'Gaussen Index of Aridity', color= ' ') +  #here, note: x and y axis title is reversed.
  #geom_smooth (data = subset(ALL, GI >= 39),aes(x=GI, y=NEE_D_mean),method='lm') +
  geom_smooth(method = lm) +
  theme_bw() +
  #theme(panel.grid.major = element_blank(),panel.grid.minor = element_blank()) +
  theme(legend.position = c(0.85, 0.3)) +  #change the legend position
  theme(legend.title = element_blank()) +  #Change the legend title to blank
  theme_bw() +
  theme(panel.grid.major = element_blank(),panel.grid.minor = element_blank())

I hope someone can help me out. Why is there no regression line in my graph?

CodePudding user response:

You don't see a line in your plot because the geom_smooth layer inherits the color = site aesthetic and therefore tries to fit a separate regression line for each site. Since you have only 1 data point per site, no regression lines are produced.
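To make the "one point per site" problem concrete, here is a minimal base-R sketch. It is only an analogue of what happens per group, not ggplot2's internal code. The data frame `d` is a reduced copy of the question's data with values rounded for brevity, and the names `d`, `rows_per_site`, `one_site` and `slope_one_site` are made up for this illustration.

```r
# Reduced copy of the question's data (values rounded; only the columns used here).
d <- data.frame(
  GI         = c(38.45, 40.27, 85.34, 56.46, 26.57, 16.89),
  NEE_D_mean = c(9.96, -5.49, -29.58, -10.95, -9.76, -1.17),
  site       = c("DK_M", "DK_B", "UK", "NL", "HU", "IT")
)

# Each site contributes exactly one row, so grouping by site
# produces six groups of size 1.
rows_per_site <- table(d$site)
print(rows_per_site)

# Fitting a straight line to a single point: the intercept can be
# estimated, but the slope cannot, so its coefficient is NA.
one_site <- subset(d, site == "UK")
slope_one_site <- unname(coef(lm(NEE_D_mean ~ GI, data = one_site))["GI"])
print(slope_one_site)  # NA
```

With no usable slope in any group, there is nothing for the smoothing layer to draw.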

We can set the geom_smooth color to a constant, which removes the per-site grouping for that layer so that all sites are fit together. And since you only want to include observations with GI > 39, we can pass a filtered data set to that layer (it looks like you attempted this in the commented-out code, but it failed because of the color issue):

library(ggplot2)

ggplot(ALL, mapping = aes(x=GI, y=NEE_D_mean, color= site)) +
  geom_point() +
  geom_hline(yintercept = 0) +  #add a horizontal line= 0
  geom_errorbar(aes(ymin=NEE_D_mean-NEE_D_se, ymax=NEE_D_mean+NEE_D_se), width=0.5, size=1) +
  labs(y='Drought change of NEE from control % ', x= 'Gaussen Index of Aridity', color= ' ') +  #here, note: x and y axis title is reversed.
  geom_smooth(data = subset(ALL, GI > 39), method = lm, color = "red") +  #one fit across all sites with GI > 39
  theme_bw() +
  #theme(panel.grid.major = element_blank(),panel.grid.minor = element_blank()) +
  theme(legend.position = c(0.85, 0.3)) +  #change the legend position
  theme(legend.title = element_blank()) +  #Change the legend title to blank
  theme_bw() +
  theme(panel.grid.major = element_blank(),panel.grid.minor = element_blank())
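As a quick sanity check, the same subset can be fit directly with lm() in base R to see which sites the red line is based on. This is a sketch using a reduced, rounded copy of the question's data. The names `d`, `sub`, `fit`, `n_used` and `slope` are made up for this illustration, and the fit should correspond to what geom_smooth(method = lm) draws for that subset, assuming its default formula y ~ x.

```r
# Reduced copy of the question's data (values rounded; only the columns used here).
d <- data.frame(
  GI         = c(38.45, 40.27, 85.34, 56.46, 26.57, 16.89),
  NEE_D_mean = c(9.96, -5.49, -29.58, -10.95, -9.76, -1.17),
  site       = c("DK_M", "DK_B", "UK", "NL", "HU", "IT")
)

# Same filter as in the geom_smooth layer: keep only rows with GI > 39.
sub <- subset(d, GI > 39)
n_used <- nrow(sub)
print(sub$site)  # sites that contribute to the regression line

# Ordinary least-squares fit over the pooled subset (no per-site grouping).
fit <- lm(NEE_D_mean ~ GI, data = sub)
slope <- unname(coef(fit)["GI"])
print(coef(fit))
```

Only three sites (DK_B, UK and NL) pass the filter, so the line rests on three points and its confidence band will be correspondingly wide.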
