Why is the placement of the error bars incorrect in this barchart using ggplot?

Time:12-14

It seems that the error bars are placed incorrectly in my plot, and I can't quite figure out why. I must have missed something...

My plot looks like this (image not included):

My data looks like this:

# A tibble: 9 × 6
# Groups:   variable [3]
  variable       gfp   stdev gfp_mean     ymax     ymin
  <chr>        <dbl>   <dbl>    <dbl>    <dbl>    <dbl>
1 gfp_sums   4286898 466912. 4478746  4945658. 4011834.
2 nc_sums     664845   4378.  662308.  666686.  657930.
3 media_sums  778269  29403.  744335   773738.  714932.
4 gfp_sums   5011021 466912. 4478746  4945658. 4011834.
5 nc_sums     657253   4378.  662308.  666686.  657930.
6 media_sums  726416  29403.  744335   773738.  714932.
7 gfp_sums   4138319 466912. 4478746  4945658. 4011834.
8 nc_sums     664827   4378.  662308.  666686.  657930.
9 media_sums  728320  29403.  744335   773738.  714932.

And it is plotted using this:

ggplot(gfp_long) +
  geom_bar( aes(x=variable, y=gfp_mean), stat="identity", fill="skyblue", alpha=0.7) +
  geom_errorbar( aes(x=variable, ymin=ymin, ymax=ymax), width=0.4, colour="orange", alpha=0.9, size=1.3)

Thanks in advance.

dput dataframe:

structure(list(variable = c("gfp_sums", "nc_sums", "media_sums", 
"gfp_sums", "nc_sums", "media_sums", "gfp_sums", "nc_sums", "media_sums"
), gfp = c(4286898, 664845, 778269, 5011021, 657253, 726416, 
4138319, 664827, 728320), stdev = c(466911.593911524, 4378.05634195511, 
29403.1217900413, 466911.593911524, 4378.05634195511, 29403.1217900413, 
466911.593911524, 4378.05634195511, 29403.1217900413), gfp_mean = c(4478746, 
662308.333333333, 744335, 4478746, 662308.333333333, 744335, 
4478746, 662308.333333333, 744335), ymax = c(4945657.59391152, 
666686.389675288, 773738.121790041, 4945657.59391152, 666686.389675288, 
773738.121790041, 4945657.59391152, 666686.389675288, 773738.121790041
), ymin = c(4011834.40608848, 657930.276991378, 714931.878209959, 
4011834.40608848, 657930.276991378, 714931.878209959, 4011834.40608848, 
657930.276991378, 714931.878209959)), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -9L), groups = structure(list(
    variable = c("gfp_sums", "media_sums", "nc_sums"), .rows = structure(list(
        c(1L, 4L, 7L), c(3L, 6L, 9L), c(2L, 5L, 8L)), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -3L), .drop = TRUE))

CodePudding user response:

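Each variable name appears three times in your data, so every bar and error bar is drawn three times. geom_bar stacks by default (position = "stack"), so the three copies of gfp_mean are piled on top of each other and each bar ends up three times as tall as the mean. geom_errorbar does not stack, so the error bars are drawn at the real ymin/ymax, well below the top of the bar.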

If you look at table(gfp_long$variable), you have:

#  gfp_sums media_sums    nc_sums 
#         3          3          3 

gfp_long[,1]

#  variable  
#  <chr>     
#1 gfp_sums  
#2 nc_sums   
#3 media_sums
#4 gfp_sums  
#5 nc_sums   
#6 media_sums
#7 gfp_sums  
#8 nc_sums   
#9 media_sums
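As a rough check of how the repeated rows affect the bar height, the sketch below adds up gfp_mean per variable, which is what a stacked bar would show, and compares it with the mean itself. The vectors are copied from the dput output in the question; the name stacked_height is only illustrative.

```r
# Values copied from the question's data: each mean is repeated three times
variable <- rep(c("gfp_sums", "nc_sums", "media_sums"), times = 3)
gfp_mean <- rep(c(4478746, 662308.333333333, 744335), times = 3)

# Height of a stacked bar = sum of all y values that share the same x
stacked_height <- tapply(gfp_mean, variable, sum)

# Ratio of stacked height to the actual mean: 3 for every variable
ratio <- stacked_height / tapply(gfp_mean, variable, mean)
print(ratio)
```

So the bars reach roughly three times the mean, while the error bars stay at mean ± stdev.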

So if you plot just the first three rows (the first instance of each of the three variable types), it works fine:

ggplot(gfp_long[1:3,]) +
  geom_bar(aes(x = variable, y = gfp_mean), stat = "identity", 
           fill = "skyblue", alpha = 0.7) +
  geom_errorbar(aes(x = variable, ymin = ymin, ymax = ymax), 
                width = 0.4, colour = "orange", alpha = 0.9, size = 1.3)
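Picking rows 1:3 relies on the row order. A possibly more robust variant, sketched below, collapses the data to one row per variable before plotting. It assumes dplyr is available and that the summary columns are identical within each variable, as they are in the question's data. The name gfp_summary is made up for this example, and the data frame is rebuilt from the raw gfp values only to keep the sketch self-contained.

```r
library(dplyr)
library(ggplot2)

# Rebuild the question's data from the raw gfp values (assumed layout)
gfp_long <- tibble(
  variable = rep(c("gfp_sums", "nc_sums", "media_sums"), times = 3),
  gfp = c(4286898, 664845, 778269, 5011021, 657253, 726416,
          4138319, 664827, 728320)
) %>%
  group_by(variable) %>%
  mutate(gfp_mean = mean(gfp), stdev = sd(gfp),
         ymin = gfp_mean - stdev, ymax = gfp_mean + stdev)

# Keep one row per variable so nothing is stacked or overplotted
gfp_summary <- gfp_long %>%
  ungroup() %>%
  distinct(variable, gfp_mean, ymin, ymax)

# geom_col() is the shorthand for geom_bar(stat = "identity")
p <- ggplot(gfp_summary, aes(x = variable)) +
  geom_col(aes(y = gfp_mean), fill = "skyblue", alpha = 0.7) +
  geom_errorbar(aes(ymin = ymin, ymax = ymax),
                width = 0.4, colour = "orange", alpha = 0.9)
print(p)
```

With one row per variable, each bar is drawn once at gfp_mean and the error bar sits around its top.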

I don't know enough about what you want to do to provide further assistance.

You may want to make a group variable and facet:

gfp_long$group <- rep(1:3, each = 3)

ggplot(gfp_long) +
  geom_bar(aes(x = variable, y = gfp_mean), stat = "identity", 
           fill = "skyblue", alpha = 0.7) +
  geom_errorbar(aes(x = variable, ymin = ymin, ymax = ymax), 
                width = 0.4, colour = "orange", alpha = 0.9, size = 1.3) +
  facet_grid(~group)
