Barplot the mean of two columns in R

Time:04-15

I want to make a simple bar chart showing the mean and standard deviation of two columns of a CSV, but I can't figure out how.

For example, say the csv looks like this.

unfiltered_data,filtered_data
2,1
3,4
5,6
7,8

Then using

test <- read.csv("Performance Metric Testing/test.csv")

ggplot(test, aes(unfiltered_data, filtered_data)) +
  geom_bar(stat = "summary", fun.y = "mean")

Outputs an odd four-bar graph. I can't seem to understand how to use the ggplot2 package.


Answer:

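In your call, aes(unfiltered_data, filtered_data) maps the first column to x and the second to y, so you get one bar for each of the four distinct x values. As is often the case, you have to convert your data to long (tidy) format to get a bar chart showing the mean of each column: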

test_long <- tidyr::pivot_longer(test, everything(), names_to = "data")
test_long
#> # A tibble: 8 × 2
#>   data            value
#>   <chr>           <int>
#> 1 unfiltered_data     2
#> 2 filtered_data       1
#> 3 unfiltered_data     3
#> 4 filtered_data       4
#> 5 unfiltered_data     5
#> 6 filtered_data       6
#> 7 unfiltered_data     7
#> 8 filtered_data       8
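
If tidyr is not installed, base R's stack() can do a similar reshape. This is only a sketch under that assumption: the name test_long_base is made up for illustration, and stack() names its columns values and ind, so they are renamed here to match the tidyr result. The rows come out column by column rather than interleaved, which makes no difference for the plot.

```r
# Sample data from the question
test <- data.frame(
  unfiltered_data = c(2L, 3L, 5L, 7L),
  filtered_data   = c(1L, 4L, 6L, 8L)
)

# stack() returns two columns: `values` and `ind` (a factor of the original column names)
test_long_base <- stack(test)
names(test_long_base) <- c("value", "data")

# Per-column means; these are the heights the bars should have
group_means <- tapply(test_long_base$value, test_long_base$data, mean)
group_means
```

Checking the group means first is a quick way to confirm that the bars in the final plot show what you expect.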

library(ggplot2)

ggplot(test_long, aes(data, value)) +
  # bars: mean of each column
  stat_summary(fun = "mean", geom = "col") +
  # point and range: mean plus or minus one standard error
  stat_summary(fun.data = "mean_se")
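
Note that mean_se draws the standard error, while the question asks for the standard deviation. One possible way to get that is to pass your own summary function to stat_summary(). The sketch below assumes the long data has the same data and value columns as above; mean_sd is a hypothetical helper name, not something ggplot2 provides.

```r
library(ggplot2)

# Same values as the sample data, already in long format
test_long <- data.frame(
  data  = rep(c("unfiltered_data", "filtered_data"), each = 4),
  value = c(2, 3, 5, 7, 1, 4, 6, 8)
)

# Hypothetical helper (not part of ggplot2): mean plus or minus one standard
# deviation, in the y / ymin / ymax layout that stat_summary(fun.data = ) expects
mean_sd <- function(x) {
  m <- mean(x)
  s <- sd(x)
  data.frame(y = m, ymin = m - s, ymax = m + s)
}

p <- ggplot(test_long, aes(data, value)) +
  stat_summary(fun = "mean", geom = "col") +
  stat_summary(fun.data = mean_sd, geom = "errorbar", width = 0.2)
p
```

With only four values per column the standard deviation is large relative to the mean, so expect wide error bars.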

DATA

test <- structure(list(unfiltered_data = c(2L, 3L, 5L, 7L), filtered_data = c(
  1L,
  4L, 6L, 8L
)), class = "data.frame", row.names = c(NA, -4L))