I want to make a simple bar chart showing the mean and standard deviation of two columns of a CSV file, but I can't figure it out.
For example, say the CSV looks like this:
unfiltered_data,filtered_data
2,1
3,4
5,6
7,8
Then using
test <- read.csv("Performance Metric Testing/test.csv")
ggplot(test, aes(unfiltered_data, filtered_data)) +
  geom_bar(stat = "summary", fun.y = "mean")
outputs an odd graph with four bars. I can't work out how to use the ggplot2 package for this.
Answer:
As is often the case with ggplot2, you first have to convert your data to long (tidy) format to get a bar chart showing the mean of each column:
test_long <- tidyr::pivot_longer(test, everything(), names_to = "data")
test_long
#> # A tibble: 8 × 2
#>   data            value
#>   <chr>           <int>
#> 1 unfiltered_data     2
#> 2 filtered_data       1
#> 3 unfiltered_data     3
#> 4 filtered_data       4
#> 5 unfiltered_data     5
#> 6 filtered_data       6
#> 7 unfiltered_data     7
#> 8 filtered_data       8
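library(ggplot2)
# Layers must be joined with `+`
ggplot(test_long, aes(data, value)) +
  stat_summary(fun = "mean", geom = "col") +    # bar height = mean of each column
  stat_summary(fun.data = "mean_se")            # point range = mean +/- 1 standard error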
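Note that `mean_se` draws the mean plus or minus one standard error, while the question asks for the standard deviation. A minimal sketch of one way to get standard-deviation error bars is below. The helper `mean_sd` is a hypothetical name introduced here for illustration, not a ggplot2 function; it returns the `y`, `ymin` and `ymax` columns that `stat_summary(fun.data = ...)` expects. The example data is rebuilt inline so the snippet runs on its own, and it assumes ggplot2 3.3.0 or later (for the `fun` argument) and tidyr are installed.

```r
library(ggplot2)

# Same example data as in the question
test <- data.frame(
  unfiltered_data = c(2, 3, 5, 7),
  filtered_data = c(1, 4, 6, 8)
)
test_long <- tidyr::pivot_longer(test, everything(), names_to = "data")

# Hypothetical helper (not part of ggplot2): mean and mean +/- 1 SD,
# returned in the y / ymin / ymax layout used by fun.data
mean_sd <- function(x) {
  data.frame(y = mean(x), ymin = mean(x) - sd(x), ymax = mean(x) + sd(x))
}

p <- ggplot(test_long, aes(data, value)) +
  stat_summary(fun = "mean", geom = "col") +                        # bars at the mean
  stat_summary(fun.data = mean_sd, geom = "errorbar", width = 0.2)  # +/- 1 SD
p
```

With only four observations per column the standard deviation is large relative to the mean, so expect wide error bars on this toy data.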
DATA
test <- structure(
  list(
    unfiltered_data = c(2L, 3L, 5L, 7L),
    filtered_data = c(1L, 4L, 6L, 8L)
  ),
  class = "data.frame",
  row.names = c(NA, -4L)
)