This is probably a very basic question, but I'm just starting out with R and hope someone can help.
I've imported some data into R and created an object containing just the data I'm working on first:
Each of the values is from a scale of 1 to 10.
What I want to produce is a chart showing the mean of each column, something like this (which I did in Excel):
I'm sure this is possible, but I'm going round in circles figuring it out! For now I'm ignoring the vertical line (at the maximum value) and the standard deviations, though ultimately I'd like to include them. Thank you!
CodePudding user response:
set.seed(42)
dat <- setNames(data.frame(replicate(4, sample(10, 50, replace=TRUE))), c("2000", "2400", "2800", "3200"))
head(dat)
#   2000 2400 2800 3200
# 1    1    6    5    1
# 2    5    6    9    1
# 3    1    2   10    5
# 4    9    4    8    3
# 5   10    3    7   10
# 6    4    6    6    1
library(dplyr)
library(tidyr) # pivot_longer
library(ggplot2)
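dat %>%
  pivot_longer(everything()) %>%                          # one row per (column, value) pair
  group_by(name) %>%
  summarize(value = mean(value), .groups = "drop") %>%    # mean of each original column
  mutate(name = as.integer(name)) %>%                     # column names become numeric x values
  ggplot(aes(name, value)) +
  geom_line()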
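Since the question mentions eventually wanting standard deviations, the same pipeline could be extended along these lines. This is only a sketch on the simulated `dat` from above; the column names `avg` and `sdev` and the error-bar width of 50 are arbitrary choices for illustration, not something taken from the original data.

```r
library(dplyr)
library(tidyr)   # pivot_longer
library(ggplot2)

# Same simulated data as above (an assumption standing in for the real data)
set.seed(42)
dat <- setNames(
  data.frame(replicate(4, sample(10, 50, replace = TRUE))),
  c("2000", "2400", "2800", "3200")
)

# Mean and standard deviation of each original column
summ <- dat %>%
  pivot_longer(everything()) %>%
  group_by(name) %>%
  summarize(avg = mean(value), sdev = sd(value), .groups = "drop") %>%
  mutate(name = as.integer(name))

# Line through the means, with error bars of one standard deviation either side
p <- ggplot(summ, aes(name, avg)) +
  geom_line() +
  geom_point() +
  geom_errorbar(aes(ymin = avg - sdev, ymax = avg + sdev), width = 50)
print(p)
```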
CodePudding user response:
It seems that you have encoded a numerical value in the column names, which is not a good idea because it violates the first normal form. I would therefore suggest transposing the data and storing that numerical value in the first column.
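As a rough illustration of that restructuring, base R's `stack()` can turn the wide table into a long one in which the number from the column name becomes an ordinary column. The simulated `dat` and the names `long` and `x` are assumptions made for this sketch.

```r
# Simulated stand-in for the real data (an assumption)
set.seed(42)
dat <- setNames(
  data.frame(replicate(4, sample(10, 50, replace = TRUE))),
  c("2000", "2400", "2800", "3200")
)

# stack() returns two columns: `values` and `ind` (the former column name, as a factor)
long <- stack(dat)
long$x <- as.numeric(as.character(long$ind))
long <- long[, c("x", "values")]

# With the data in this shape, the per-group means are a one-liner
means <- aggregate(values ~ x, data = long, FUN = mean)
```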
With your peculiar data structure, you must first extract the numbers from the column names with
x <- as.numeric(names(dat))
Then you can compute all column means with
y <- colMeans(dat)
And then you can plot the means against those numbers with
plot(x, y, type="l")
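The standard deviations and the vertical line mentioned in the question could be added in base graphics roughly as follows. This is a sketch on simulated data, and it assumes that "vertical line at maximum value" means a line at the x position with the highest mean, which the question does not spell out.

```r
# Simulated stand-in for the real data (an assumption)
set.seed(42)
dat <- setNames(
  data.frame(replicate(4, sample(10, 50, replace = TRUE))),
  c("2000", "2400", "2800", "3200")
)

x <- as.numeric(names(dat))   # numbers encoded in the column names
y <- colMeans(dat)            # mean of each column
s <- sapply(dat, sd)          # standard deviation of each column

# Widen the y axis so the error bars fit inside the plot
plot(x, y, type = "b", ylim = range(y - s, y + s), xlab = "column", ylab = "mean")

# Flat-headed arrows drawn in both directions serve as error bars (mean +/- 1 sd)
arrows(x, y - s, x, y + s, angle = 90, code = 3, length = 0.05)

# Dashed vertical line at the x position with the largest mean
abline(v = x[which.max(y)], lty = 2)
```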