Plot means of a dataset where each column is a different day

Time:09-22

Hi, I have what is probably a noob question, but I haven't touched R in quite a while. I have a big dataset where each column holds the measurements for one day (day 1, day 2, etc.), and the rows are the replicates of the different treatments. I already managed to make a table of the means, reducing the data so that the treatments are rows and the mean of each day is a column. Now I want to plot this data as a scatter or line plot, but I can't figure out what to use as the aes(x=). Is there also a way to plot everything with a single piece of code, without adding a separate geom_point() for each day and treatment?

The following is an example, because the real data is longer and more complex (24 days and 28 treatments, with 10 replicates each). How could I plot the data to look like this (Excel image)? Thank you in advance to everyone; any help or feedback would be highly appreciated.

#df#
treatment    day 1    day 2    day 3
1        t1 7.524814 8.330983 6.639391
2        t1 6.056334 6.138648 5.439239
3        t2 4.377818 4.964445 3.990593
4        t1 6.834753 7.070450 5.895462
5        t3 7.378768 8.375725 7.210010
6        t2 4.104087 4.942359 3.589360
7        t2 4.520651 4.775113 3.753422
8        t3 7.875438 8.543303 8.101697
9        t3 7.803648 8.232132 7.073342

mean<-aggregate(df[,2:4],list(df$treatment),mean)
sd<-aggregate(df[,2:4],list(df$treatment),sd)

#mean#
Group.1    day 1    day 2    day 3
1      t1 6.805300 7.180027 5.991364
2      t2 4.334185 4.893972 3.777792
3      t3 7.685951 8.383720 7.461683

ggplot() + geom_point(mean,aes(x=??,y=mean$"day 1")

CodePudding user response:

There are a couple of ways to achieve this. Both follow the same steps:

  1. bring your data into long format
  2. do some data wrangling (group and summarise)
  3. plot with ggplot()

Version 1:

library(tidyverse)

df %>% 
  # long format: one row per treatment replicate and day
  pivot_longer(
    cols = -treatment,
    names_to = "day",
    values_to = "values"
  ) %>% 
  # mean per treatment and day
  group_by(treatment, day) %>% 
  summarise(mean = mean(values)) %>% 
  # one line per treatment
  ggplot(aes(x=day, y=mean, color=treatment, group=treatment)) +
  geom_line()
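
One caveat for the full 24-day data: after pivoting, `day` is a character column, so ggplot orders the x axis alphabetically and "day 10" would land before "day 2". A minimal sketch of one way around this, assuming every measurement column is named "day <number>" as in the example, converts the label to an integer before plotting. The object names `means_long` and `p` are only illustrative.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Example data from the question (column names contain a space)
df <- data.frame(
  treatment = c("t1", "t1", "t2", "t1", "t3", "t2", "t2", "t3", "t3"),
  `day 1` = c(7.524814, 6.056334, 4.377818, 6.834753, 7.378768,
              4.104087, 4.520651, 7.875438, 7.803648),
  `day 2` = c(8.330983, 6.138648, 4.964445, 7.070450, 8.375725,
              4.942359, 4.775113, 8.543303, 8.232132),
  `day 3` = c(6.639391, 5.439239, 3.990593, 5.895462, 7.210010,
              3.589360, 3.753422, 8.101697, 7.073342),
  check.names = FALSE
)

means_long <- df %>%
  pivot_longer(cols = -treatment, names_to = "day", values_to = "values") %>%
  # Assumption: every column name has the form "day <number>"
  mutate(day = as.integer(sub("day ", "", day, fixed = TRUE))) %>%
  group_by(treatment, day) %>%
  summarise(mean = mean(values), .groups = "drop")

# Numeric x axis: days are placed in numeric order
p <- ggplot(means_long,
            aes(x = day, y = mean, color = treatment, group = treatment)) +
  geom_point() +
  geom_line()
p
```

With a numeric `day`, the axis is continuous, so unevenly spaced measurement days would also be drawn at their true distance from each other.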


Version 2

library(tidyverse)
df %>% 
  pivot_longer(
    cols = -treatment,
    names_to = "day",
    values_to = "values"
  ) %>% 
  group_by(day) %>% 
  summarise(mean = mean(values)) %>% 
  ggplot(aes(x=day, y=mean, group=1)) +
  geom_point() +
  geom_line(colour="red")


CodePudding user response:

ggplot likes data in "long" format. Here is one way to do this for the mean values; you can do the same for sd.

library(tidyverse)

df %>%
  pivot_longer(cols = -treatment) %>%
  group_by(treatment, name = factor(name, unique(name))) %>%
  summarise(value = mean(value), .groups = 'drop') %>%
  ggplot(aes(name, value, color = treatment, group = treatment)) +
  geom_line()
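
As a rough sketch of "the same for sd", the mean and the standard deviation can be computed in one `summarise()` call and the sd drawn as error bars. The column names `mean_value` and `sd_value`, the object name `summary_long`, and the choice of `geom_errorbar()` with mean ± 1 sd are assumptions made for illustration, not something the question asked for.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Example data from the question (column names contain a space)
df <- data.frame(
  treatment = c("t1", "t1", "t2", "t1", "t3", "t2", "t2", "t3", "t3"),
  `day 1` = c(7.524814, 6.056334, 4.377818, 6.834753, 7.378768,
              4.104087, 4.520651, 7.875438, 7.803648),
  `day 2` = c(8.330983, 6.138648, 4.964445, 7.070450, 8.375725,
              4.942359, 4.775113, 8.543303, 8.232132),
  `day 3` = c(6.639391, 5.439239, 3.990593, 5.895462, 7.210010,
              3.589360, 3.753422, 8.101697, 7.073342),
  check.names = FALSE
)

summary_long <- df %>%
  pivot_longer(cols = -treatment) %>%
  # factor(name, unique(name)) keeps the days in their original column order
  group_by(treatment, name = factor(name, unique(name))) %>%
  summarise(mean_value = mean(value),
            sd_value = sd(value),
            .groups = "drop")

# Lines and points for the means, error bars for mean +/- 1 sd
p <- ggplot(summary_long,
            aes(name, mean_value, color = treatment, group = treatment)) +
  geom_line() +
  geom_point() +
  geom_errorbar(aes(ymin = mean_value - sd_value,
                    ymax = mean_value + sd_value),
                width = 0.1)
p
```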

