Species list: long to wide based on date and temperature average in R-CodePudding

I have a list of species with details on temperature and other variables that need to be in a wide format to compute analysis. I need to have one row per date and site, and all other variables, like temperature, just the average of the date but respecting the site. Find a diagram of what I need below

I have found ways to do it, but I can not find a way to get the other variables, for example, the temperature or cloud cover averaged by day. I hope someone can help me

CodePudding user response：

where do the numbers come from in the columns starting with sp?

CodePudding user response：

Replace the values in the columns you want to average with the averages before pivoting. Your example data doesn’t illustrate the problem of having varying temps within a site/date, so I modified the data a bit:

library(tidyverse)

tbl <- tibble(
  site = c("A", "A", "A", "B", "B", "B"),
  sp = c("sp1", "sp1", "sp2", "sp1", "sp3", "sp4"),
  day = c(1, 1, 2, 2, 2, 3),
  temp = c(17, 20, 16, 18, 18, 20)
)

tbl
#> # A tibble: 6 x 4
#>   site  sp      day  temp
#>   <chr> <chr> <dbl> <dbl>
#> 1 A     sp1       1    17
#> 2 A     sp1       1    20
#> 3 A     sp2       2    16
#> 4 B     sp1       2    18
#> 5 B     sp3       2    18
#> 6 B     sp4       3    20

And here’s the averaging step added:

tbl |>
  group_by(site, day) |>
  mutate(across(where(is.numeric), mean)) |>
  pivot_wider(
    names_from = sp,
    values_from = sp,
    values_fn = length,
    values_fill = 0L
  )
#> # A tibble: 4 x 7
#> # Groups:   site, day [4]
#>   site    day  temp   sp1   sp2   sp3   sp4
#>   <chr> <dbl> <dbl> <int> <int> <int> <int>
#> 1 A         1  18.5     2     0     0     0
#> 2 A         2  16       0     1     0     0
#> 3 B         2  18       1     0     1     0
#> 4 B         3  20       0     0     0     1