Home > Blockchain >  Species list: long to wide based on date and temperature average in R
Species list: long to wide based on date and temperature average in R

Time:10-28

I have a list of species with details on temperature and other variables that need to be in a wide format to compute analysis. I need to have one row per date and site, and all other variables, like temperature, just the average of the date but respecting the site. Find a diagram of what I need belowenter image description here

I have found ways to do it, but I can not find a way to get the other variables, for example, the temperature or cloud cover averaged by day. I hope someone can help me

CodePudding user response:

where do the numbers come from in the columns starting with sp?

CodePudding user response:

Replace the values in the columns you want to average with the averages before pivoting. Your example data doesn’t illustrate the problem of having varying temps within a site/date, so I modified the data a bit:

library(tidyverse)

tbl <- tibble(
  site = c("A", "A", "A", "B", "B", "B"),
  sp = c("sp1", "sp1", "sp2", "sp1", "sp3", "sp4"),
  day = c(1, 1, 2, 2, 2, 3),
  temp = c(17, 20, 16, 18, 18, 20)
)

tbl
#> # A tibble: 6 x 4
#>   site  sp      day  temp
#>   <chr> <chr> <dbl> <dbl>
#> 1 A     sp1       1    17
#> 2 A     sp1       1    20
#> 3 A     sp2       2    16
#> 4 B     sp1       2    18
#> 5 B     sp3       2    18
#> 6 B     sp4       3    20

And here’s the averaging step added:

tbl |>
  group_by(site, day) |>
  mutate(across(where(is.numeric), mean)) |>
  pivot_wider(
    names_from = sp,
    values_from = sp,
    values_fn = length,
    values_fill = 0L
  )
#> # A tibble: 4 x 7
#> # Groups:   site, day [4]
#>   site    day  temp   sp1   sp2   sp3   sp4
#>   <chr> <dbl> <dbl> <int> <int> <int> <int>
#> 1 A         1  18.5     2     0     0     0
#> 2 A         2  16       0     1     0     0
#> 3 B         2  18       1     0     1     0
#> 4 B         3  20       0     0     0     1
  •  Tags:  
  • r
  • Related