I have the following dataframe, morphology:
  month site  depth num.core num.plant num.leaf
  <chr> <chr> <dbl>    <dbl>     <dbl>    <dbl>
1 Oct   SB       12        1         1        5
2 Oct   SB       12        1         2       29
3 Oct   SB       12        1         3        7
4 Oct   SB       12        2         1        9
5 Oct   SB       12        2         2        4
6 Oct   SB       12        2         3       13
My aim is to count the number of plants (num.plant) per core (num.core), for a given date (month) and depth.
I have grouped the dataframe and counted the number of plants per core as I need:
library(dplyr)  # provides %>%, group_by(), count()
morpho.group <- morphology %>%
  group_by(month, site, num.core, depth) %>%
  count(month, site, num.core, depth, name = "plant.count.Xcore")
  month site  num.core depth plant.count.Xcore
  <chr> <chr>    <dbl> <dbl>             <int>
1 Dec   D            1     3                 4
2 Dec   D            2     3                 2
3 Dec   D            3     3                 3
4 Dec   D            4     3                 3
5 Dec   N            1    12                 1
6 Dec   N            2    12                 5
My issue is that I need to perform further operations on the morphology dataframe, such as summing the number of leaves per core:
count.morpho <- morphology %>%
  group_by(month, site, num.core, depth) %>%
  summarise_at(vars("num.leaf", "num.roots"), sum)
  month site  num.core depth num.leaf num.roots
  <chr> <chr>    <dbl> <dbl>    <dbl>     <dbl>
1 Dec   D            1     3       11        13
2 Dec   D            2     3       17         8
3 Dec   D            3     3       14         4
4 Dec   D            4     3       40        10
5 Dec   N            1    12        3         2
6 Dec   N            2    12       40        10
I need to perform these operations in one continuous pipeline, so that the results end up in a single dataframe instead of each calculated column being pulled into a new dataframe.
Any help is much appreciated :)
CodePudding user response:
count is really just a convenience function for looking at n() per group; you can call n() explicitly inside summarize and add other metrics alongside it.
(FYI, your data doesn't include num.roots, so I replaced it with num.plant here just for demonstration.)
morphology %>%
  group_by(month, site, num.core, depth) %>%
  summarize(
    plant.count.Xcore = n(),
    across(c(num.leaf, num.plant), sum)
  ) %>%
  ungroup()
# # A tibble: 2 x 7
#   month site  num.core depth plant.count.Xcore num.leaf num.plant
#   <chr> <chr>    <int> <int>             <int>    <int>     <int>
# 1 Oct   SB           1    12                 3       41         6
# 2 Oct   SB           2    12                 3       26         6
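If you want to convince yourself that count is shorthand for a grouped n(), here is a minimal sketch that builds both versions and compares them. The six rows are retyped from the sample in the question; the names via_count and via_summarize are just illustrative.

```r
library(dplyr)

# Sample rows retyped from the question's `morphology` dataframe
morphology <- tibble::tibble(
  month     = "Oct",
  site      = "SB",
  depth     = 12,
  num.core  = c(1, 1, 1, 2, 2, 2),
  num.plant = c(1, 2, 3, 1, 2, 3),
  num.leaf  = c(5, 29, 7, 9, 4, 13)
)

# Version 1: count() does the grouping and counting in one call
via_count <- morphology %>%
  count(month, site, num.core, depth, name = "plant.count.Xcore")

# Version 2: explicit grouping, with n() inside summarize()
via_summarize <- morphology %>%
  group_by(month, site, num.core, depth) %>%
  summarize(plant.count.Xcore = n(), .groups = "drop")

# Compare the two results as plain data frames
same <- isTRUE(all.equal(as.data.frame(via_count), as.data.frame(via_summarize)))
print(same)
```

The second form is the one to extend, since summarize accepts further expressions next to n().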
FYI, summarize_at is "superseded" by across. Notice how the usage changes: call summarize as usual, and call across on its own, without assigning its result to a name. The first argument to across is the set of columns to operate on, chosen with the same methods as select, including c(col1, col2), starts_with("num"), and negations of those. The second argument is one or more functions, which can be given in much the same ways as summarize_at's function argument(s). See the colwise vignette for more details.
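As a sketch of the "one or more functions" part, the example below passes a named list of functions to across, so each selected column gets one output column per function. The sample rows are retyped from the question, and the list names total and mean are arbitrary labels chosen for this illustration; with across's default naming they become suffixes such as num.leaf_total.

```r
library(dplyr)

# Sample rows retyped from the question's `morphology` dataframe
morphology <- tibble::tibble(
  month     = "Oct",
  site      = "SB",
  depth     = 12,
  num.core  = c(1, 1, 1, 2, 2, 2),
  num.plant = c(1, 2, 3, 1, 2, 3),
  num.leaf  = c(5, 29, 7, 9, 4, 13)
)

result <- morphology %>%
  group_by(month, site, num.core, depth) %>%
  summarize(
    # number of rows (plants) in each core
    plant.count.Xcore = n(),
    # apply both functions to both columns; output columns are named
    # <column>_<list name>, e.g. num.leaf_total and num.leaf_mean
    across(c(num.leaf, num.plant), list(total = sum, mean = mean)),
    .groups = "drop"  # return an ungrouped tibble, like a trailing ungroup()
  )

print(result)
```

Everything stays in one pipeline and one dataframe, which is what the question asks for; more metrics can be added as further arguments to summarize.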