I have a dataset with different types of observations across several "transects". I'm still pretty new to R and struggling with the issue below.
I need to calculate the number of "nest" observations in each transect, but I am getting an error that makes me think I am not using the correct function. In the end, I want to create a new column called "nest_number" that holds, for every row, the count of observations equal to "nest" in that row's transect.
The data is in this format:
transect | observation |
---|---|
1A | nest |
1A | NA |
1A | nest |
1A | vocalization |
1A | NA |
2A | nest |
2A | NA |
... | ... |
Here is how I need the output to look:
transect | observation | nest_number |
---|---|---|
1A | nest | 2 |
1A | NA | 2 |
1A | nest | 2 |
1A | vocalization | 2 |
1A | NA | 2 |
2A | nest | 1 |
2A | NA | 1 |
... | ... | ... |
Here is the code I used:
dfNew <- df %>%
  group_by(transect) %>%
  mutate(number_nests = colSums(observation == "nest", na.rm = TRUE))
The error I get is:
'x' must be an array of at least two dimensions
The error occurred in group 1: transect = "1A".
Answer:
It should be `sum` and not `colSums`, because `colSums` expects a data.frame/matrix (an array of at least two dimensions), but here we are taking the sum of a logical vector (`observation == "nest"`).
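The difference can be seen outside of dplyr with a small base R sketch. The vector `obs` below is a hypothetical stand-in for the `observation` values of transect 1A; `tryCatch` is only used to capture the error message instead of stopping the script.

```r
# Hypothetical values for one transect (mirrors transect 1A in the question)
obs <- c("nest", NA, "nest", "vocalization", NA)

# Comparing against "nest" yields a one-dimensional logical vector (with NAs)
is_nest <- obs == "nest"

# sum() counts the TRUE values; na.rm = TRUE drops the NA comparisons
n_nests <- sum(is_nest, na.rm = TRUE)

# colSums() needs at least two dimensions, so it fails on a plain vector;
# capture the message rather than aborting
msg <- tryCatch(
  colSums(is_nest, na.rm = TRUE),
  error = function(e) conditionMessage(e)
)

print(n_nests)
print(msg)
```

Inside `group_by() %>% mutate()`, each group's `observation` column is passed as a vector like `obs`, which is why the error is raised for the first group.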
library(dplyr)
df %>%
  group_by(transect) %>%
  # count the rows in each transect whose observation is "nest"
  mutate(nest_number = sum(observation == "nest", na.rm = TRUE)) %>%
  ungroup()
Output:
# A tibble: 7 × 3
  transect observation  nest_number
  <chr>    <chr>              <int>
1 1A       nest                   2
2 1A       <NA>                   2
3 1A       nest                   2
4 1A       vocalization           2
5 1A       <NA>                   2
6 2A       nest                   1
7 2A       <NA>                   1
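For anyone who wants to reproduce this, here is a minimal self-contained sketch. The data frame is rebuilt by hand from the seven rows shown in the question (the elided `...` rows are not included), and the names `df_new` and `df_base` are only illustrative. It also shows a base R alternative using `ave`, as an optional approach for when dplyr is not available.

```r
library(dplyr)

# Sample data rebuilt from the rows shown in the question (assumption:
# both columns are character vectors)
df <- data.frame(
  transect    = c("1A", "1A", "1A", "1A", "1A", "2A", "2A"),
  observation = c("nest", NA, "nest", "vocalization", NA, "nest", NA),
  stringsAsFactors = FALSE
)

# dplyr: per-transect count of "nest" rows, repeated on every row of the group
df_new <- df %>%
  group_by(transect) %>%
  mutate(nest_number = sum(observation == "nest", na.rm = TRUE)) %>%
  ungroup()

# Base R alternative: %in% returns FALSE for NA, so no na.rm is needed;
# ave() applies sum() within each transect and recycles the result per row
df_base <- df
df_base$nest_number <- ave(
  as.integer(df_base$observation %in% "nest"),
  df_base$transect,
  FUN = sum
)

print(df_new)
print(df_base)
```

Both approaches keep one row per original observation; if one row per transect is wanted instead, `summarise` can be used in place of `mutate`.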