Count a string & sum them in a new column in R using dplyr?

Time:02-26

I have a dataset with different types of observations across several "transects". I am still pretty new to R and am struggling with the issue below.

I need to count the "nest" observations in each transect, but I am getting an error that makes me think I am not using the correct function. In the end, I want a new column called "nest_number" that holds, on every row, the number of observations equal to "nest" in that row's transect.

The data is in this format:

transect observation
1A nest
1A NA
1A nest
1A vocalization
1A NA
2A nest
2A NA
... ...

Here is how I need the output to look:

transect observation nest_number
1A nest 2
1A NA 2
1A nest 2
1A vocalization 2
1A NA 2
2A nest 1
2A NA 1
... ... ...

Here is the code I used:

dfNew <- df %>%
  group_by(transect) %>%
  mutate(number_nests = colSums(observation == "nest", na.rm = TRUE))

The error I get is:

'x' must be an array of at least two dimensions
The error occurred in group 1: transect = "1A".

Answer:

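Use sum, not colSums. colSums expects a data.frame or matrix (an object with at least two dimensions), but inside mutate the expression observation == "nest" is a one-dimensional logical vector, which is why the error complains about dimensions. sum counts the TRUE values in that vector, and na.rm = TRUE drops the NA results produced by rows where observation is NA.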

library(dplyr)
df %>% 
  group_by(transect) %>% 
  mutate(nest_number = sum(observation == "nest", na.rm = TRUE)) %>%
  ungroup()

-output

# A tibble: 7 × 3
  transect observation  nest_number
  <chr>    <chr>              <int>
1 1A       nest                   2
2 1A       <NA>                   2
3 1A       nest                   2
4 1A       vocalization           2
5 1A       <NA>                   2
6 2A       nest                   1
7 2A       <NA>                   1
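
To see why sum works where colSums fails, the steps can be tried outside the pipeline. The following is only a sketch: the data frame is rebuilt by hand from the seven rows shown in the question, and the names is_nest and nest_counts are made up for illustration. It also shows summarise as a possible variant when one row per transect is wanted instead of a repeated value on every row.

```r
library(dplyr)

# Sample data rebuilt by hand from the rows shown in the question (assumption:
# missing observations are real NA values, not the string "NA").
df <- tibble(
  transect    = c("1A", "1A", "1A", "1A", "1A", "2A", "2A"),
  observation = c("nest", NA, "nest", "vocalization", NA, "nest", NA)
)

# Comparing a character vector with "nest" gives a logical vector;
# rows where observation is NA compare to NA, not FALSE.
is_nest <- df$observation == "nest"

# sum() treats TRUE as 1; without na.rm = TRUE a single NA makes the result NA.
total_with_na    <- sum(is_nest)
total_without_na <- sum(is_nest, na.rm = TRUE)

# colSums() needs at least two dimensions, so a plain vector triggers the
# error quoted in the question. tryCatch() captures the message as a string.
colsums_message <- tryCatch(
  colSums(is_nest, na.rm = TRUE),
  error = function(e) conditionMessage(e)
)

# Variant: one row per transect instead of a value repeated on each row.
nest_counts <- df %>%
  group_by(transect) %>%
  summarise(nest_number = sum(observation == "nest", na.rm = TRUE),
            .groups = "drop")
```

Use mutate, as in the answer above, when the count has to sit next to every original row; the summarise variant is only useful if a compact per-transect table is enough.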