I have a df consisting of two columns:
df <- data.frame(
  Date  = c("01-01-2016", "02-01-2022", "05-01-2022", "21-12-2022", "03-09-2021", "21-12-2017"),
  Value = c(14.2, 23.2, "bc", "bc", 78.2, "bc")
)
I want to count the occurrences of the word "bc" grouped by year, so I tried the following:
df2 <- df %>% group_by(Date) %>% summarise(length(grep("bc", Value)))
but this gives me the total number of occurrences of "bc" in the entire df, which is 3.
What I want is:
**Expected output**
Date | bc_total |
---|---|
2022 | 2 |
2017 | 1 |
CodePudding user response:
library(dplyr)  # the .by argument requires dplyr >= 1.1.0
library(lubridate)
df %>%
  mutate(Date = year(dmy(Date))) %>%
  summarise(bc_total = sum(Value == "bc"), .by = Date) %>%
  filter(bc_total != 0)
#   Date bc_total
# 1 2022        2
# 2 2017        1
Or
df %>%
  mutate(Date = year(dmy(Date))) %>%
  filter(Value == "bc") %>%
  count(Date)
CodePudding user response:
You can use ifelse to flag each row that contains "bc" with a 1, then group by year and sum the flags:
library(dplyr)
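# Flag rows whose Value is "bc" (1) versus everything else (0)
df$bc_count <- ifelse(df$Value == "bc", 1, 0)
# Extract the year from the dd-mm-yyyy strings and sum the flags per year
df2 <- df %>%
  group_by(Year = format(as.Date(Date, "%d-%m-%Y"), "%Y")) %>%
  summarize(bc_total = sum(bc_count))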
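Note: the Date column holds strings, so convert it with as.Date and the matching format ("%d-%m-%Y") before extracting the year. Years without any "bc" show up with a bc_total of 0; add filter(bc_total != 0) if you want to drop them.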
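To make the point about the format string concrete, here is a minimal base R sketch of just the conversion step. The `dates` vector is a made-up sample in the same layout as the question's Date column, not the original data.

```r
# Hypothetical sample in the same dd-mm-yyyy layout as the question's Date column
dates <- c("01-01-2016", "21-12-2022", "21-12-2017")

# Parse with an explicit format so day, month and year are read in the right order
parsed <- as.Date(dates, format = "%d-%m-%Y")

# Pull out the year as a character string, ready to use as a grouping key
years <- format(parsed, "%Y")
years
# [1] "2016" "2022" "2017"
```

Without the format argument, as.Date tries ISO-style layouts such as year-month-day first, so dd-mm-yyyy strings may be misread rather than rejected. Passing the format explicitly avoids that.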
CodePudding user response:
Code
library(dplyr)
library(lubridate)
df <- data.frame(Date = c("01-01-2016","02-01-2022","05-01-2022", "21-12-2022","03-09-2021", "21-12-2017"),
Value = c(14.2, 23.2, "bc", "bc", 78.2, "bc" ))
df %>%
  mutate(Year = year(dmy(Date))) %>%
  group_by(Year, Value) %>%
  summarise(Count = n()) %>%
  as.data.frame()
Output
  Year Value Count
1 2016  14.2     1
2 2017    bc     1
3 2021  78.2     1
4 2022  23.2     1
5 2022    bc     2
hope this helps :)