Background
I've got a dataframe d:
d <- data.frame(count_events = c(0, 1, 2, 3),
                count_people = c(123, 56, 2, 8),
                stringsAsFactors = FALSE)
> d
  count_events count_people
1            0          123
2            1           56
3            2            2
4            3            8
The problem
I'd like a way to collapse rows 3 and 4 of d into one row that says something like "2 or more" in count_events and sums 2 and 8 into 10 for count_people. In other words, something like this:
> d
  count_events count_people
1            0          123
2            1           56
3          >=2           10
What I've tried
I can do what I want with rbind, like so:
d <- d[-c(3:4),]
new_row <- c('2 or more', 10)
d <- rbind(d, new_row)
> d
  count_events count_people
1            0          123
2            1           56
3    2 or more           10
But I'm wondering if there's something faster or more elegant. The "real" dataframe I'm trying to do this on has many rows, and many rows I'd like to collapse. It's doable, but it'll take me a decent chunk of time and I thought to myself "is there a dedicated function for something like this?". Thanks.
CodePudding user response:
You could do:
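library(dplyr, warn.conflicts = FALSE)

d |>
  # Relabel by value rather than by row position, so row order doesn't matter
  mutate(count_events = ifelse(count_events >= 2, "2 or more", count_events)) |>
  # Sum count_people within each label (the result column is named n)
  count(count_events, wt = count_people)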
#>   count_events   n
#> 1            0 123
#> 2            1  56
#> 3    2 or more  10
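Since the real dataframe has many rows to collapse, the same idea may extend to several bins at once by labelling with cut() before counting. This is only a sketch: d2, its values, the break points, and the labels are all made up for illustration, and name = "count_people" is used just to keep the original column name.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical larger example; these numbers are invented for illustration.
d2 <- data.frame(count_events = 0:6,
                 count_people = c(123, 56, 2, 8, 4, 1, 3))

binned <- d2 |>
  # cut() uses right-closed intervals by default:
  # (-Inf, 0], (0, 1], (1, 3], (3, Inf]
  mutate(count_events = cut(count_events,
                            breaks = c(-Inf, 0, 1, 3, Inf),
                            labels = c("0", "1", "2-3", "4 or more"))) |>
  # Sum count_people within each bin
  count(count_events, wt = count_people, name = "count_people")

binned
```

Adjusting breaks and labels should be enough to change which rows get collapsed together, without touching individual rows.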
CodePudding user response:
A data.table equivalent of the dplyr answer above:
library(data.table)
# setDT() converts d to a data.table by reference
setDT(d)[, .(count_people = sum(count_people)),
         keyby = .(count_events =
                     fifelse(count_events >= 2, "2 or more", as.character(count_events)))]
#    count_events count_people
# 1:            0          123
# 2:            1           56
# 3:    2 or more           10
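If adding a package isn't an option, roughly the same result should be possible in base R by relabelling and then summing with aggregate(). This is a minimal sketch on the question's data, not a dedicated function; the name collapsed is just an illustrative choice.

```r
# Rebuild the question's data so the example stands on its own
d <- data.frame(count_events = c(0, 1, 2, 3),
                count_people = c(123, 56, 2, 8),
                stringsAsFactors = FALSE)

# Relabel every count of 2 or more; keep the others as character labels
d$count_events <- ifelse(d$count_events >= 2, "2 or more",
                         as.character(d$count_events))

# Sum count_people within each label
collapsed <- aggregate(count_people ~ count_events, data = d, FUN = sum)
collapsed
#   count_events count_people
# 1            0          123
# 2            1           56
# 3    2 or more           10
```

Unlike the rbind approach in the question, count_people stays numeric here.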