I have this table
subs_id | amount | flag |
---|---|---|
1 | 15 | target |
1 | 10 | taker |
2 | 30 | target |
3 | 20 | taker |
3 | 10 | target |
I want to create a new table that does the following:
- calculate the total sum of the variable amount by each subs_id
- a column that just shows the value of the variable amount when the subs_id has the variable flag equal to "taker", and 0 otherwise.
The resulting table should look like this:
subs_id | ttl_amount | amount |
---|---|---|
1 | 25 | 10 |
2 | 30 | 0 |
3 | 30 | 20 |
Here is what I tried to get the result:
df%>%
group_by(subs_id)%>%
summarise(ttl_amount=sum(amt),
amount=case_when(flag=="taker"~amt[which(flag=="taker")[1]],TRUE~0))
This gives the following error: ! 'names' attribute [1] must be the same length as the vector [0]
Note that the solution should preferably be using summarise, as multiple other aggregation will happen in it.
CodePudding user response:
You could try:
df %>%
group_by(subs_id) %>%
summarise(
ttl_amount = sum(amount),
amount = coalesce(amount[flag == 'taker'][1L], 0L)
)
Output:
# A tibble: 3 x 3
subs_id ttl_amount amount
<int> <int> <int>
1 1 25 10
2 2 30 0
3 3 30 20
CodePudding user response:
You can multiply by a logical vector to get a conditional sum:
library(dplyr)
df %>%
group_by(subs_id) %>%
summarize(
ttl_amount = sum(amount),
amount = sum(amount * (flag == "taker"))
)
# A tibble: 3 × 3
subs_id ttl_amount amount
<dbl> <dbl> <dbl>
1 1 25 10
2 2 30 0
3 3 30 20
CodePudding user response:
Same as yours, but use left_join
afterwards to add the amount
column from the original
library(tidyverse)
t <- tibble(subs_id=c(1,1,2,3,3),
amount=c(15,10,30,20,10),
flag=c("target", "taker", "target", "taker", "target"))
t %>%
group_by(subs_id)%>%
summarise(ttl_amount=sum(amount)) %>%
left_join(select(filter(t, flag=="taker"), "subs_id", "amount", "flag"), by="subs_id") %>%
select(-flag) %>%
replace_na(list(amount=0))
yields:
subs_id ttl_amount amount
1 25 10
2 30 0
3 30 20
CodePudding user response:
Here is a simplified answer to the accepted answer:
library(tidyverse)
df |>
group_by(subs_id) |>
summarise(
ttl_amount = sum(amount),
amount= sum(amount[flag == "taker"]))
#> # A tibble: 3 x 3
#> subs_id ttl_amount amount
#> <dbl> <dbl> <dbl>
#> 1 1 25 10
#> 2 2 30 0
#> 3 3 30 20
CodePudding user response:
Here's how I would do it:
library(dplyr)
subs_id <- c(1, 1, 2, 3, 3)
amount <- c(15, 10, 30, 20, 10)
flag <- c("target", "taker", "target", "taker", "target")
data <- data.frame(subs_id, amount, flag)
data_new <- data %>% group_by(subs_id) %>%
mutate(column_summary = sum(amount))
data_new$column_id <- ifelse(data_new$flag == "taker", data_new$amount, 0)