I have the sales and cost data by models. The code below select TOP10 models by sales and all the rest are sum up in new category "Others" which is 11th row.
df <- data.frame (model = c("A","B","C","D","E","F","G","H","I","J","K","L","M","N"),
sale = c(100,300,140,456,345,456,456,780,40,560,560,456,350,500),
cost = c(1340,330,440,443,365,437,478,700,30,460,160,456,650,100))
#TOP10 by sale
order <- df %>%
type.convert(as.is = TRUE) %>%
mutate(pos = row_number(desc(sale)),
model = ifelse(pos>10, 'Others', model),
pos = ifelse(pos>10, 11, pos)) %>%
group_by(model, pos) %>%
summarise(cost= sum(cost), sale= sum(sale), .groups = 'drop') %>%
arrange(pos)
Output:
model pos cost sale
1 H 1 700 780
2 J 2 460 560
3 K 3 160 560
4 N 4 100 500
5 D 5 443 456
6 F 6 437 456
7 G 7 478 456
8 L 8 456 456
9 M 9 650 350
10 E 10 365 345
11 Others 11 2140 580
The sale of category A is put in "other" since it has low a sale (100) and is not in TOP10. Now, I want to include A in this TOP 10 in any case, no matter how much sale it has. So output should be TOP9 'A' 'Others':
Expected output:
model pos cost sale
1 H 1 700 780
2 J 2 460 560
3 K 3 160 560
4 N 4 100 500
5 D 5 443 456
6 F 6 437 456
7 G 7 478 456
8 L 8 456 456
9 M 9 650 350
10 A 10 100 1340
11 Others 11 2140 580
(Thus change must be done in the given code)
CodePudding user response:
One way could be making use of bind_rows
after removing the 10th line and adding only where model
== A:
library(tidyverse)
#TOP10 by sale
df %>%
type.convert(as.is = TRUE) %>%
mutate(pos = row_number(desc(sale)),
model = ifelse(pos>10, 'Others', model),
pos = ifelse(pos>10, 11, pos)) %>%
group_by(model, pos) %>%
summarise(cost= sum(cost), sale= sum(sale), .groups = 'drop') %>%
arrange(pos) %>%
slice(-10) %>%
bind_rows(df %>%
filter(model == "A")) %>%
mutate(pos = replace_na(pos, 10)) %>%
arrange(pos)
model pos cost sale
<chr> <dbl> <dbl> <dbl>
1 H 1 700 780
2 J 2 460 560
3 K 3 160 560
4 N 4 100 500
5 D 5 443 456
6 F 6 437 456
7 G 7 478 456
8 L 8 456 456
9 M 9 650 350
10 A 10 1340 100
11 Others 11 2140 580
CodePudding user response:
You can mutate in 2 steps, before grouping and manipulate the pos variable to fix the order. This solves the problem in the comments to the other answer.
order <- df %>%
type.convert(as.is = TRUE) %>%
mutate(pos = row_number(desc(sale))) %>%
mutate(pos = ifelse(model == "A" & pos > 10, 11, ifelse(pos > 10, 12, pos)),
model = ifelse(pos>11, 'Others', model)) %>%
group_by(model, pos) %>%
summarise(cost= sum(cost), sale= sum(sale), .groups = 'drop') %>%
arrange(pos)