Here is the dataset:
library(readr)
data <- read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-01-18/chocolate.csv")
Step 1: Find the company manufacturers that have at least 10 ratings. So I need to count how many ratings each company manufacturer has and keep only those with 10 or more ratings. This is what I tried:
data %>%
  group_by(company_manufacturer) %>%
  summarise(count(rating, na.rm=TRUE) >= 10)
Step 2: Add two more columns containing the mean rating and the standard deviation of the ratings for each company_manufacturer.
CodePudding user response:
Something like this?
library(dplyr)
data %>%
  group_by(company_manufacturer) %>%
  summarise(average_rating = mean(rating, na.rm = TRUE),
            sd_rating = sd(rating, na.rm = TRUE),
            n = n()) %>%
  filter(n >= 10)
   company_manufacturer         average_rating sd_rating     n
   <chr>                                 <dbl>     <dbl> <int>
 1 A. Morin                               3.42     0.417    26
 2 Altus aka Cao Artisan                  2.86     0.282    11
 3 Amedei                                 3.31     0.356    13
 4 Arete                                  3.53     0.322    32
 5 Artisan du Chocolat                    3.08     0.663    16
 6 Bittersweet Origins                    3.27     0.317    14
 7 Bonnat                                 3.47     0.560    30
 8 Brasstown aka It's Chocolate           3.55     0.292    11
 9 Cacao de Origen                        3.12     0.429    10
10 Castronovo                             3.38     0.436    19
# ... with 43 more rows
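If you would rather keep every row and add the two columns with mutate, as Step 2 of the question describes, a rough sketch on made-up data could look like the following. The toy data frame, the name toy_stats, and the lowered threshold min_ratings are assumptions for illustration only; with the real data you would use 10.

```r
library(dplyr)

# Hypothetical toy data standing in for the chocolate dataset
toy <- data.frame(
  company_manufacturer = c("A", "A", "A", "B", "B", "C"),
  rating = c(3, 3.5, 4, 2.5, 3.5, 3)
)

min_ratings <- 2  # assumption: the question uses 10; lowered for the toy data

toy_stats <- toy %>%
  group_by(company_manufacturer) %>%
  filter(n() >= min_ratings) %>%   # keep only manufacturers with enough ratings
  mutate(average_rating = mean(rating, na.rm = TRUE),
         sd_rating = sd(rating, na.rm = TRUE)) %>%
  ungroup()

toy_stats
```

Here filter(n() >= min_ratings) works per group, so manufacturer "C" (one rating) is dropped, and mutate repeats each group's mean and standard deviation on every remaining row.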
CodePudding user response:
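You could use ave to calculate an appropriate score for each manufacturer (here the mean rating divided by its standard deviation, set to 0 for manufacturers with fewer than 10 ratings) and then subset the dataset to the rows with the highest score.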
data <- transform(data, SCORE=ave(rating, company_manufacturer,
                                  FUN=\(x) if (length(x) < 10) 0 else mean(x)/sd(x)))
res <- subset(data, SCORE == max(SCORE))
res
# ref company_manufacturer company_location review_date country_of_bean_origin specific_bean_origin_or_bar_name cocoa_percent
# 1498 895 Marou Vietnam 2012 Vietnam Tien Giang 80%
# 1499 845 Marou Vietnam 2012 Vietnam Ba Ria 76%
# 1500 845 Marou Vietnam 2012 Vietnam Dong Nai 72%
# 1501 845 Marou Vietnam 2012 Vietnam Tien Giang, Gao Co-op 70%
# 1502 849 Marou Vietnam 2012 Vietnam Ben Tre 78%
# 1503 955 Marou Vietnam 2012 Vietnam Lam Dong 74%
# 1504 1149 Marou Vietnam 2013 Vietnam Tan Phu Dong, Treasure Island 75%
# 1505 1650 Marou Vietnam 2015 Vietnam Tan Phu Dong Island, Heart of Darkness 85%
# 1506 1650 Marou Vietnam 2015 Vietnam Ben Tre 68%
# 1507 1650 Marou Vietnam 2015 Vietnam Dak Lak, Batch 2451 70%
# 1508 2258 Marou Vietnam 2018 Vietnam Dak Nong, Tam Farm 75%
# ingredients most_memorable_characteristics rating SCORE
# 1498 3- B,S,C creamy, fatty, black pepper 3.00 18.40566
# 1499 3- B,S,C citrus 3.50 18.40566
# 1500 3- B,S,C roasty, coffee 3.50 18.40566
# 1501 3- B,S,C cocoa, spice, late sour 3.50 18.40566
# 1502 3- B,S,C sticky, pepper, cinamon, rich 3.50 18.40566
# 1503 3- B,S,C melon, roasty 3.50 18.40566
# 1504 3- B,S,C fatty, heavy roast, nutty 3.50 18.40566
# 1505 2- B,S long, mild honey, rich cocoa 3.25 18.40566
# 1506 2- B,S roasty, rich, coffee, nut 3.50 18.40566
# 1507 2- B,S spicy and fragrant 3.75 18.40566
# 1508 3- B,S,C fatty, roasty, nutty 3.50 18.40566
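In case it is unclear what ave is doing here, this is a small sketch on made-up data. The toy data frame and the column names grp, val, grp_mean and score are hypothetical, and the minimum group size is lowered from 10 to 3 so the effect shows up on five rows.

```r
# Hypothetical toy data; 'grp' and 'val' are illustrative names
toy <- data.frame(grp = c("a", "a", "a", "b", "b"),
                  val = c(1, 2, 3, 10, 20))

# ave() returns a vector as long as 'val': FUN is applied within each group
# and the result is repeated for every row of that group
toy$grp_mean <- ave(toy$val, toy$grp, FUN = mean)

# Same idea as SCORE above, with a minimum group size of 3 instead of 10:
# groups that are too small get 0, the others get mean / sd
toy$score <- ave(toy$val, toy$grp,
                 FUN = function(x) if (length(x) < 3) 0 else mean(x) / sd(x))

toy
```

Because ave keeps one value per original row, the result can be compared against max(SCORE) directly, which is why the subset above returns every row of the top-scoring manufacturer rather than a single summary row.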