Filter variables with conditions from a dataset

Time:08-21

Here is the dataset:

library(tidyverse)  # attaches readr (read_csv) and dplyr (%>%, group_by, ...)
data <- read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-01-18/chocolate.csv")

Step 1: Find the company manufacturers that have at least 10 ratings. I need to count how many ratings each company manufacturer has and keep only those with 10 or more.

data %>% 
  group_by(company_manufacturer) %>% 
  summarise(count(rating, na.rm=TRUE) >= 10)

Step 2: Add two more columns holding the mean rating and the standard deviation of the ratings for each company_manufacturer.

CodePudding user response:

Something like this?

library(dplyr)

data %>% 
  group_by(company_manufacturer) %>% 
  summarise(average_rating = mean(rating, na.rm = TRUE),
            sd_rating = sd(rating, na.rm = TRUE),
            n = n()) %>% 
  filter(n >= 10)

   company_manufacturer         average_rating sd_rating     n
   <chr>                                 <dbl>     <dbl> <int>
 1 A. Morin                               3.42     0.417    26
 2 Altus aka Cao Artisan                  2.86     0.282    11
 3 Amedei                                 3.31     0.356    13
 4 Arete                                  3.53     0.322    32
 5 Artisan du Chocolat                    3.08     0.663    16
 6 Bittersweet Origins                    3.27     0.317    14
 7 Bonnat                                 3.47     0.560    30
 8 Brasstown aka It's Chocolate           3.55     0.292    11
 9 Cacao de Origen                        3.12     0.429    10
10 Castronovo                             3.38     0.436    19
# ... with 43 more rows
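
If the two statistics are wanted as extra columns on the original rows, as Step 2 of the question words it, rather than as one row per company, a grouped filter followed by mutate is one option. The following is only a sketch on a small made-up data frame (toy and its values are hypothetical, not taken from the chocolate data), and it counts non-missing ratings instead of rows:

```r
library(dplyr)

# Hypothetical toy data standing in for the chocolate dataset:
# company "A" has 12 ratings, company "B" only 3.
toy <- data.frame(
  company_manufacturer = rep(c("A", "B"), times = c(12, 3)),
  rating = c(rep(c(3, 3.5, 4), times = 4), 2.5, 3, 3.5)
)

result <- toy %>%
  group_by(company_manufacturer) %>%
  # keep only companies with at least 10 non-missing ratings
  filter(sum(!is.na(rating)) >= 10) %>%
  # add the group statistics as new columns, repeated on every row
  mutate(average_rating = mean(rating, na.rm = TRUE),
         sd_rating = sd(rating, na.rm = TRUE)) %>%
  ungroup()

result
```

Here every row of company "A" is kept and carries the same average_rating and sd_rating, while company "B" is dropped entirely.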

CodePudding user response:

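You could use ave to compute a per-company score, here the mean rating divided by its standard deviation and set to 0 for companies with fewer than 10 ratings, and then subset the dataset to the rows with the highest score.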

data <- transform(data, SCORE=ave(rating, company_manufacturer, 
                                  FUN=\(x) if (length(x) < 10) 0 else mean(x)/sd(x)))

res <- subset(data, SCORE == max(SCORE))
res
#  ref company_manufacturer company_location review_date country_of_bean_origin       specific_bean_origin_or_bar_name cocoa_percent
# 1498  895                Marou          Vietnam        2012                Vietnam                             Tien Giang           80%
# 1499  845                Marou          Vietnam        2012                Vietnam                                 Ba Ria           76%
# 1500  845                Marou          Vietnam        2012                Vietnam                               Dong Nai           72%
# 1501  845                Marou          Vietnam        2012                Vietnam                  Tien Giang, Gao Co-op           70%
# 1502  849                Marou          Vietnam        2012                Vietnam                                Ben Tre           78%
# 1503  955                Marou          Vietnam        2012                Vietnam                               Lam Dong           74%
# 1504 1149                Marou          Vietnam        2013                Vietnam          Tan Phu Dong, Treasure Island           75%
# 1505 1650                Marou          Vietnam        2015                Vietnam Tan Phu Dong Island, Heart of Darkness           85%
# 1506 1650                Marou          Vietnam        2015                Vietnam                                Ben Tre           68%
# 1507 1650                Marou          Vietnam        2015                Vietnam                    Dak Lak, Batch 2451           70%
# 1508 2258                Marou          Vietnam        2018                Vietnam                     Dak Nong, Tam Farm           75%
#      ingredients most_memorable_characteristics rating    SCORE
# 1498    3- B,S,C    creamy, fatty, black pepper   3.00 18.40566
# 1499    3- B,S,C                         citrus   3.50 18.40566
# 1500    3- B,S,C                 roasty, coffee   3.50 18.40566
# 1501    3- B,S,C        cocoa, spice, late sour   3.50 18.40566
# 1502    3- B,S,C  sticky, pepper, cinamon, rich   3.50 18.40566
# 1503    3- B,S,C                  melon, roasty   3.50 18.40566
# 1504    3- B,S,C      fatty, heavy roast, nutty   3.50 18.40566
# 1505      2- B,S   long, mild honey, rich cocoa   3.25 18.40566
# 1506      2- B,S      roasty, rich, coffee, nut   3.50 18.40566
# 1507      2- B,S             spicy and fragrant   3.75 18.40566
# 1508    3- B,S,C           fatty, roasty, nutty   3.50 18.40566
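
To make the ave step above easier to follow: ave applies a function within each group and returns a vector as long as the input, with the group result repeated on every row of that group. A minimal base R sketch on a made-up data frame (toy, company and the threshold of 3 are hypothetical and chosen only to keep the example small):

```r
# Hypothetical toy data: company "A" has 3 ratings, company "B" has 2
toy <- data.frame(
  company = c("A", "A", "A", "B", "B"),
  rating  = c(3, 4, 5, 2, 4)
)

# Per-company mean and count, repeated on each row of the company
toy$group_mean <- ave(toy$rating, toy$company, FUN = mean)
toy$group_n    <- ave(toy$rating, toy$company, FUN = length)

# Keep only rows of companies with at least 3 ratings
kept <- subset(toy, group_n >= 3)
kept
```

Because the result has one value per row, it can be used directly as a new column or inside subset, which is what the answer above relies on.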