How can I pull the most counted values for each level of a variable?

Time:11-09

Original dataset

I want to get only the most frequent value for each level of a variable. My code is below:

```
a <- format_separated %>%
  group_by(state, format) %>%
  summarise(total = n(),
            .groups = "drop") %>%
  arrange(desc(total))
```

State       Format        Total
California  Public radio     25
New York    Country          17
Ohio        Classical        14
New York    Public radio     12

(1015 entries)

But I only want the most frequent format for each state, like this:

State       Format        Total
California  Public radio     25
New York    Country          17
Ohio        Classical        14
Florida     Public radio     11

(46 entries)

The final dataset should contain each of the 50 US states exactly once, with no state repeated.

CodePudding user response:

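library(tidyverse)

# TidyTuesday radio station data (2022-11-08)
df <- read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-11-08/state_stations.csv")

df %>%
  count(state, format, sort = TRUE) %>%  # one row per state/format pair, largest counts first
  group_by(state) %>%
  slice_head(n = 1) %>%                  # first row per state = its most frequent format
  arrange(desc(n))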

# A tibble: 50 x 3
# Groups:   state [50]
   state      format      n
   <chr>      <chr>   <int>
 1 Texas      Country   148
 2 California Variety   116
 3 Kentucky   Country    71
 4 Tennessee  Country    68
 5 Missouri   Country    66
 6 Minnesota  Country    62
 7 Illinois   Country    59
 8 New_York   Country    52
 9 Arkansas   Country    51
10 Georgia    Country    51
# ... with 40 more rows
# i Use `print(n = ...)` to see more rows
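
The same idea can also be written with `slice_max()`, which picks the row with the largest count in each group without relying on a prior sort. Below is a minimal sketch, assuming a small toy tibble named `stations` in place of the real CSV; the rows and the names `stations` and `top_formats` are made up for illustration only.

```r
library(dplyr)

# Toy stand-in for state_stations.csv; these rows are invented for illustration
stations <- tibble(
  state  = c("Ohio", "Ohio", "Ohio", "Texas", "Texas", "Texas", "Texas"),
  format = c("Classical", "Classical", "Country",
             "Country", "Country", "Variety", "Variety")
)

top_formats <- stations %>%
  count(state, format) %>%                      # number of stations per state/format pair
  group_by(state) %>%
  slice_max(n, n = 1, with_ties = FALSE) %>%    # keep exactly one top row per state
  ungroup() %>%                                 # drop the grouping before further work
  arrange(desc(n))

top_formats
```

With the default `with_ties = TRUE`, a state whose top formats are tied (Texas in this toy data) would return more than one row, so `with_ties = FALSE` is what guarantees one row per state. The `ungroup()` call is optional, but it avoids carrying the grouping into later steps.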