How to select range of unique character values in dplyr?

Time:09-09

Let's say I have the following data frame:

#### Library ####
library(tidyverse)

#### Data Frame ####
df <- data.frame(name = c("Paul","Paul","Rich","Rich",
                          "John","John","Frank","Frank"),
                 cookies = c(3,6,4,3,
                             4,5,6,4),
                 fish = c(2,5,3,3,
                          4,6,7,3))

Filtering values is normally pretty easy, like so:

df %>% 
  filter(cookies < 5,
         fish == 3)

Which gives the following output:

   name cookies fish
1  Rich       4    3
2  Rich       3    3
3 Frank       4    3

However, I'm having trouble figuring out how to select from a range of unique character values. Let's say I want to select only the first two unique names that appear in the data frame, which should be Paul and Rich. The only way I have found to filter for these two people is to name them explicitly:

df %>% 
  filter(name %in% c("Paul","Rich"))

Which gets me what I want:

  name cookies fish
1 Paul       3    2
2 Paul       6    5
3 Rich       4    3
4 Rich       3    3

However, when there are hundreds of names, typing them out is impractical. What is an easier way to select the first two unique names in the data frame?

CodePudding user response:

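I'm not entirely sure I understand what you're after, but do you mean this? unique(name) returns the distinct names in order of first appearance, so indexing it with [1:2] keeps the first two: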

library(dplyr)
df %>% filter(name %in% unique(name)[1:2])
#  name cookies fish
#1 Paul       3    2
#2 Paul       6    5
#3 Rich       4    3
#4 Rich       3    3
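
If the number of names may vary, the same idea can be sketched with head() instead of [1:2]. This is only an illustration, not part of the original answer: n, first_n, keep and middle are hypothetical names. One reason to prefer head() is that indexing past the end of a vector (for example [1:2] when there is only one unique name) pads the result with NA, while head() simply returns what is available.

```r
library(dplyr)

# Same data frame as in the question
df <- data.frame(name = c("Paul","Paul","Rich","Rich",
                          "John","John","Frank","Frank"),
                 cookies = c(3,6,4,3,
                             4,5,6,4),
                 fish = c(2,5,3,3,
                          4,6,7,3))

# Hypothetical parameter: how many leading unique names to keep
n <- 2

# unique() preserves order of first appearance;
# head() returns at most n values and never pads with NA
first_n <- df %>% filter(name %in% head(unique(name), n))

# A range other than the first n: here the 2nd and 3rd unique names
keep <- unique(df$name)[2:3]
middle <- df %>% filter(name %in% keep)
```

With the example data, first_n should contain the Paul and Rich rows, and middle the Rich and John rows.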