Select values containing a specific string into new columns

Time:10-13

For example, I have a data frame df as below, and I want to create new columns based on whether each value contains a specific string.

df <- data.frame(name1 = c("Amydrium sp. 01", "Aporosa sp.", "Arecaceae sp. 02","Adenia macrophylla"))
               name1
1    Amydrium sp. 01
2        Aporosa sp.
3   Arecaceae sp. 02
4 Adenia macrophylla

I would like to add two columns, called family and genus.

I can create the family column by detecting the string "ceae" (i.e. df %>% mutate(family = case_when(str_detect(name1, "ceae") ~ name1))).

For the genus column, is there a syntax that detects "sp." while excluding names that contain "ceae"? Cells holding a full species name, e.g. Adenia macrophylla, should not be picked up by either column.

Desired output:

  name1               family            genus
1 Amydrium sp. 01     NA                Amydrium sp. 01
2 Aporosa sp.         NA                Aporosa sp.
3 Arecaceae sp. 02    Arecaceae sp. 02  NA
4 Adenia macrophylla  NA                NA

CodePudding user response:

For genus, negate the family condition and additionally require that the name contains "sp.":

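library(dplyr)
library(stringr)

df %>%
  mutate(
      family = case_when(str_detect(name1, "ceae") ~ name1),
      # fixed("sp.") matches a literal "sp."; as a regex, "." would match any character
      genus  = case_when(!str_detect(name1, "ceae") & str_detect(name1, fixed("sp.")) ~ name1)
  )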

Output:

               name1           family           genus
1    Amydrium sp. 01             <NA> Amydrium sp. 01
2        Aporosa sp.             <NA>     Aporosa sp.
3   Arecaceae sp. 02 Arecaceae sp. 02            <NA>
4 Adenia macrophylla             <NA>            <NA>
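
One caveat worth checking on real data: str_detect() treats its pattern as a regular expression by default, so in "sp." the dot matches any character, not only a literal period. The sketch below is an illustration under assumptions, not part of the original answer: the row "Aspidistra elatior", the data frame df2, and the helper columns is_family, has_sp and genus_regex are hypothetical, added only to show the difference between a regex match and a literal match with fixed().

```r
library(dplyr)
library(stringr)

# Hypothetical data: "Aspidistra elatior" is a full species name that
# happens to contain "spi", which the regex "sp." matches.
df2 <- data.frame(name1 = c("Amydrium sp. 01", "Arecaceae sp. 02",
                            "Adenia macrophylla", "Aspidistra elatior"))

res <- df2 %>%
  mutate(
    is_family   = str_detect(name1, "ceae"),
    has_sp      = str_detect(name1, fixed("sp.")),  # literal "sp." only
    family      = if_else(is_family, name1, NA_character_),
    genus       = if_else(!is_family & has_sp, name1, NA_character_),
    # Same rule with a plain regex pattern, kept for comparison
    genus_regex = if_else(!is_family & str_detect(name1, "sp."),
                          name1, NA_character_)
  ) %>%
  select(name1, family, genus, genus_regex)

print(res)
```

With the literal match, the genus column stays NA for "Aspidistra elatior", while genus_regex picks it up. An escaped pattern such as "sp\\." should behave like fixed("sp.") here.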