For example, I have a data frame df as below, and I want to add new columns based on specific strings in name1:
df <- data.frame(name1 = c("Amydrium sp. 01", "Aporosa sp.", "Arecaceae sp. 02","Adenia macrophylla"))
name1
1 Amydrium sp. 01
2 Aporosa sp.
3 Arecaceae sp. 02
4 Adenia macrophylla
I would like to have 2 additional columns called family and genus.
I can make the family column by detecting the string ceae, i.e. df %>% mutate(family = case_when(str_detect(name1, "ceae") ~ name1)).
For the genus column, is there a syntax that detects sp., excludes ceae, and leaves NA for cells that hold a full species name (e.g. Adenia macrophylla), all in the same step?
Desired output:
name1 family genus
1 Amydrium sp. 01 NA Amydrium sp. 01
2 Aporosa sp. NA Aporosa sp.
3 Arecaceae sp. 02 Arecaceae sp. 02 NA
4 Adenia macrophylla NA NA
CodePudding user response:
Just do the opposite for genus: exclude ceae and also look for sp.:
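library(dplyr)
library(stringr)

df %>%
  mutate(
    family = case_when(str_detect(name1, "ceae") ~ name1),
    # fixed() matches "sp." literally; as a regex, "." would match any character
    genus = case_when(!str_detect(name1, "ceae") & str_detect(name1, fixed("sp.")) ~ name1)
  )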
Output:
name1 family genus
1 Amydrium sp. 01 <NA> Amydrium sp. 01
2 Aporosa sp. <NA> Aporosa sp.
3 Arecaceae sp. 02 Arecaceae sp. 02 <NA>
4 Adenia macrophylla <NA> <NA>
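One thing to watch with this kind of pattern: in a regular expression the dot matches any character, so a plain "sp." also matches names that merely contain "sp" followed by another letter. Below is a minimal, self-contained sketch of the same idea using literal matching, assuming dplyr and stringr are installed. The extra row "Asplenium nidus", the helper columns is_family and has_sp, and the use of if_else instead of case_when are my own additions for illustration, not part of the original question.

```r
library(dplyr)
library(stringr)

# Original data plus one hypothetical row that contains "sp" but not "sp."
df <- data.frame(
  name1 = c("Amydrium sp. 01", "Aporosa sp.", "Arecaceae sp. 02",
            "Adenia macrophylla", "Asplenium nidus"),
  stringsAsFactors = FALSE
)

res <- df %>%
  mutate(
    is_family = str_detect(name1, fixed("ceae")),  # literal match on "ceae"
    has_sp    = str_detect(name1, fixed("sp.")),   # literal match, dot is not a wildcard
    family    = if_else(is_family, name1, NA_character_),
    genus     = if_else(!is_family & has_sp, name1, NA_character_)
  ) %>%
  select(name1, family, genus)

print(res)

# Why the literal match matters:
str_detect("Asplenium nidus", "sp.")         # TRUE: regex "." matches the "l"
str_detect("Asplenium nidus", fixed("sp."))  # FALSE: no literal "sp." in the name
```

With the literal match, the extra row gets NA in both family and genus, just like the full species name Adenia macrophylla.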