Specifically, I have an untidy data.frame with subspecies and varieties in separate columns, like this:
# Data
Genus <- c("Metrosideros", "Gahnia", "Acacia")
Species <- c("polymorpha", "aspera", "koa")
Subspecies <- c("", "globosa", "")
Variety <- c("glaberrima", "", "")
df <- data.frame(Genus, Species, Subspecies, Variety)
But I want a new column that looks like this:
df$Sciname<- c("Metrosideros polymorpha var. glaberrima",
"Gahnia aspera subsp. globosa",
"Acacia koa")
There is probably a clever solution using paste() and ifelse(), but I cannot figure it out. A tidyverse (dplyr) solution is welcome too. Thanks for any help!
CodePudding user response:
You can get there with paste() and a little bit of indexing.
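with(df, paste(
  Genus,
  Species,
  # (x != "") + 1 is 1 for an empty string and 2 otherwise,
  # so the label is only picked when the rank is present
  c("", "subsp.")[(Subspecies != "") + 1],
  Subspecies,
  c("", "var.")[(Variety != "") + 1],
  Variety
))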
[1] "Metrosideros polymorpha   var. glaberrima" "Gahnia aspera subsp. globosa  " "Acacia koa    "
You can use stringr::str_squish() on the result to get rid of the unwanted spaces, which gives:
[1] "Metrosideros polymorpha var. glaberrima" "Gahnia aspera subsp. globosa" "Acacia koa"
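If you would rather avoid the stringr dependency, here is a minimal sketch of the same idea in base R only, assigning the result to the new column. The helper name rank_label is made up for illustration, and the gsub()/trimws() pair is assumed to be an adequate stand-in for str_squish() on this kind of data:

```r
# Same data as in the question
df <- data.frame(
  Genus      = c("Metrosideros", "Gahnia", "Acacia"),
  Species    = c("polymorpha", "aspera", "koa"),
  Subspecies = c("", "globosa", ""),
  Variety    = c("glaberrima", "", "")
)

# Hypothetical helper: (x != "") is FALSE/TRUE, so adding 1 gives index 1/2,
# which selects "" for an empty rank and the label otherwise
rank_label <- function(x, label) c("", label)[(x != "") + 1]

raw <- with(df, paste(
  Genus, Species,
  rank_label(Subspecies, "subsp."), Subspecies,
  rank_label(Variety, "var."), Variety
))

# Base-R substitute for str_squish(): collapse runs of whitespace, then trim the ends
df$Sciname <- trimws(gsub("\\s+", " ", raw))
```

Wrapping the indexing in a small function keeps the paste() call readable if more ranks (for example a form) are added later.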
CodePudding user response:
Here's another option with tidyverse: add the rank labels to the Subspecies and Variety columns, then use unite() to combine all the columns into Sciname. Finally, clean up the whitespace in the Sciname column and bind it back onto the original data frame.
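library(tidyverse)

df %>%
  # Prefix each non-empty rank with its label
  mutate(Subspecies = ifelse(Subspecies != "", paste0("subsp. ", Subspecies), Subspecies),
         Variety = ifelse(Variety != "", paste0("var. ", Variety), Variety)) %>%
  unite("Sciname", Genus:Variety, sep = " ", remove = FALSE, na.rm = TRUE) %>%
  select(Sciname) %>%
  # na.rm only drops NA, not "", so an empty rank leaves extra spaces;
  # str_squish() removes them everywhere, trimws() would only trim the ends
  mutate(Sciname = str_squish(Sciname)) %>%
  bind_cols(df, .)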
Output
         Genus    Species Subspecies    Variety                                 Sciname
1 Metrosideros polymorpha            glaberrima Metrosideros polymorpha var. glaberrima
2       Gahnia     aspera    globosa                       Gahnia aspera subsp. globosa
3       Acacia        koa                                                    Acacia koa
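Since the question mentions paste() and ifelse(), here is a rough sketch of that route as well. It is written in base R so it runs without extra packages, but the same paste0() expression should also work inside a dplyr mutate() call. Each optional rank contributes either " label name" or an empty string, so there are no stray spaces to clean up afterwards:

```r
# Same data as in the question
df <- data.frame(
  Genus      = c("Metrosideros", "Gahnia", "Acacia"),
  Species    = c("polymorpha", "aspera", "koa"),
  Subspecies = c("", "globosa", ""),
  Variety    = c("glaberrima", "", "")
)

# The leading space lives inside each optional piece,
# so an empty rank adds nothing at all to the name
df$Sciname <- with(df, paste0(
  Genus, " ", Species,
  ifelse(Subspecies != "", paste0(" subsp. ", Subspecies), ""),
  ifelse(Variety != "", paste0(" var. ", Variety), "")
))
```

This assumes missing ranks are stored as empty strings, as in the example data; if they can be NA, the conditions would need an is.na() check as well.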