I have a data.frame such as:
data = data.frame(plot = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
                  family = c("Fab", "Fab", "Fab", "Pip", "Fab", "Mel", "Myr", "Myr", "Fab"),
                  species = c("Fab", "Fab", "sp 1", "sp2", "Fab", "sp3", "sp4", "sp5", "sp1"))
What I'm trying to do is this: if the character values in the family and species columns match within a row, keep the name in family and set the corresponding species cell to NA. I tried a loop, but that doesn't seem like a good way to do this since my data is pretty big...
CodePudding user response:
Using base R, you can assign NA to the species column after filtering for your use case:
data <- data.frame(plot = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
                   family = c("Fab", "Fab", "Fab", "Pip", "Fab", "Mel", "Myr", "Myr", "Fab"),
                   species = c("Fab", "Fab", "sp 1", "sp2", "Fab", "sp3", "sp4", "sp5", "sp1"),
                   stringsAsFactors = FALSE)
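# Set species to NA wherever it matches family (vectorized, no loop).
# Indexing the column directly also works when no rows match.
data$species[data$family == data$species] <- NA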
data
#>   plot family species
#> 1    1    Fab    <NA>
#> 2    1    Fab    <NA>
#> 3    1    Fab    sp 1
#> 4    2    Pip     sp2
#> 5    2    Fab    <NA>
#> 6    3    Mel     sp3
#> 7    3    Myr     sp4
#> 8    3    Myr     sp5
#> 9    3    Fab     sp1
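If either column may contain missing values, the comparison returns NA for those rows. A minimal sketch of an NA-tolerant variant, assuming the same column names as in the question; the helper name blank_matching_species and the small toy data frame are hypothetical, added only for illustration:

```r
# Hypothetical helper: set species to NA wherever it equals family.
blank_matching_species <- function(d) {
  # which() keeps only positions where the comparison is TRUE,
  # dropping rows where family or species is already NA
  same <- which(d$family == d$species)
  d$species[same] <- NA
  d
}

# Toy data with a missing family value in row 3 (illustrative only)
toy <- data.frame(plot = c(1, 1, 2),
                  family = c("Fab", "Fab", NA),
                  species = c("Fab", "sp 1", "sp2"),
                  stringsAsFactors = FALSE)

result <- blank_matching_species(toy)
result$species
```

Because the comparison and the assignment are both vectorized, this should scale to large data far better than a row-by-row loop.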
CodePudding user response:
library(tidyverse)
# df is the question's data converted to a tibble
df <- as_tibble(data)
df %>%
  mutate(species = case_when(species == family ~ NA_character_,
                             TRUE ~ species))
# A tibble: 9 × 3
   plot family species
  <dbl> <chr>  <chr>
1     1 Fab    NA
2     1 Fab    NA
3     1 Fab    sp 1
4     2 Pip    sp2
5     2 Fab    NA
6     3 Mel    sp3
7     3 Myr    sp4
8     3 Myr    sp5
9     3 Fab    sp1
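For this particular pattern, dplyr's na_if() may express the same replacement more compactly than case_when(). A minimal sketch, assuming dplyr is installed and that df is a tibble holding the question's data (rebuilt here so the block runs on its own):

```r
library(dplyr)

# The question's data as a tibble
df <- tibble(plot = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
             family = c("Fab", "Fab", "Fab", "Pip", "Fab", "Mel", "Myr", "Myr", "Fab"),
             species = c("Fab", "Fab", "sp 1", "sp2", "Fab", "sp3", "sp4", "sp5", "sp1"))

# na_if(x, y) returns x with NA wherever x equals y;
# here y is a column of the same length, so rows are compared pairwise
result <- df %>%
  mutate(species = na_if(species, family))
```

The family column is left untouched; only matching species cells become NA.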