I have a dataset of food items, and I need a function that collapses the different names of one fruit into a single name. For example, the dataset has Red Apple, Dried Apple, and Green Apple, and I want all of these renamed to Apple. There are many different types of food. I am new to R. I need a function that renames a value to Apple if it contains Apple.
CodePudding user response:
It would be easier to help if you provided some reproducible data. You can use gsub with a pattern that removes all the characters up to and including the last space, turning for example "Red Apple" into "Apple". You can use the following code:
items <- c("Red Apple", "Dried Apple", "Green Apple")
gsub(".* ", "", items)
Output:
[1] "Apple" "Apple" "Apple"
As you can see, it returns "Apple" for every item.
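One caveat with the last-word pattern: it keeps whatever follows the final space, so it only works when the fruit name is the last word. A minimal sketch of the rule as stated in the question ("rename to Apple if it has Apple in it"), using base R's grepl; the "Apple Juice" and "Pear" entries are made-up test values, not from the original data:

```r
# Hypothetical sample values; "Apple Juice" and "Pear" are added for illustration
items <- c("Red Apple", "Dried Apple", "Apple Juice", "Pear")

# The last-word pattern keeps only the text after the final space
last_word <- gsub(".* ", "", items)

# grepl() flags every element that contains the literal text "Apple";
# fixed = TRUE treats the pattern as plain text rather than a regex
renamed <- items
renamed[grepl("Apple", renamed, fixed = TRUE)] <- "Apple"

print(last_word)
print(renamed)
```

Here last_word turns "Apple Juice" into "Juice", while renamed maps it to "Apple" and leaves "Pear" untouched.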
CodePudding user response:
Is this what you're looking for?
dat <- data.frame(fruit=c("Red Apple", "Dried Apple", "Green Apple",
                          "Orange", "Dried Orange", "Pink Grapefruit",
                          "White Grapefruit", "Pear"))
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(stringr)
dat <- dat %>%
  mutate(fruit_new = case_when(
    str_detect(fruit, "Apple") ~ "Apple",
    str_detect(fruit, "Orange") ~ "Orange",
    str_detect(fruit, "Grapefruit") ~ "Grapefruit",
    TRUE ~ fruit))
dat
#>              fruit  fruit_new
#> 1        Red Apple      Apple
#> 2      Dried Apple      Apple
#> 3      Green Apple      Apple
#> 4           Orange     Orange
#> 5     Dried Orange     Orange
#> 6  Pink Grapefruit Grapefruit
#> 7 White Grapefruit Grapefruit
#> 8             Pear       Pear
Created on 2022-04-23 by the reprex package (v2.0.1)
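Since the question mentions many different foods, writing one case_when branch per food may get long. A possible sketch of the same idea as a reusable base R function that takes the list of target names as an argument; the function name standardize_food and its keywords parameter are hypothetical, not from any package:

```r
# Hypothetical helper: replace each value with the first keyword it contains
standardize_food <- function(x, keywords) {
  out <- x
  # Walk the keywords in reverse so that earlier keywords overwrite later
  # ones, which mirrors the first-match-wins behaviour of case_when()
  for (kw in rev(keywords)) {
    out[grepl(kw, x, fixed = TRUE)] <- kw
  }
  out
}

fruit <- c("Red Apple", "Dried Apple", "Green Apple",
           "Orange", "Dried Orange", "Pink Grapefruit",
           "White Grapefruit", "Pear")

fruit_new <- standardize_food(fruit, c("Apple", "Orange", "Grapefruit"))
print(fruit_new)
```

Values that match no keyword, such as "Pear", are returned unchanged. Matching is case-sensitive, and the order of keywords matters when one name contains another (for example "Grape" and "Grapefruit").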
CodePudding user response:
A possible solution:
x <- c("Red Apple", "Dried Apple", "Green Apple", "Purple Grapes", "Green Grapes", "Banana")
sub(".*(Apple|Grapes).*", "\\1", x)
# [1] "Apple" "Apple" "Apple" "Grapes" "Grapes" "Banana"
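With many foods, the alternation inside the parentheses does not have to be typed by hand. A small sketch, assuming the target names are kept in a character vector (the name keywords is made up here) and contain no regex special characters:

```r
# Hypothetical vector of target names; extend it with more foods as needed
keywords <- c("Apple", "Grapes")

# Builds ".*(Apple|Grapes).*" from the vector
pattern <- paste0(".*(", paste(keywords, collapse = "|"), ").*")

x <- c("Red Apple", "Dried Apple", "Green Apple", "Purple Grapes", "Green Grapes", "Banana")

# "\\1" keeps only the captured keyword; strings with no match are left as they are
result <- sub(pattern, "\\1", x)
print(result)
```

If a name could contain characters such as "." or "(", it would need escaping before being pasted into the pattern.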