I am working with a data frame with more than 1000 rows, and I want to create a new variable based on part of a string in another variable.
This is a short version of the data. I want to extract the numbers from the "id" variable and create the "height" variable. The data frame should look something like this:
df<-data.frame(id=c("Necrosis_Char_cat_0.05m","Necrosis_Char_cat_0.1m",
"Necrosis_Char_cat_1.7m"),
height=c(0.05, 0.1, 1.7))
I tried to use this code:
df_new <- df%>%
mutate(height = as.numeric(str_replace(.$id, ".*(\\d)(\\d+)m.*", "\\1.\\2")))
But I get the following Warning message:
In eval(cols[[col]], .data, parent.frame()) : NAs introduced by coercion
In addition to the NAs, some values come out wrong: 0.05, for example, shows up as 0.5. I believe the issue is in how I am writing the pattern and/or the replacement in str_replace(). Any help is very much appreciated. Thank you.
CodePudding user response:
There are probably a bunch of ways to do this, but here are a few:
library(tidyverse)
df<-data.frame(id=c("Necrosis_Char_cat_0.05m","Necrosis_Char_cat_0.1m",
"Necrosis_Char_cat_1.7m"),
height=c(0.05, 0.1, 1.7))
#option 1
df |>
extract(id,
into = "new_height",
regex = ".*_(\\d+\\.\\d+)m",
remove = FALSE,
convert = TRUE)
#> id new_height height
#> 1 Necrosis_Char_cat_0.05m 0.05 0.05
#> 2 Necrosis_Char_cat_0.1m 0.10 0.10
#> 3 Necrosis_Char_cat_1.7m 1.70 1.70
#option 2
df |>
mutate(new_height = as.numeric(sub(".*_(\\d+\\.\\d+)m", "\\1", id)))
#> id height new_height
#> 1 Necrosis_Char_cat_0.05m 0.05 0.05
#> 2 Necrosis_Char_cat_0.1m 0.10 0.10
#> 3 Necrosis_Char_cat_1.7m 1.70 1.70
#option 3
df |>
mutate(new_height = as.numeric(str_extract(id, "\\d.*(?=m)")))
#> id height new_height
#> 1 Necrosis_Char_cat_0.05m 0.05 0.05
#> 2 Necrosis_Char_cat_0.1m 0.10 0.10
#> 3 Necrosis_Char_cat_1.7m 1.70 1.70
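As for why the original attempt misbehaves, here is a minimal sketch, assuming the pattern in the question was ".*(\\d)(\\d+)m.*". The greedy ".*" consumes as much of the string as it can, so the two capture groups only ever see the last two adjacent digits before "m", and ids without two adjacent digits there do not match at all. The vector name `id` and the anchored replacement pattern are just for illustration.

```r
library(stringr)

id <- c("Necrosis_Char_cat_0.05m",
        "Necrosis_Char_cat_0.1m",
        "Necrosis_Char_cat_1.7m")

# Pattern from the question: ".*" is greedy, so for "0.05m" the groups
# capture "0" and "5" (the last two digits), giving "0.5".
# "0.1m" and "1.7m" have no two adjacent digits before "m", so nothing
# matches and the string is returned unchanged.
old <- str_replace(id, ".*(\\d)(\\d+)m.*", "\\1.\\2")

# as.numeric() cannot convert the unchanged strings, hence the NAs
# (warning suppressed here only to keep the example quiet).
old_num <- suppressWarnings(as.numeric(old))

# Capturing the whole decimal number after the last underscore avoids both problems.
new_num <- as.numeric(str_replace(id, ".*_(\\d+\\.\\d+)m$", "\\1"))

print(old)
print(old_num)
print(new_num)
```

So the warning comes from the ids that never matched, and the 0.5 comes from the decimal point being rebuilt from two single digits instead of being captured directly.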
CodePudding user response:
library(dplyr)
library(readr)
df %>%
mutate(height2 = parse_number(id))
CodePudding user response:
library(dplyr)
library(readr)
df %>%
mutate(new_height = parse_number(id))
id height new_height
1 Necrosis_Char_cat_0.05m 0.05 0.05
2 Necrosis_Char_cat_0.1m 0.10 0.10
3 Necrosis_Char_cat_1.7m 1.70 1.70
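One thing to keep in mind with parse_number(): it returns the first number it finds in the string. That is fine for the ids shown in the question, but if the full data has ids with a digit earlier in the name, you would get that digit instead of the height. A small sketch, where the second id is a made-up example (not from the question's data) and the anchored sub() pattern is one possible alternative:

```r
library(readr)

# id in the same shape as the question's data
ok_id <- "Necrosis_Char_cat_0.05m"

# Hypothetical id with an extra digit before the height (assumption for illustration)
tricky_id <- "Necrosis2_Char_cat_0.05m"

h_ok     <- parse_number(ok_id)      # first number is the height
h_tricky <- parse_number(tricky_id)  # first number is the "2" in "Necrosis2"

# Anchoring on the last underscore and the trailing "m" targets the height only;
# "\\.?\\d*" also allows whole-number heights such as "2m".
h_regex <- as.numeric(sub(".*_(\\d+\\.?\\d*)m$", "\\1", tricky_id))

print(c(h_ok, h_tricky, h_regex))
```

If every id follows the pattern in the question, parse_number() is the shortest option; the regex version is only worth it when the names are less regular.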