I've got a dataframe with a column full of pixel coordinates. I want to remove the 'px' from these values so that I can make the entire column numeric without introducing NAs.
> print(data_exp_59965_v11_task_j84b$`X Coordinate`)
[1] NA NA NA NA NA NA NA NA NA
[10] NA NA NA NA NA NA NA NA NA
[19] NA NA "-401.222px" "401.222px" "-200.611px" "347.458px" "200.611px" "347.458px" "-200.611px"
[28] "-347.458px" "200.611px" "-347.458px" NA
CodePudding user response:
library(tidyverse)
data_exp_59965_v11_task_j84b %>%
  mutate(`X Coordinate` = as.numeric(str_replace_all(`X Coordinate`, "px$", "")))
Output
X Coordinate
1 -401.222
2 401.222
3 -200.611
4 NA
5 NA
Data
data_exp_59965_v11_task_j84b <- structure(list(`X Coordinate` = c("-401.222px", "401.222px",
"-200.611px", NA, NA)), class = "data.frame", row.names = c(NA,
-5L))
CodePudding user response:
You could use sub:
df$`X Coordinate` <- as.numeric(sub("px$", "", df$`X Coordinate`))
More generally, you might try:
df$`X Coordinate` <- as.numeric(sub(".*?(-?\\d+(?:\\.\\d+)?).*", "\\1", df$`X Coordinate`))
This option captures the first number in each string and discards everything else.
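Since the question is about not introducing NAs, one way to check the result is to compare the missing-value pattern before and after the conversion. Here is a minimal base-R sketch, assuming a small made-up vector x standing in for the real column:

```r
# Hypothetical sample standing in for the `X Coordinate` column
x <- c("-401.222px", "401.222px", NA, "347.458px")

# Strip a trailing "px" (the pattern is a regex, so fixed = TRUE is not used)
num <- as.numeric(sub("px$", "", x))

# TRUE when the only NAs in the result are the ones that were already there
identical(is.na(num), is.na(x))
```

If the comparison comes back FALSE, some value has a format other than a plain number followed by px, and it is worth inspecting x[is.na(num) & !is.na(x)].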
CodePudding user response:
This is a perfect use case for parse_number from readr (part of the tidyverse):
Data from @AndrewGB (many thanks)
library(dplyr)
library(readr)
data_exp_59965_v11_task_j84b %>%
  mutate(`X Coordinate` = parse_number(`X Coordinate`))
X Coordinate
1 -401.222
2 401.222
3 -200.611
4 NA
5 NA
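As a quick sanity check on a plain vector, the sketch below (using a few made-up values rather than the full data) confirms that the result is numeric and that no NAs appear beyond the ones already present:

```r
library(readr)

# Hypothetical sample values, not the asker's full column
x <- c("-401.222px", "200.611px", NA)

# parse_number() keeps the first number it finds and drops the surrounding text
num <- parse_number(x)

# Is the result numeric?
is.numeric(num)

# Does it have the same number of NAs as the input?
sum(is.na(num)) == sum(is.na(x))
```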