My dataframe looks something like df
, and I would like to replace the values in X1 using a conversion table replacement_table
. If values in X1 match the values in replacement_table$coordinates
, then that value would be replaced with the corresponding value in replacement_table$location
replacement_table <- data.frame (location = c("locationA", "locationB", "locationC"),
coordinates = c("42.8 -70.89", "42.85 -72.21", "31.28 -72.14"))
df <- data.frame(X1 = c("42.8 -70.89", "42.8 -70.89", "31.28 -72.14", "31.28 -72.14", "42.85 -72.21"),
X2 = c("23.2 -81.32", "42.8 -70.89", "31.28 -72.14", "23.2 -81.32", "42.8 -70.89"),
X3 = c("19371", "0", "0", "19278", "28912"))
The output would look something like this:
df <- data.frame(X1 = c("locationA", "locationA", "locationC", "locationC", "locationB"),
X2 = c("23.2 -81.32", "42.8 -70.89", "31.28 -72.14", "23.2 -81.32", "42.8 -70.89"),
X3 = c("19371", "0", "0", "19278", "28912"))
CodePudding user response:
You should join both dataframes to make it work. First you join using the left_join function after changing the name of the column X1 to coordinates:
library(tidyverse)
df = df %>%
dplyr::rename('coordinates' = 'X1') %>%
dplyr::left_join(replacement_table, by = 'coordinates')
The new df will now have data frame location and coordinates. Now we can apply a "case_when" (works as a if else) function to transform the coordinates in location if the location exists:
df = df %>%
dplyr::mutate(coordinates = case_when(!is.na(location) ~ location))
To transform back the column name to X1 you rename again the column coordinates and remove the column location.
df = df %>%
dplyr::rename('X1' = 'coordinates') %>%
dplyr::select(-location)
CodePudding user response:
What about using stringr::str_replace_all()
?
You need to use a replacement vector, rather than a table, but the difference seems to minor. Also, you need to wrap the replacement vector in stringr::str_fixed()
to prevent issues with special characters such as " ".
df <- data.frame(X1 = c("42.8 -70.89", "42.8 -70.89", "31.28 -72.14", "31.28 -72.14", "42.85 -72.21"),
X2 = c("23.2 -81.32", "42.8 -70.89", "31.28 -72.14", "23.2 -81.32", "42.8 -70.89"),
X3 = c("19371", "0", "0", "19278", "28912"))
replacement_vector <- c("locationA", "locationB", "locationC")
names(replacement_vector) <- c("42.8 -70.89", "42.85 -72.21", "31.28 -72.14")
df$X1 <- stringr::str_replace_all(string = df$X1, stringr::fixed(replacement_vector))
df
#> X1 X2 X3
#> 1 locationA 23.2 -81.32 19371
#> 2 locationA 42.8 -70.89 0
#> 3 locationC 31.28 -72.14 0
#> 4 locationC 23.2 -81.32 19278
#> 5 locationB 42.8 -70.89 28912
Created on 2022-06-23 by the reprex package (v2.0.1)