I am trying to run the line below to copy the city.output column into pm.city wherever city.output is not NA (nothing is NA in my sample data frame, though), because city.output contains the correct city spellings.
resultdf <- dplyr::mutate(df, pm.city = ifelse(is.na(city.output) == FALSE, city.output, pm.city))
df:
  pm.uid pm.address          pm.state pm.zip pm.city     city.output
   <int> <chr>               <chr>    <chr>  <chr>       <fct>
1      1 1809 MAIN ST        OH       63312  NORWOOD     NORWOOD
2      2 123 ELM DR          CA       NA     BRYAN       BRYAN
3      3 8970 WOOD ST UNIT 4 LA       33333  BATEN ROUGE BATON ROUGE
4      4 4444 OAK AVE        OH       87481  CINCINATTI  CINCINNATI
5      5 3333 HELPME DR      MT       87482  HELENA      HELENA
6      6 2342 SOMEWHERE RD   LA       45103  BATON ROUGE BATON ROUGE
resultdf (pm.city should be the same as city.output but it's an integer)
  pm.uid pm.address          pm.state pm.zip pm.city city.output
   <int> <chr>               <chr>    <chr>    <int> <fct>
1      1 1809 MAIN ST        OH       63312        7 NORWOOD
2      2 123 ELM DR          CA       NA           2 BRYAN
3      3 8970 WOOD ST UNIT 4 LA       33333        1 BATON ROUGE
4      4 4444 OAK AVE        OH       87481        3 CINCINNATI
5      5 4444 HELPME DR      MT       87482        4 HELENA
6      6 2342 SOMEWHERE RD   LA       45103        1 BATON ROUGE
pm.city ends up holding an integer instead of the city name. The integer appears to be the position of the city when the cities are sorted alphabetically. Before this step, I used dplyr's left_join to attach the city.output column from another data frame, but I never supplied a row number explicitly there either.
This works on my computer in RStudio but not when I run it on a server. Maybe it has something to do with my version of dplyr, or with the factor data type of city.output? I am pretty new to R.
CodePudding user response:
city.output is a factor, and ifelse() drops the factor attributes, so the underlying integer storage values (the level codes) end up in pm.city. Instead, convert it to character with as.character():
dplyr::mutate(df, pm.city = ifelse(!is.na(city.output), as.character(city.output), pm.city))
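To see the coercion in isolation, here is a minimal base R sketch. The three-row data frame is a made-up miniature of the one in the question, not the asker's real data, and it calls ifelse() directly rather than through dplyr::mutate(), which evaluates the same expression. As for why the laptop and the server disagree, one possible explanation (an assumption, since the question does not give the R versions) is that R 4.0.0 changed the default of stringsAsFactors from TRUE to FALSE in data.frame() and read.csv(), so the same script can produce a character column on a newer R and a factor column on an older one.

```r
# Hypothetical miniature of the question's data frame.
df <- data.frame(
  pm.city     = c("NORWOOD", "BATEN ROUGE", "CINCINATTI"),
  city.output = factor(c("NORWOOD", "BATON ROUGE", "CINCINNATI")),
  stringsAsFactors = FALSE
)

# Check the column type first; run this on both machines to compare.
class(df$city.output)

# ifelse() strips the factor attributes, so the level codes come back.
# Levels sort alphabetically: BATON ROUGE = 1, CINCINNATI = 2, NORWOOD = 3.
broken <- ifelse(!is.na(df$city.output), df$city.output, df$pm.city)

# Converting to character first keeps the labels.
fixed <- ifelse(!is.na(df$city.output), as.character(df$city.output), df$pm.city)

print(broken)
print(fixed)
```

If class(df$city.output) reports "factor" on the server but "character" locally, the difference comes from how the column was created upstream (for example, in the data frame used in the left_join) rather than from the mutate() call itself.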