How to assign one dataframe column's value to be the same as another column's value in R?

Time:12-31

I am trying to run the line of code below to copy the city.output column into pm.city wherever city.output is not NA (nothing is NA in my sample dataframe), because city.output contains the correct city spellings.

resultdf <- dplyr::mutate(df, pm.city = ifelse(is.na(city.output) == FALSE, city.output, pm.city))

df:

  pm.uid pm.address                   pm.state pm.zip pm.city     city.output
   <int> <chr>                        <chr>    <chr>  <chr>       <fct>
1      1 1809 MAIN ST                 OH       63312  NORWOOD     NORWOOD
2      2 123 ELM DR                   CA       NA     BRYAN       BRYAN
3      3 8970 WOOD ST UNIT 4          LA       33333  BATEN ROUGE BATON ROUGE
4      4 4444 OAK AVE                 OH       87481  CINCINATTI  CINCINNATI
5      5 3333 HELPME DR               MT       87482  HELENA      HELENA
6      6 2342 SOMEWHERE RD            LA       45103  BATON ROUGE BATON ROUGE

resultdf (pm.city should be the same as city.output but it's an integer)

  pm.uid pm.address                   pm.state pm.zip pm.city city.output
   <int> <chr>                        <chr>    <chr>    <int> <fct>
1      1 1809 MAIN ST                 OH       63312        7 NORWOOD
2      2 123 ELM DR                   CA       NA           2 BRYAN
3      3 8970 WOOD ST UNIT 4          LA       33333        1 BATON ROUGE
4      4 4444 OAK AVE                 OH       87481        3 CINCINNATI
5      5 3333 HELPME DR               MT       87482        4 HELENA
6      6 2342 SOMEWHERE RD            LA       45103        1 BATON ROUGE

Instead, pm.city is assigned an integer. The integer appears to be each city's position when the cities are sorted alphabetically. Before this step, I used dplyr's left_join to attach the city.output column from another dataframe, but I never supplied a row number explicitly there either.

This works on my computer in RStudio but not when I run it on a server. Maybe it has something to do with my version of dplyr or with the factor data type of city.output? I am pretty new to R.

CodePudding user response:

city.output is a factor, and ifelse() drops the factor's attributes, so only the underlying integer level codes are returned. Convert it to character with as.character() first:

dplyr::mutate(df, pm.city = ifelse(!is.na(city.output), as.character(city.output), pm.city))
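
To see the difference in isolation, here is a minimal base R sketch. The small data frame is made up for illustration (it is not the asker's real data), and the names codes_only and fixed are hypothetical:

```r
# Toy data loosely mirroring the question; the values are illustrative only.
df <- data.frame(
  pm.city     = c("NORWOOD", "BATEN ROUGE", "CINCINATTI"),
  city.output = factor(c("NORWOOD", "BATON ROUGE", "CINCINNATI")),
  stringsAsFactors = FALSE
)

# ifelse() builds its result from the logical test and drops the factor's
# attributes, so only the integer level codes of city.output are kept.
codes_only <- ifelse(!is.na(df$city.output), df$city.output, df$pm.city)
print(codes_only)

# Converting the factor to character first keeps the city names.
fixed <- ifelse(!is.na(df$city.output), as.character(df$city.output), df$pm.city)
print(fixed)
```

The codes are positions in levels(city.output), which are sorted alphabetically by default; that matches the "alphabetical order number" seen in the question. As for why the same script behaves differently on two machines, one possible explanation (an assumption, since the question does not show the R versions) is that R releases before 4.0.0 default to stringsAsFactors = TRUE in data.frame() and read.csv(), so the joined column may be character on one machine and a factor on the other. Checking class(df$city.output) on both would confirm it.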