How to create a new column from an existing column in R

Time:12-27

I have a column (called "thirdparty") that contains 20 countries. I want to assign a code to each country in a new variable (called "ccode2"). I use the following code:

  df <- within(df, {
  ccode2 <- NA
  ccode2[thirdparty = "Australia"] <- "900"
  ccode2[thirdparty = "Austria"] <- "305"
  ccode2[thirdparty = "Belgium"] <- "211"
  ccode2[thirdparty = "Bulgaria"] <- "355"
  ccode2[thirdparty = "Canada"] <- "20"
  ccode2[thirdparty = "Croatia"] <- "344"
  ccode2[thirdparty = "Cyprus"] <- "352"
  ccode2[thirdparty = "Czech Republic"] <- "315"
  ccode2[thirdparty = "Denmark"] <- "390"
  ccode2[thirdparty = "Estonia"] <- "366"
  ccode2[thirdparty = "Finland"] <- "375"
  ccode2[thirdparty = "France"] <- "220"
  ccode2[thirdparty = "Germany"] <- "255"
  ccode2[thirdparty = "Greece"] <- "350"
  ccode2[thirdparty = "Hungary"] <- "310"
  ccode2[thirdparty = "Iceland"] <- "395"
  ccode2[thirdparty = "Ireland"] <- "205"
  ccode2[thirdparty = "Italy"] <- "325"
  ccode2[thirdparty = "Latvia"] <- "367"
  ccode2[thirdparty = "Lithuania"] <- "368"
  })

However, it doesn't work. The error message is: Assigned data `l` must be compatible with existing data.

Answer:

Assuming that df is a data frame, there are many ways of doing this.
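
As for why the original attempt fails: inside `[ ]`, `thirdparty = "Australia"` is not a comparison. R most likely treats it as a named index argument, so each line adds a single element to `ccode2` instead of filling the matching rows, and `ccode2` ends up with the wrong length. Below is a minimal base R sketch of the fix, assuming `thirdparty` is a character column. The small `df` is made-up example data, not the asker's, and only three of the codes are shown.

```r
# Made-up example data; only the column name `thirdparty` comes from the question.
df <- data.frame(thirdparty = c("Austria", "Canada", "Peru", "Australia"),
                 stringsAsFactors = FALSE)

df <- within(df, {
  # Start with one NA per row so the new column already has the right length.
  ccode2 <- rep(NA_character_, length(thirdparty))
  # `==` compares element-wise and yields a logical index over the rows.
  ccode2[thirdparty == "Australia"] <- "900"
  ccode2[thirdparty == "Austria"] <- "305"
  ccode2[thirdparty == "Canada"] <- "20"
})

print(df)
```

Rows whose country has no code, such as "Peru" here, keep NA in `ccode2`.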

mutate()

I typically solve this using mutate() and case_when() from dplyr. Here is a reprex:

library(dplyr)

mtcars %>% 
  mutate(
    new_column = case_when(
      cyl == 8 ~ "A",
      cyl == 6 ~ "B",
      cyl == 4 ~ "C",
      TRUE ~ NA_character_
    )
  )

You haven't included your data in your question, so I cannot be sure, but for you it should be something like:

library(dplyr)

df <- df %>%
  mutate(
    ccode2 = case_when(
      thirdparty == "Australia" ~ "900",
      thirdparty == "Austria" ~ "305",
      thirdparty == "Belgium" ~ "211",
      thirdparty == "Bulgaria" ~ "355",
      thirdparty == "Canada" ~ "20",
      thirdparty == "Croatia" ~ "344",
      thirdparty == "Cyprus" ~ "352",
      # And so on...
      TRUE ~ NA_character_
    )
  )

joining

In this case it might be better to create a separate data frame containing the mapping between thirdparty and ccode2, and then join the two. Here is a reprex:

library(dplyr)

x <- tibble::tribble(
  ~ "cyl", ~ "new_column",
        8,            "A",
        6,            "B",
        4,            "C"
)

mtcars %>% 
  left_join(x)

For you, this should be something like:

library(dplyr)

x <- tibble::tribble(
  ~ "thirdparty", ~ "ccode2",
     "Australia",      "900",
       "Austria",      "305",
       "Belgium",      "211",
      "Bulgaria",      "355",
        "Canada",       "20",
       "Croatia",      "344",
        "Cyprus",      "352"
)


df <- df %>%
  left_join(x, by = "thirdparty")
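
If you would rather not depend on dplyr, the same lookup-table idea can be sketched in base R with match(). This is only an illustration: the small `df` is made-up example data, and the lookup table holds a subset of the codes from the question. Like a left join, it keeps every row of `df` in its original order.

```r
# Made-up example data; only the column name `thirdparty` comes from the question.
df <- data.frame(thirdparty = c("Belgium", "Cyprus", "Peru", "Belgium"),
                 stringsAsFactors = FALSE)

# Lookup table holding the country-to-code mapping (subset of the codes above).
x <- data.frame(thirdparty = c("Australia", "Austria", "Belgium", "Cyprus"),
                ccode2 = c("900", "305", "211", "352"),
                stringsAsFactors = FALSE)

# match() gives, for each row of df, the position of its country in x (NA if absent).
df$ccode2 <- x$ccode2[match(df$thirdparty, x$thirdparty)]

# Countries that received no code and may need to be added to the lookup table.
missing_codes <- unique(df$thirdparty[is.na(df$ccode2)])
print(missing_codes)
```

Checking for countries without a code after the lookup is a cheap way to catch spelling differences between the two tables.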
