Home > Mobile >  Add column to dataframe based on information from another df2 in R
Add column to dataframe based on information from another df2 in R

Time:04-14

lets, say I have 2 dataframes like this:

Model <- c("H5", "H5", "H5","H4","H3")
Code <- c("001001", "001002","001003","001004","001005")
City <-  c("Mexico", "London", "NY", "Otawa", "Liverpool")

df1 <- data.frame(Model,Length,Code)


Model   Code       City
H5      001001     Mexico  
H5      001002     London
H5      001003     NY
H4      001004     Otawa
H3      001005     Liverpool

And

X <- c("030299", "010121","030448","030324","010245","001001", "001002","001003","001004","001005")
Y <- c("030344", "010222","030448","030001","010245","221001", "221044","221044","221004"," 001005")
Var1 <- c("H5", "H5", "H4","H4","H4","H5", "H5", "H5","H4","H3")
Var2 <- c("H4", "H2", "H4","H3","H4","H3", "H3", "H3","H3","H3")

  df2 <- data.frame(X,Y,Var1,Var2)

  X            Y     VAR1   VAR2
030299      030344    H5     H4
010121      010222    H5     H2
030448      030448    H4     H4
030324      030001    H4     H3
010245      010245    H4     H4
001001      221001    H5     H3
001002      221044    H5     H3
001003      221044    H5     H3
001004      221004    H4     H3
001005      001005    H3     H3

I want to code following:

For example if I select H3 as an argument in function, I want to take all values from 'Code' column in df1, take into account its corresponding value in 'Model' column and convert these value from 'Code' column based on df2 information. For example if we select the first row from df1 and set H3 as argument:

  H5      001001    Mexico 

function must take corresponding row from df2:

   X            Y     VAR1   VAR2
 001001      221001    H5     H3

and give me the output like this:

   X            Y    VAR2  City   
 001001      221001   H3   Mexico   

The final output should be like this:

  X            Y     VAR2   City 

001001      221001    H3   Mexico  
001002      221044    H3   London  
001003      221044    H3   NY
001004      221004    H3   Otawa  
001005      221056    H3   Liverpool 

CodePudding user response:

Maybe something to begin with, this reproduces the result of your example.

df2 %>% 
  left_join(df1, by = c( "Var1" = "Model", "X" = "Code")) %>% 
  filter(Var2 == "H3", !is.na(City)) %>% 
  select(-Var1)

       X       Y Var2      City
1 001001  221001   H3    Mexico
2 001002  221044   H3    London
3 001003  221044   H3        NY
4 001004  221004   H3     Otawa
5 001005  001005   H3 Liverpool

CodePudding user response:

Like this?

library(data.table)
setDT(df1);setDT(df2)
df2[df1, on = .(Var1 = Model, X = Code)]
#         X       Y Var1 Var2      City
# 1: 001001  221001   H5   H3    Mexico
# 2: 001002  221044   H5   H3    London
# 3: 001003  221044   H5   H3        NY
# 4: 001004  221004   H4   H3     Otawa
# 5: 001005  001005   H3   H3 Liverpool
  • Related