Home > front end >  R function to replace tricky merge in Excel (vlookup hlookup)
R function to replace tricky merge in Excel (vlookup hlookup)

Time:06-06

I have a tricky merge that I usually do in Excel via various formulas and I want to automate with R.

I have 2 dataframes, one called inputs looks like this:

id v1 v2 v3
1  A  A  C  
2  B  D  F
3  T  T  A
4  A  F  C 
5  F  F  F

And another called df

id v
1 1
1 2
1 3
2 2
3 1 

I would like to combined them based on the id and v values such that I get

id v  key
1  1  A
1  2  A
1  3  C
2  2  D
3  1  T 

So I'm matching on id and then on the column from v1 thru v2, in the first example you will see that I match id = 1 and v1 since the value of v equals 1. In Excel I do this combining creatively VLOOKUP and HLOOKUP but I want to make this simpler in R. Dataframe examples are simplified versions as the I have more records and values go from v1 thru up to 50.

Thanks!

CodePudding user response:

You can use two column matrices as index arguments to "[" so this is a one liner. (Not the names of the data objects are d1 and d2. I'd opposed to using df as a data object name.)

 d1[-1][ data.matrix(d2)]  # returns [1] "A" "A" "C" "D" "T"

So full solution is:

  cbind( d2, key= d1[-1][ data.matrix(d2)] )
  id v key
1  1 1   A
2  1 2   A
3  1 3   C
4  2 2   D
5  3 1   T

CodePudding user response:

You could use pivot_longer:

library(tidyr)
library(dplyr)

key %>% pivot_longer(!id,names_prefix='v',names_to = 'v') %>% 
        mutate(v=as.numeric(v)) %>%
        inner_join(df)

Joining, by = c("id", "v")
# A tibble: 5 × 3
     id     v value
  <int> <dbl> <chr>
1     1     1 A    
2     1     2 A    
3     1     3 C    
4     2     2 D    
5     3     1 T 

Data:

key <- read.table(text="
id v1 v2 v3
1  A  A  C  
2  B  D  F
3  T  T  A
4  A  F  C 
5  F  F  F",header=T)

df <- read.table(text="
id v
1 1
1 2
1 3
2 2
3 1 ",header=T)

CodePudding user response:

Try this

x <- "
id v1 v2 v3
1  A  A  C  
2  B  D  F
3  T  T  A
4  A  F  C 
5  F  F  F
"
y <- "
id v
1 1
1 2
1 3
2 2
3 1 
"

df <- read.table(textConnection(x) , header = TRUE)
df2 <- read.table(textConnection(y) , header = TRUE)

key <- c()

for (i in 1:nrow(df2)) {
  key <- append(df[df2$id[i],(df2$v[i]   1L)] , key)
}


df2$key <- rev(key)

df2
>#   id v key
># 1  1 1   A
># 2  1 2   A
># 3  1 3   C
># 4  2 2   D
># 5  3 1   T

Created on 2022-06-06 by the reprex package (v2.0.1)

  • Related