Home > OS >  How to replicate Excel's index matching formula in R using dplyr?
How to replicate Excel's index matching formula in R using dplyr?

Time:07-13

I am a heavy user of Excel and am learning R and the easy-to-use R package dplyr. I frequently use Excel's index(...,match(...)) formula combination to pull in (look up) target values from a column. How would I perform the same thing, in R and using dplyr, as shown in the Excel illustration shown below? In R I am trying to replicate column G labeled "Match", and the formulas for that column are shown in the columns to its right highlighted yellow and labeled "Match Column Formula".

I am using the Excel and companion R code to show a series of step-wise calculations, in case that column G looks a but cumbersome!

enter image description here

Code to reproduce the starting dataframe in this example:

myData <- 
  data.frame(
    Element = c("A","A","C","A","B","B"),
    Code1 = c(0,0,0,0,1,1),
    Code2 = c(1,2,1,3,1,2),
    Code3 = c(0,0,0,0,1,2),
    Code4 = c(0,0,0,0,1.1,1.2)
  )

CodePudding user response:

Base R has a match function which works similar to the Excel one.

myData$Match <- with(myData, Code4[match(Code2, Code3)] * !Code1)

myData
#-----
  Element Code1 Code2 Code3 Code4 Match
1       A     0     1     0   0.0   1.1
2       A     0     2     0   0.0   1.2
3       C     0     1     0   0.0   1.1
4       A     0     3     0   0.0    NA
5       B     1     1     1   1.1   0.0
6       B     1     2     2   1.2   0.0

Same idea, but using dplyr

myData %>%
  mutate(Match = Code4[match(Code2, Code3)] * !Code1)

CodePudding user response:

One can do such things with data base operations. One approach is like follows:

  1. Split the table in two tables
  2. Use left_join to match tables
  3. replace with zero where appropriate
library(dplyr)

tbl1 <- select(myData, Element, Code1, Code2, Code3)
tbl2 <- select(myData, Code3, Code4)

left_join(tbl1, tbl2, by=c(Code2 = "Code3")) %>% 
  mutate(Code4 = ifelse(Code1 != 0, 0, Code4))

Or in a single pipeline:

myData %>% 
  select(Element, Code1, Code2, Code3) %>%
  left_join(select(myData, Code3, Code4), by = c(Code2 = "Code3"))  %>% 
  mutate(Code4 = ifelse(Code1 != 0, 0, Code4))

Handling of a "Position location" other then 1 can finally be implemented by shifting the result.

  • Related