Home > database >  R, find pattern and create index, then filter another column based on the created index
R, find pattern and create index, then filter another column based on the created index

Time:03-17

I have a value df df1 = data.frame(ID = c("A1; A2; A3; A4", "B1; B2; B3", "C1; C2","D1"), Value = c("1; 2; 3; 4", "5; 6; 7", "8; 9", "10")).

I have another df df2 = data.frame(ID = c("A2", "B3", "C2", "D1")).

Now I would like to write a function to map the 'ID' from df2 to 'ID' in df1, then filter the Value based on the ith position of ID column. The expected output is filtered_df1 = data.frame(ID=c("A2", "B3", "C2", "D1"), value = c("2", "7", "9", "10")).

Would you be able to give any suggestion to create R function?

Thank you so much!

CodePudding user response:

We could use separate_rows function from tidyr package combined with right_join to get numeric for Value add type.convert(as.is = TRUE)

library(dplyr)
library(tidyr)

df1 %>% 
  separate_rows(c(ID, Value)) %>% 
  right_join(df2) %>%
  type.convert(as.is = TRUE)
  ID    Value
  <chr> <int>
1 A2        2
2 B3        7
3 C2        9
4 D1       10

CodePudding user response:

This is a weird structure to encounter in the wild, it is inadvisable to store key-value pairs this way. First, fixing df1 to be humanly usable.

library(tidyverse)
f <- function(x, y) {
  x %>%
    separate_rows(ID, Value, sep = "; ") %>%
    right_join(y, by = "ID") %>%
    mutate(Value = as.numeric(Value))
}

f(df1, df2)
  ID    Value
  <chr> <dbl>
1 A2        2
2 B3        7
3 C2        9
4 D1       10
  • Related