Compare string column to list

Time:07-25

My data frame has two columns I want to compare: V1 is a string column and V2 is a list column holding a character vector in each row. My data frame looks like this:

V1           V2                                                        V3
oranges      c("oranges", "apples", "berries", "plums", "cherries")    1
apples       c("oranges", "apples", "berries", "bananas", "apples")    2
grapes       c("oranges", "apples", "berries", "plums", "cherries")    0
berries      c("berries", "apples", "berries", "plums", "cherries")    2

For each row, I want to count how many times the string in V1 appears in that row's V2 and store the count in V3. I have tried the following code, but I end up with an empty data frame.

matches <- x[!x$V1 %in% x$V2]

Answer:

V1 <- c("oranges", "apples", "grapes", "berries")
V2 <- list(c("oranges", "apples", "berries", "plums", "cherries"), 
    c("oranges", "apples", "berries", "bananas", "apples"), c("oranges", 
    "apples", "berries", "plums", "cherries"), c("berries", "apples", 
    "berries", "plums", "cherries"))

A straightforward solution is:

V3 <- mapply(function (x, y) sum(x == y), V1, V2)
#oranges  apples  grapes berries 
#      1       2       0       2 

Note that == works here because V1 holds a single value in each row, so that value is recycled and compared against every element of the matching V2 vector.
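To see why == is enough, it can help to look at one row in isolation. The sketch below reuses the V1 and V2 values from this answer (redefined so it runs on its own); the names fruit and basket are only illustrative and do not appear in the original.

```r
V1 <- c("oranges", "apples", "grapes", "berries")
V2 <- list(c("oranges", "apples", "berries", "plums", "cherries"),
           c("oranges", "apples", "berries", "bananas", "apples"),
           c("oranges", "apples", "berries", "plums", "cherries"),
           c("berries", "apples", "berries", "plums", "cherries"))

# Pick the second row (illustrative names, not from the original answer)
fruit  <- V1[2]     # a single string: "apples"
basket <- V2[[2]]   # the character vector stored in that row

# The single string is compared against every element of the vector
hits <- fruit == basket
print(hits)
#> [1] FALSE  TRUE FALSE FALSE  TRUE

# Summing the logical vector counts the TRUE values
print(sum(hits))
#> [1] 2
```

mapply() simply repeats this comparison for every pair of V1[i] and V2[[i]].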

If every row of V2 has the same number of elements, I recommend:

V3 <- rowSums(V1 == do.call(rbind, V2))
#[1] 1 2 0 2
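The rowSums() version depends on all V2 vectors having the same length. If they differ, rbind() recycles the shorter vectors to fill the matrix, which can inflate the counts. A minimal sketch of a guard, assuming the same data as above; count_matches is a hypothetical helper name, not part of the original answer.

```r
V1 <- c("oranges", "apples", "grapes", "berries")
V2 <- list(c("oranges", "apples", "berries", "plums", "cherries"),
           c("oranges", "apples", "berries", "bananas", "apples"),
           c("oranges", "apples", "berries", "plums", "cherries"),
           c("berries", "apples", "berries", "plums", "cherries"))

# Hypothetical helper: use the matrix approach only when it is safe
count_matches <- function(V1, V2) {
  if (length(unique(lengths(V2))) == 1L) {
    # All vectors have equal length: stack them into a matrix, one row each.
    # V1 is recycled down the columns, so row i is compared with V1[i].
    rowSums(V1 == do.call(rbind, V2))
  } else {
    # Ragged lengths: fall back to the pairwise comparison
    unname(mapply(function(x, y) sum(x == y), V1, V2))
  }
}

print(count_matches(V1, V2))
#> [1] 1 2 0 2

# A ragged example takes the fallback branch
print(count_matches(c("a", "b"), list(c("a", "a", "c"), "b")))
#> [1] 2 1
```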

Answer:

library(tidyverse)

df <- tibble::tribble(
        ~V1,                                                    ~V2,
  "oranges", c("oranges", "apples", "berries", "plums", "cherries"),
   "apples", c("oranges", "apples", "berries", "bananas", "apples"),
   "grapes", c("oranges", "apples", "berries", "plums", "cherries"),
  "berries", c("berries", "apples", "berries", "plums", "cherries"),
  )

df %>% 
  rowwise() %>% 
  mutate(
    V3 = sum(V1 == V2)
  ) %>% 
  ungroup()

#> # A tibble: 4 × 3
#>   V1      V2           V3
#>   <chr>   <list>    <int>
#> 1 oranges <chr [5]>     1
#> 2 apples  <chr [5]>     2
#> 3 grapes  <chr [5]>     0
#> 4 berries <chr [5]>     2

Created on 2022-07-25 by the reprex package (v2.0.1)
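As a possible alternative to rowwise(), the same count can be sketched with purrr::map2_int(), which walks V1 and V2 in parallel. This is a different technique from the one in the answer above, shown only for comparison; it assumes the dplyr, purrr and tibble packages are installed, and the name result is illustrative.

```r
library(dplyr)
library(purrr)

# Same data as above, built with tibble() and an explicit list column
df <- tibble::tibble(
  V1 = c("oranges", "apples", "grapes", "berries"),
  V2 = list(c("oranges", "apples", "berries", "plums", "cherries"),
            c("oranges", "apples", "berries", "bananas", "apples"),
            c("oranges", "apples", "berries", "plums", "cherries"),
            c("berries", "apples", "berries", "plums", "cherries"))
)

# For each pair (V1[i], V2[[i]]), count the matching elements
result <- df %>%
  mutate(V3 = map2_int(V1, V2, ~ sum(.x == .y)))

print(result$V3)
#> [1] 1 2 0 2
```

Because map2_int() already works element by element, no rowwise() or ungroup() step is needed here.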
