Match character vector with a dataframe and add a new column

Time:11-01

I have a data frame df:

ID mp1 mp2 mp3 mp4
C1 25 28 32 37
C2 24 45 38 0
C3 28 33 24 20

A character vector:

vec = c('32','35','28')

I want to match the values of the character vector vec against the data frame df: if one or more values in a row match, a new column should contain 1 for that row, otherwise 0 (0 for no match, 1 for a match).

ID mp1 mp2 mp3 mp4 dec
C1 25 28 32 37 1
C2 24 45 38 0 0
C3 28 33 24 20 1

CodePudding user response:

df$dec <- +(rowSums(sapply(df[,-1], `%in%`, vec)) > 0)
df
#   ID mp1 mp2 mp3 mp4 dec
# 1 C1  25  28  32  37   1
# 2 C2  24  45  38   0   0
# 3 C3  28  33  24  20   1
  • df[,-1] is one way to compare just the columns we need; we could also have done df[,2:5] or one of several other column-selection tools.

  • sapply(...) returns a logical matrix that is TRUE wherever the specific element is found in vec:

    sapply(df[,-1], `%in%`, vec)
    #        mp1   mp2   mp3   mp4
    # [1,] FALSE  TRUE  TRUE FALSE
    # [2,] FALSE FALSE FALSE FALSE
    # [3,]  TRUE FALSE FALSE FALSE

  • logicals, when summed, convert to integers with FALSE being 0 and TRUE being 1, so your logic of "one or more" means that the sum across a row is greater than 0:

    rowSums(sapply(df[,-1], `%in%`, vec)) > 0
    # [1]  TRUE FALSE  TRUE

  • The leading + in +(...) is a trick to convert logicals to their integer equivalents (akin to the previous bullet), which is what turns TRUE/FALSE into 1/0.
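
The steps in the bullets above can be run one at a time. The following is a minimal sketch, assuming the character-column data from the Data section; the intermediate names hits and any_hit are introduced here only for illustration, and as.integer() is used as a more explicit alternative to the unary plus.

```r
# Same values as in the question, stored as character columns
df <- data.frame(
  ID  = c("C1", "C2", "C3"),
  mp1 = c("25", "24", "28"),
  mp2 = c("28", "45", "33"),
  mp3 = c("32", "38", "24"),
  mp4 = c("37", "0", "20"),
  stringsAsFactors = FALSE
)
vec <- c("32", "35", "28")

# Step 1: logical matrix, one column per mp* column (hypothetical name: hits)
hits <- sapply(df[, -1], `%in%`, vec)

# Step 2: TRUE where a row has at least one match (hypothetical name: any_hit)
any_hit <- rowSums(hits) > 0

# Step 3: convert TRUE/FALSE to 1/0; as.integer() does the same job as +(...)
df$dec <- as.integer(any_hit)
df$dec
# [1] 1 0 1
```

One caveat worth checking on your own data: with a single-row data frame, sapply() simplifies its result to a vector rather than a matrix, so rowSums() would not apply as written.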


Data

df <- structure(list(ID = c("C1", "C2", "C3"), mp1 = c("25", "24", "28"),
    mp2 = c("28", "45", "33"), mp3 = c("32", "38", "24"),
    mp4 = c("37", "0", "20")), class = "data.frame", row.names = c(NA, -3L))

CodePudding user response:

An alternative solution in base R:

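# 1 if at least one value of vec is found in the row, otherwise 0
f <- function(x) 1 * (sum(is.na(match(vec, x))) < length(vec))

# df[-1] drops the ID column so only the mp columns are compared
df <- cbind(df, dec = apply(df[-1], 1, f))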
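
The row-wise approach can also be written without counting misses. This is a minimal sketch under the same assumptions (character columns as in the Data section); has_match is a hypothetical helper name, not something from the answers above, and any() with %in% is swapped in for the match()/is.na() count.

```r
df <- data.frame(
  ID  = c("C1", "C2", "C3"),
  mp1 = c("25", "24", "28"),
  mp2 = c("28", "45", "33"),
  mp3 = c("32", "38", "24"),
  mp4 = c("37", "0", "20"),
  stringsAsFactors = FALSE
)
vec <- c("32", "35", "28")

# has_match (hypothetical helper): 1L if any value of vec occurs in the row x
has_match <- function(x) as.integer(any(vec %in% x))

# Apply over rows of the mp* columns only; unname() drops any row labels
df$dec <- unname(apply(df[-1], 1, has_match))
df$dec
# [1] 1 0 1
```

Because any() does not depend on the length of vec, the helper keeps working if vec grows or shrinks. Note that apply() converts the data frame to a matrix first, so with mixed numeric and character columns the numbers may be formatted as padded strings; selecting only the columns to compare, as done here, avoids that.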