Home > database >  New column containing string that appears the most in the row
New column containing string that appears the most in the row

Time:09-06

Im trying to Create a column with the string that appears the most in the row and create another column with the number of times this most prevalent string appeared.

To facilitate my question this is what im trying to achieve:

My actual DF

enter image description here

What im trying to obtain: most prevalente category and count

enter image description here

example df: d

f <- data.frame(ID = 1:4,
           V1 = c("A","B","C","D"),
           V2 = c("A", "B","D","B"),
           V3 = c("A","C","D","B"))

CodePudding user response:

Here is another way:

count <- sapply(apply(f[, -1], 1, table), max)
count
# [1] 3 2 2 2
category <- names(sapply(apply(f[, -1], 1, table), which.max))
category
# [1] "A" "B" "D" "B"
f2 <- data.frame(f, category, count)
f2
#   ID V1 V2 V3 category count
# 1  1  A  A  A        A     3
# 2  2  B  B  C        B     2
# 3  3  C  D  D        D     2
# 4  4  D  B  B        B     2

CodePudding user response:

df <- data.frame(ID = 1:4,
                V1 = c("A","B","C","D"),
                V2 = c("A", "B","D","B"),
                V3 = c("A","C","D","B"))


library(data.table)
setDT(df)
other <- melt(df, id.vars = "ID", measure.vars = c("V1", "V2", "V3"))
other <- other[, .N, by = .(ID, value)]
colnames(other) <- c("ID", "category", "count")
other <- other[, .SD[which.max(count)], by = .(ID)]

res <- merge(df, other, by = c("ID"))
res

CodePudding user response:

  • We can use dplyr rowwise function to apply table to each row from V1:V3
library(dplyr)

df |> rowwise() |> 
      mutate(category = names(table(c_across(V1:V3)))[which.max(table(c_across(V1:V3)))] ,
      count = max(table(c_across(V1:V3))))
  • Output
# A tibble: 4 × 6
# Rowwise: 
     ID V1    V2    V3    category count
  <int> <chr> <chr> <chr> <chr>    <int>
1     1 A     A     A     A            3
2     2 B     B     C     B            2
3     3 C     D     D     D            2
4     4 D     B     B     B            2
  • Related