Home > database >  If/Then statement in R based on duplicate values
If/Then statement in R based on duplicate values

Time:04-25

I am looking to return a value in a new column based on what I have below. I am looking to say "if ANY value repeats 5 times in columns 2-9, then return that specific value in a new column 10." If no value repeats 5 times, then return NA. Data example is below.

              EntryName Team1 Team2 Team3 Team4 Team5 Team6 Team7 Team8
1                a   MIN    SF    SF    SF   ATL   TOR    SF    SF
2                b   MIN    SF    SF    SF    SF    SF   DET   MIA
3                c   MIN   CWS    SF   MIA   ATL   MIA   TOR    SF
4                d    SF    SF    SF    SF   TOR   TOR    SF   MIN
5                e   MIN   TOR   ATL   ATL    SF   CIN   DET    TB

CodePudding user response:

dat$newcol <- apply(dat[,-1], 1, function(z) {
  z <- sort(table(z), decreasing = TRUE)
  if (z[1] >= 5) names(z)[1] else ""[NA]
})
dat
#   EntryName Team1 Team2 Team3 Team4 Team5 Team6 Team7 Team8 newcol
# 1         a   MIN    SF    SF    SF   ATL   TOR    SF    SF     SF
# 2         b   MIN    SF    SF    SF    SF    SF   DET   MIA     SF
# 3         c   MIN   CWS    SF   MIA   ATL   MIA   TOR    SF   <NA>
# 4         d    SF    SF    SF    SF   TOR   TOR    SF   MIN     SF
# 5         e   MIN   TOR   ATL   ATL    SF   CIN   DET    TB   <NA>

Data

dat <- structure(list(EntryName = c("a", "b", "c", "d", "e"), Team1 = c("MIN", "MIN", "MIN", "SF", "MIN"), Team2 = c("SF", "SF", "CWS", "SF", "TOR"), Team3 = c("SF", "SF", "SF", "SF", "ATL"), Team4 = c("SF", "SF", "MIA", "SF", "ATL"), Team5 = c("ATL", "SF", "ATL", "TOR", "SF"), Team6 = c("TOR", "SF", "MIA", "TOR", "CIN"), Team7 = c("SF", "DET", "TOR", "SF", "DET"), Team8 = c("SF", "MIA", "SF", "MIN", "TB")), row.names = c("1", "2", "3", "4", "5"), class = "data.frame")

CodePudding user response:

library(tidyverse)
df %>%
  rowid_to_column('rn') %>%
  left_join(pivot_longer(.,-c(EntryName, rn)) %>%
  group_by(rn, EntryName) %>%
  count(value) %>%
  filter(n>=5))

  rn EntryName Team1 Team2 Team3 Team4 Team5 Team6 Team7 Team8 value  n
1  1         a   MIN    SF    SF    SF   ATL   TOR    SF    SF    SF  5
2  2         b   MIN    SF    SF    SF    SF    SF   DET   MIA    SF  5
3  3         c   MIN   CWS    SF   MIA   ATL   MIA   TOR    SF  <NA> NA
4  4         d    SF    SF    SF    SF   TOR   TOR    SF   MIN    SF  5
5  5         e   MIN   TOR   ATL   ATL    SF   CIN   DET    TB  <NA> NA

Look at the value column. Second to last

  • Related