Find reoccurring values in one column that correspond to differing values in another column

Time:06-03

I have a dataframe with two columns. The first column ("A") contains numbers, the second ("B") letters:

A B
1 a
1 a
1 a
2 b
2 c
3 d
4 e
4 e
5 f
5 g
5 g
5 h

Most numbers are always matched with the same letter (e.g. "1" is always matched with "a"), but some numbers are matched with different letters (e.g. "2" is matched with "b" and "c"). I want to find the numbers that are matched with multiple letters. For the example, the result should be a vector containing "2" and "5".


Sample Data:

example <- read.table(textConnection('
A B
1 a
1 a
1 a
2 b
2 c
3 d
4 e
4 e
5 f
5 g
5 g
5 h
'), header = TRUE, colClasses=c("double", "character"))

CodePudding user response:

Same idea as @Paul's answer, but without the apply function:

names(which(rowSums(table(example$A, example$B) != 0) > 1))

Output:

[1] "2" "5"
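
To make the one-liner easier to follow, here is a rough step-by-step sketch of the same idea. The data frame is rebuilt inline so the snippet runs on its own, and the intermediate names (`counts`, `present`, `n_letters`, `result`) are made up for illustration; they are not part of the original answer.

```r
# Same sample data as in the question, rebuilt inline
example <- data.frame(
  A = c(1, 1, 1, 2, 2, 3, 4, 4, 5, 5, 5, 5),
  B = c("a", "a", "a", "b", "c", "d", "e", "e", "f", "g", "g", "h")
)

# Contingency table: one row per value of A, one column per value of B
counts <- table(example$A, example$B)

# TRUE wherever a number/letter pair occurs at least once
present <- counts != 0

# Number of distinct letters seen for each number
n_letters <- rowSums(present)

# Keep the numbers that are paired with more than one letter
result <- names(which(n_letters > 1))
print(result)
```

Note that `names()` returns character strings; wrap the last step in `as.numeric()` if a numeric vector is needed.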

CodePudding user response:

library(tidyverse)

distinct(example) %>% group_by(A) %>%
    summarize(count = n()) %>%
    filter(count > 1)

# A tibble: 2 x 2
#       A count
#   <dbl> <int>
# 1     2     2
# 2     5     3
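
Since the question asks for a vector rather than a tibble, a possible variant is sketched below. It swaps `distinct()` plus `n()` for `n_distinct()` and adds `pull()` at the end. This is one way to do it under the assumption that only dplyr is loaded, not necessarily what the answer's author had in mind; the names `n_letters` and `result` are illustrative.

```r
library(dplyr)

# Same sample data as in the question, rebuilt inline
example <- data.frame(
  A = c(1, 1, 1, 2, 2, 3, 4, 4, 5, 5, 5, 5),
  B = c("a", "a", "a", "b", "c", "d", "e", "e", "f", "g", "g", "h")
)

result <- example %>%
  group_by(A) %>%
  summarize(n_letters = n_distinct(B)) %>%  # distinct letters per number
  filter(n_letters > 1) %>%                 # keep numbers with several letters
  pull(A)                                   # extract column A as a plain vector

print(result)
```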

CodePudding user response:

Another possible solution, in base R (the `\(x)` lambda shorthand requires R 4.1 or later):

as.numeric(names(which(apply(table(example$A, example$B), 1, 
     \(x) sum(x == 0) != (length(x)-1)))))

#> [1] 2 5
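
The condition `sum(x == 0) != (length(x)-1)` says "this row does not have exactly one non-zero cell", i.e. the number is paired with more than one letter. A different base R route that skips the contingency table is sketched below using `tapply()`; it is offered as an alternative technique rather than a rewrite of the answer above, and the names `n_letters` and `result` are illustrative.

```r
# Same sample data as in the question, rebuilt inline
example <- data.frame(
  A = c(1, 1, 1, 2, 2, 3, 4, 4, 5, 5, 5, 5),
  B = c("a", "a", "a", "b", "c", "d", "e", "e", "f", "g", "g", "h")
)

# For each value of A, count how many distinct values of B it is paired with
n_letters <- tapply(example$B, example$A, function(b) length(unique(b)))

# Keep the values of A with more than one distinct letter, as numbers
result <- as.numeric(names(n_letters)[n_letters > 1])
print(result)
```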
Tags: r