R function for finding a specific repeat count in a data.frame

I want to create a function that returns a new data frame containing only the rows whose value in a selected column occurs exactly 2 times in the original data.frame.

I tried this:

duplicates <- function(x, as.bool = TRUE) {
  is.dup <- (duplicated(x) & rev(duplicated(rev(x))))
  if (as.bool) { is.dup } else { x[is.dup] }
}

CodePudding user response：

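Since no sample data was provided, I'm using the mtcars data set. You can do:

library(tidyverse)  # loads dplyr, which provides add_count(), filter() and select()

duplicates <- function(data, var)
{
  data |>
    add_count(!!sym(var)) |>  # add a column n: how often each value of `var` occurs
    filter(n == 2) |>         # keep rows whose value occurs exactly twice
    select(-n)                # drop the helper column again
}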

duplicates(mtcars, "mpg")

    mpg cyl  disp  hp drat    wt  qsec vs am gear carb
1  21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
2  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
3  22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
4  21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
5  22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
6  19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
7  15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
8  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
9  10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
10 30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
11 15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
12 19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
13 30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
14 21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2

In this case, each of the "mpg" values appears exactly two times in the data.
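If you would rather avoid the tidyverse dependency, the same idea can be sketched in base R. This is only a minimal sketch: the name `rows_with_count` and the `n` argument are made up for illustration, and `NA` is counted like any other value.

```r
# Hypothetical helper: keep the rows whose value in column `var`
# occurs exactly `n` times in `data`.
rows_with_count <- function(data, var, n = 2) {
  x <- data[[var]]
  # Map every element to the position of its value among the unique values
  idx <- match(x, unique(x))
  # tabulate() counts how often each position occurs; indexing by idx
  # aligns those counts with the original rows
  occurrences <- tabulate(idx)[idx]
  data[occurrences == n, , drop = FALSE]
}

res <- rows_with_count(mtcars, "mpg")
nrow(res)                 # 14, the same number of rows as in the output above
all(table(res$mpg) == 2)  # TRUE: every remaining mpg value occurs twice
```

Unlike the dplyr version, this keeps the original row names of `mtcars`, and passing a different `n` selects values with another repeat count.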

CodePudding user response：

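oneDuplicate <- function(df, vec){
  # Count how often each value of column `vec` occurs
  counts <- table(df[[vec]])
  # Values that occur exactly twice; table() stores them as character names
  twice <- names(counts)[counts == 2]
  # Compare as character so numeric and non-numeric columns are handled alike
  ndf <- df[as.character(df[[vec]]) %in% twice, , drop = FALSE]

  return(ndf)
}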

oneDuplicate(attitude, "advance")

   rating complaints privileges learning raises critical advance
2      63         64         51       54     63       73      47
5      81         78         56       66     71       83      47
6      43         55         49       44     54       49      34
11     64         53         53       58     58       67      34
15     77         77         54       72     79       77      46
19     65         70         46       57     75       85      46
21     50         40         33       34     43       64      33
24     40         37         42       58     50       57      49
25     63         54         42       48     66       75      33
27     78         75         58       74     80       78      49
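To see how the `table()` based filtering works step by step, here is a small sketch on made-up data. The `toy` data frame and its column names are assumptions for illustration only.

```r
# Made-up example: "a" occurs once, "b" twice, "c" three times
toy <- data.frame(id = 1:6, grp = c("a", "b", "b", "c", "c", "c"))

# Frequency of each value in the selected column
counts <- table(toy$grp)

# Names of the values that occur exactly twice
twice <- names(counts)[counts == 2]

# Keep only the rows whose value is among them
kept <- toy[toy$grp %in% twice, ]
kept
```

Only the two rows with `grp == "b"` are kept; the value that occurs once and the value that occurs three times are both dropped.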