Alternative method to count the number of single occurrences across columns of interest

I would like to count how many times each of a given set of values occurs in each of several columns. I have applied the following code:

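dat <- data.frame()
vector <- c(1, 2, 3)  # values to count
for (i in names(data)){
  for (j in vector){
    # number of rows in column i equal to j;
    # j doubles as the row index, so this relies on vector being 1, 2, 3
    dat[j,i] <- length(which(data[,i] == j))
  }
}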

print(dat)

That returns exactly the output I am looking for. Does this code contain any redundancies? Could you please suggest a more efficient alternative, either iterative (including a for loop) or using the dplyr package?

Thanks

Here is a short extract of the dataset I am working on.

structure(list(run_set_1 = c(3, 3, 3, 3, 3, 3), run_set_2 = c(1, 
1, 1, 1, 1, 1), run_set_3 = c(2, 2, 2, 2, 2, 2)), row.names = c(NA, 
-6L), class = c("tbl_df", "tbl", "data.frame"))

CodePudding user response：

You could first match() each column to get the index in vector that the column values correspond to, if any. Then tabulate() those to get the counts, including 0s:

lapply(data, match, vector) |>
  sapply(tabulate, length(vector))
#>      run_set_1 run_set_2 run_set_3
#> [1,]         0         6         0
#> [2,]         0         0         6
#> [3,]         6         0         0
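
To make the two steps concrete, here is a minimal sketch on a single column. It assumes the sample data from the question, rebuilt here as a plain data frame; idx is just an illustrative name.

```r
# Sample data from the question, rebuilt as a plain data frame
data <- data.frame(
  run_set_1 = c(3, 3, 3, 3, 3, 3),
  run_set_2 = c(1, 1, 1, 1, 1, 1),
  run_set_3 = c(2, 2, 2, 2, 2, 2)
)
vector <- c(1, 2, 3)

# Step 1: position of each value in `vector` (NA if the value is not listed)
idx <- match(data$run_set_1, vector)
idx
#> [1] 3 3 3 3 3 3

# Step 2: count how often each position 1..length(vector) occurs;
# tabulate() skips NA and reports 0 for positions that never occur
tabulate(idx, nbins = length(vector))
#> [1] 0 0 6
```

Because the counts are indexed by position in vector, the values you look for do not have to be 1, 2, 3; row k of the result always refers to vector[k].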

CodePudding user response：

Here is a tidyverse version. It could probably be made even shorter, but I have not found a way yet.

library(dplyr)
library(tidyr)
data %>% pivot_longer(cols = everything()) %>%
    group_by(name, value) %>% count() %>% ungroup() %>%
    pivot_wider(names_from = name, values_from = n, values_fill = 0 ) %>%
    arrange(value) %>% select(-value)
# last line only to remove the value column and fit your example

# # A tibble: 3 × 3
#   run_set_1 run_set_2 run_set_3
#       <int>     <int>     <int>
# 1         0         6         0
# 2         0         0         6
# 3         6         0         0
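
One possible shortening, as a sketch rather than a definitive answer: count(name, value) groups, counts and ungroups in a single call, so it should be able to stand in for the group_by() / count() / ungroup() chain. The sample data is rebuilt here as a plain data frame, and counts is just an illustrative name.

```r
library(dplyr)
library(tidyr)

# Sample data from the question, rebuilt as a plain data frame
data <- data.frame(
  run_set_1 = c(3, 3, 3, 3, 3, 3),
  run_set_2 = c(1, 1, 1, 1, 1, 1),
  run_set_3 = c(2, 2, 2, 2, 2, 2)
)

counts <- data %>%
  pivot_longer(cols = everything()) %>%
  count(name, value) %>%   # one row per (column, value) pair with its count n
  pivot_wider(names_from = name, values_from = n, values_fill = 0) %>%
  arrange(value) %>%
  select(-value)

print(counts)
```

Note that this only produces rows for values that actually occur somewhere in the data. A value that appears in none of the columns gets no row at all, whereas the for loop and the match() / tabulate() approach report it as a row of zeros.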