Alternative method to count number of single occurrences across columns of interest


I would like to count how many times each of a set of values occurs in each column of a data frame. I have applied the following code:

dat = data.frame()
vector <- c(1, 2, 3)
for (i in names(data)){
  for (j in vector){
    dat[j,i] <- length(which(data[,i] == j))
  }
}

That returns exactly the output I am looking for. Does this code contain any redundancies? Could you please suggest a more efficient alternative, either iterative (using a for loop) or based on the dplyr package?


Here is a short extract of the dataset I am working on.

structure(list(run_set_1 = c(3, 3, 3, 3, 3, 3), run_set_2 = c(1, 
1, 1, 1, 1, 1), run_set_3 = c(2, 2, 2, 2, 2, 2)), row.names = c(NA, 
-6L), class = c("tbl_df", "tbl", "data.frame"))

CodePudding user response:

You could first match() each column to get the index in vector that the column values correspond to, if any. Then tabulate() those to get the counts, including 0s:

lapply(data, match, vector) |>
  sapply(tabulate, length(vector))
#>      run_set_1 run_set_2 run_set_3
#> [1,]         0         6         0
#> [2,]         0         0         6
#> [3,]         6         0         0
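
To see why this works, it can help to run the two steps on a single column first. The sketch below rebuilds the sample data as a plain data frame so that it runs on its own; the names idx and counts are only there for illustration and are not part of the answer above.

```r
# Sample data rebuilt from the question so the snippet is self-contained
data <- data.frame(run_set_1 = rep(3, 6),
                   run_set_2 = rep(1, 6),
                   run_set_3 = rep(2, 6))
vector <- c(1, 2, 3)

# Step 1: position of each value of one column within `vector`
# (NA for values that do not appear in `vector`)
idx <- match(data$run_set_1, vector)

# Step 2: count how often each position 1..length(vector) occurs;
# positions that never occur get 0, and NA entries are ignored
tabulate(idx, nbins = length(vector))

# Both steps applied to every column at once
counts <- sapply(data, function(col) {
  tabulate(match(col, vector), nbins = length(vector))
})
counts
```

Passing nbins explicitly is what keeps the zero counts: without it, tabulate() would only return as many bins as the largest index it sees.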

CodePudding user response:

Here is a tidyverse version. It can probably be made even shorter, but I haven't worked out how yet.

library(dplyr)
library(tidyr)

data %>% pivot_longer(cols = everything()) %>%
    group_by(name, value) %>% count() %>% ungroup() %>%
    pivot_wider(names_from = name, values_from = n, values_fill = 0 ) %>%
    arrange(value) %>% select(-value)
# The last line only drops the value column so the result matches your example

# # A tibble: 3 × 3
#   run_set_1 run_set_2 run_set_3
#       <int>     <int>     <int>
# 1         0         6         0
# 2         0         0         6
# 3         6         0         0
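
One possible way to shorten the pipeline, offered as a sketch rather than a definitive version: count() accepts the grouping columns directly, so the group_by() / count() / ungroup() chain can be collapsed into a single call. The sample data is rebuilt here so the snippet runs on its own, and the value column is kept so each row can be traced back to the value it counts.

```r
library(dplyr)
library(tidyr)

# Sample data rebuilt from the question so the snippet is self-contained
data <- tibble(run_set_1 = rep(3, 6),
               run_set_2 = rep(1, 6),
               run_set_3 = rep(2, 6))

counts <- data %>%
  pivot_longer(cols = everything()) %>%
  count(name, value) %>%   # same result as group_by() + count() + ungroup()
  pivot_wider(names_from = name, values_from = n, values_fill = 0) %>%
  arrange(value)

counts
```

Note that, like the original pipeline, this only produces rows for values that occur in at least one column; a value that never appears anywhere gets no row at all.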