I'm looking for a way to return the counts for every integer in a range, even when an integer doesn't appear in the vector. I've normally used the table() function for tasks like this, but in this case it skips over integers that are not observed in the vector. I am looking for something that fills those instances with zeros. For example:
my_vector <- c(4, 0, 3, 3, 6, 2, 4, 0)
table(my_vector)
gives the following output:
0 2 3 4 6
2 1 2 2 1
I'm wondering if there is a simple way to get an output like this:
0 1 2 3 4 5 6
2 0 1 2 2 0 1
Looking at the documentation for table(), I tried adding the useNA="always" argument, but that just appends a count of the missing values in the vector to the output. I also tried adding a row.names argument with explicit labels (for example, including a "1" label and a "5" label in the case above), but it seems that argument is only used for as.data.frame(), which I reckon is built on the table() function.
CodePudding user response:
You can use tabulate(), assuming the vector doesn't contain any negative values. Because tabulate() counts the integers from 1 to nbins, we add one to both arguments so that counting begins at zero:
tabulate(my_vector + 1, nbins = max(my_vector) + 1)
[1] 2 0 1 2 2 0 1
With names:
setNames(tabulate(my_vector + 1, nbins = max(my_vector) + 1), 0:max(my_vector))
0 1 2 3 4 5 6
2 0 1 2 2 0 1
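To see why the shift by one is needed, here is a quick sketch reusing my_vector from the question. tabulate() only counts the positive integers 1 to nbins and silently drops everything else, so without the shift the zeros are lost. The names no_shift and shifted are just illustrative.

```r
my_vector <- c(4, 0, 3, 3, 6, 2, 4, 0)

# Without the shift: bins are 1..6, so the two zeros are dropped
no_shift <- tabulate(my_vector)
no_shift
# [1] 0 1 2 2 0 1

# With the shift: value v is counted in bin v + 1, so bin 1 holds the zeros
shifted <- tabulate(my_vector + 1, nbins = max(my_vector) + 1)
shifted
# [1] 2 0 1 2 2 0 1
```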
Or less efficiently with table(), by converting the vector to a factor and setting the levels to cover the range of the vector from zero:
table(factor(my_vector, levels = 0:max(my_vector)))
0 1 2 3 4 5 6
2 0 1 2 2 0 1
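If the range of interest is known in advance rather than taken from the data, the same factor approach can be wrapped in a small helper. This is only a sketch: count_range, lo and hi are hypothetical names, not part of base R, and values outside lo:hi are silently dropped because they become NA in the factor.

```r
# Hypothetical helper: count every integer from lo to hi, zeros included
count_range <- function(x, lo, hi) {
  # Values outside lo:hi turn into NA and are excluded by table() by default
  table(factor(x, levels = lo:hi))
}

my_vector <- c(4, 0, 3, 3, 6, 2, 4, 0)

# Ask for 0..8 even though the data only reaches 6
counts <- count_range(my_vector, 0, 8)
counts
```

Here 7 and 8 get a count of zero alongside the unobserved 1 and 5. If dropping out-of-range values silently is not acceptable, a check such as stopifnot(all(x >= lo & x <= hi)) could be added at the top of the helper.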
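And for completeness, a function that also works with negative values, tabulating from or to zero so that zero is always included:
tab <- function(x) {
  # Shift so that the smallest value (or zero, if x has no negatives) lands in bin 1
  offset <- abs(min(c(0, x))) + 1
  # Number of bins needed to cover the range of x extended to include zero
  rnge <- diff(range(c(0, x))) + 1
  # Count the shifted values, then label each bin with its original integer
  setNames(tabulate(x + offset, nbins = rnge), seq(rnge) - offset)
}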
tab(c(-4, -4, -3))
-4 -3 -2 -1  0
 2  1  0  0  0
tab(c(-2, 2))
-2 -1  0  1  2
 1  0  0  0  1
tab(c(4, 4, 3))
0 1 2 3 4
0 0 0 1 2