p <- data.frame(x = c("A", "B", "C", "A", "B"),
                y = c("A", "B", "D", "A", "B"),
                z = c("B", "C", "B", "D", "E"))
p
d <- p %>%
  group_by(x) %>%
  summarize(occurance1 = count(x),
            occurance2 = count(y),
            occurance3 = count(z),
            total = occurance1 + occurance2 + occurance3)
d
Output:
# A tibble: 3 x 5
  x     occurance1 occurance2 occurance3 total
  <chr>      <int>      <int>      <int> <int>
1 A              2          2          1     5
2 B              2          2          1     5
3 C              1          1          1     3
I have a dataset similar to the one above, and I'm trying to get the counts of the different values in each column. The first column works perfectly, probably because the data is grouped by x, but I've run into problems with the other two columns. As you can see, "D" is not counted at all in y; its count shows up in the "C" row instead. Likewise, z has no "A" in it, yet the "A" row shows a count of 1. Help?
CodePudding user response:
count needs a data.frame/tibble as input, not a vector. To make this work, we may need to reshape to 'long' format with pivot_longer, apply count on the resulting columns, and then use adorn_totals to get the total column.
library(dplyr)
library(tidyr)
library(janitor)
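p %>%
  # stack all columns into name/value pairs
  pivot_longer(cols = everything()) %>%
  # tally each value within each original column
  count(name, value) %>%
  # one column per distinct value; absent combinations become 0
  pivot_wider(names_from = value, values_from = n, values_fill = 0) %>%
  janitor::adorn_totals('col')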
Output:
 name A B C D E Total
    x 2 2 1 0 0     5
    y 2 2 0 1 0     5
    z 0 2 1 1 1     5
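For comparison, the same per-column tally can be sketched in base R, since table() accepts a plain vector where dplyr's count() does not. This is only a minimal sketch under the assumption that p is the data frame from the question; the names lvls and counts are introduced here for illustration and are not part of the original answer.

```r
p <- data.frame(x = c("A", "B", "C", "A", "B"),
                y = c("A", "B", "D", "A", "B"),
                z = c("B", "C", "B", "D", "E"),
                stringsAsFactors = FALSE)

# All distinct values across every column, so that each column is
# tallied against the same set of levels (absent values count as 0).
lvls <- sort(unique(unlist(p)))

# table() works on a vector; sapply() collects one tally per column,
# and t() puts the original columns in rows.
counts <- t(sapply(p, function(col) table(factor(col, levels = lvls))))

# Append the row totals as a final column.
counts <- cbind(counts, Total = rowSums(counts))
counts
```

The result is a plain integer matrix rather than a tibble, which may or may not suit later steps.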
CodePudding user response:
In addition to akrun's solution, here is one without janitor, using select_if:
library(dplyr)
library(tidyr)
p %>%
  pivot_longer(
    cols = everything(),
    names_to = "name",
    values_to = "values"
  ) %>%
  count(name, values) %>%
  pivot_wider(names_from = values, values_from = n, values_fill = 0) %>%
  ungroup() %>%
  mutate(Total = rowSums(select_if(., is.integer), na.rm = TRUE))
Output:
  name      A     B     C     D     E Total
  <chr> <int> <int> <int> <int> <int> <dbl>
1 x         2     2     1     0     0     5
2 y         2     2     0     1     0     5
3 z         0     2     1     1     1     5
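As a side note, select_if is marked as superseded in recent dplyr releases. A possible variant of the same pipeline, assuming dplyr 1.0.0 or later is installed, uses across() with where() instead. The name res is introduced here only for illustration.

```r
library(dplyr)
library(tidyr)

p <- data.frame(x = c("A", "B", "C", "A", "B"),
                y = c("A", "B", "D", "A", "B"),
                z = c("B", "C", "B", "D", "E"))

res <- p %>%
  pivot_longer(cols = everything(),
               names_to = "name",
               values_to = "values") %>%
  count(name, values) %>%
  pivot_wider(names_from = values, values_from = n, values_fill = 0) %>%
  # across(where(is.integer)) selects the count columns, standing in
  # for select_if(., is.integer) in the pipeline above
  mutate(Total = rowSums(across(where(is.integer))))
res
```

The ungroup() call is left out in this sketch because count() on an ungrouped data frame should already return an ungrouped result.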