I have a data frame with mainly categorical variables. I want to count how often each combination of values occurs across three of these categorical columns. The data in the columns looks like this:
number_arms <- c("6","8","12")
arrangements <- c("single", "paired", "ornament")
approx_position <- c("top", "middle", "bottom")
rg2 <- data.frame(number_arms, arrangements, approx_position)
Another post suggested the following code for comparing two columns:
library(dplyr)
library(stringr)
rg2 %>%
  count(combination = str_c(pmin(number_arms, arrangements), ' - ',
                            pmax(number_arms, arrangements)), name = "count")
This is the result on my full data set:
combination count
12 - single 1
16 - single 1
4 - paired 3
4 - single 4
5 - paired 4
5 - single 2
6 - ornament 1
6 - paired 81
However, the code does not give me the wanted results if I add the third column, like this:
rg2 %>%
  count(combination = str_c(pmin(number_arms, arrangements, approx_position), ' - ',
                            pmax(number_arms, arrangements, approx_position)), name = "count")
The code still runs without error, but the results are wrong. Do I need different code to count the combinations of three variables?
CodePudding user response:
Tidyverse option (updated to remove group_by):
library(dplyr)
rg2 %>%
  count(number_arms, arrangements, approx_position)
Result:
number_arms arrangements approx_position n
<chr> <chr> <chr> <int>
1 12 ornament bottom 1
2 6 single top 1
3 8 paired middle 1
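As for why the pmin()/pmax() attempt goes wrong with three columns: pmin() and pmax() return only the element-wise smallest and largest of their arguments, so with three inputs the value in between is dropped from each row. A minimal base R sketch using the sample data from the question (the names lo, hi and combo are just for illustration):

```r
number_arms <- c("6", "8", "12")
arrangements <- c("single", "paired", "ornament")
approx_position <- c("top", "middle", "bottom")

# Element-wise smallest and largest string across the three vectors
lo <- pmin(number_arms, arrangements, approx_position)
hi <- pmax(number_arms, arrangements, approx_position)

# Each label is built from two values only, so one of the three
# values in every row never makes it into the combination
combo <- paste(lo, hi, sep = " - ")
print(combo)
```

For the first row this should give "6 - top", so "single" is lost. That is why counting the three columns directly, as above, is the safer route.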
CodePudding user response:
If you're looking for the count of each combination of the variables, excluding combinations that never occur (count of 0), you can do:
subset(data.frame(table(rg2)), Freq > 0)
number_arms arrangements approx_position Freq
1 12 ornament bottom 1
15 8 paired middle 1
26 6 single top 1
or combined:
subset(data.frame(table(rg2)), Freq > 0) |>
  tidyr::unite("combn", -Freq, sep = " - ")
combn Freq
1 12 - ornament - bottom 1
15 8 - paired - middle 1
26 6 - single - top 1
data
number_arms <- c("6","8","12")
arrangements <- c("single", "paired", "ornament")
approx_position <- c("top", "middle", "bottom")
rg2 <- data.frame(number_arms, arrangements, approx_position)
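Every row in the sample data is unique, so all counts here are 1. A small sketch with repeated rows shows the same table() approach producing counts above 1. The rg2_demo values are invented purely for illustration and are not taken from the real data:

```r
# Hypothetical data with repeated rows (values invented for illustration)
rg2_demo <- data.frame(
  number_arms     = c("6", "6", "8", "6"),
  arrangements    = c("paired", "paired", "single", "paired"),
  approx_position = c("top", "top", "middle", "bottom")
)

# Tabulate every possible combination, then keep only those that occur
counts <- subset(data.frame(table(rg2_demo)), Freq > 0)
print(counts)
```

Here the combination 6 / paired / top appears twice, so its Freq should be 2, while the other two combinations get 1.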
CodePudding user response:
You can try dplyr::count() with paste():
library(dplyr)
rg2 %>%
  count(combination = paste(number_arms, arrangements, approx_position, sep = " - "), name = "count")
# combination count
# 1 12 - ornament - bottom 1
# 2 6 - single - top 1
# 3 8 - paired - middle 1