I am trying to make all my data frames in R share the same levels in a categorical column so that the barplots I make from them are comparable, with unused levels shown at a frequency of 0.
Currently I have several separate data frames: one global data frame and several broken down by region. Each one has a category column and a frequency column. The global data frame contains all the categories, but each regional data frame only has counts for the categories found in that region. For example:
Global DF
category | frequency |
---|---|
red | 2 |
orange | 4 |
yellow | 7 |
green | 1 |
blue | 4 |
purple | 4 |
Current West Region DF
category | frequency |
---|---|
orange | 2 |
blue | 1 |
purple | 3 |
Desired West Region DF
category | frequency |
---|---|
red | 0 |
orange | 2 |
yellow | 0 |
green | 0 |
blue | 1 |
purple | 3 |
This is all based on the original dataset which looks like:
Region | Category |
---|---|
West | orange |
West | orange |
West | blue |
West | purple |
West | purple |
West | purple |
North | red |
North | yellow |
... | ... |
I'm currently using ddply to create the regional DFs, but I can't figure out how to keep the categories with frequency = 0 in each one (as in the Desired West Region DF above). Thanks for any insight!
CodePudding user response:
You could convert Category to a factor and make the counts using dplyr's count, which has a .drop option allowing you to keep empty categories, for example:
library(dplyr)
df |>
  mutate(Category = as.factor(Category)) |>
  count(Region, Category, .drop = FALSE) |>
  filter(Region == "West")
Output:
# A tibble: 6 × 3
Region Category n
<chr> <fct> <int>
1 West blue 1
2 West green 0
3 West orange 2
4 West purple 3
5 West red 0
6 West yellow 0
Data:
library(readr)
df <- read_table("Region Category
West orange
West orange
West blue
West purple
West purple
West purple
North red
North yellow
North green")
CodePudding user response:
Using base R, table() counts every Region/Category combination, including those that never occur:
subset(as.data.frame(table(df)), Region == "West")
Region Category Freq
2 West blue 1
4 West green 0
6 West orange 2
8 West purple 3
10 West red 0
12 West yellow 0
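Both outputs above list the categories alphabetically, whereas the desired West Region DF in the question follows the order of the global data frame. A minimal base R sketch of one way to get that order is to set the factor levels explicitly before counting. The names global_levels, counts, regional and west are illustrative only, and the level order is assumed to be the one shown in the Global DF table.

```r
# Rebuild the sample data from the question (two character columns).
df <- data.frame(
  Region   = c(rep("West", 6), "North", "North", "North"),
  Category = c("orange", "orange", "blue", "purple", "purple", "purple",
               "red", "yellow", "green")
)

# Assumption: this is the category order used in the global data frame.
global_levels <- c("red", "orange", "yellow", "green", "blue", "purple")

# Explicit levels keep every category and fix the order they appear in.
df$Category <- factor(df$Category, levels = global_levels)

# table() counts every Region x Category combination, including zeros.
counts <- as.data.frame(table(df))

# One data frame per region, all sharing the same Category levels.
regional <- split(counts[c("Category", "Freq")], counts$Region)
west <- regional$West
```

Here west should hold one row per category in the global order, with Freq equal to 0 for categories absent from the West region, so something like barplot(west$Freq, names.arg = as.character(west$Category)) would draw the same bars in the same positions for every region.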