Category | Value |
---|---|
First | 100 |
Second | 150 |
First | N/A |
Second | 200 |
First | 75 |
Second | N/A |
I would like to count the number of times each value occurs when it has a value. To elaborate, above, both the 'First' and 'Second' column would be equal to 2. I have tried sum(df$Category == 'First', na.rm=TRUE) but am very lost. The csv file I have output is a dataframe I have made from a larger data set. I would like to calculate both First and Second or just one of them, they will be equal because it is 3 different categories that are weeks where the 'N/A' values are weeks where data has not been logged yet.
CodePudding user response:
Assuming "df" is your dataframe having columns as "category" and "value"
df %>%
filter(value !='N/A', !is.na(value)) %>% #filter your data
count(category)
To give you the following data.
category n
1 First 2
2 Second 2
CodePudding user response:
Here are a couple of base R options (as there are already good tidyverse
solutions in the comments):
table(df$Category[!is.na(as.numeric(df$Value))])
#First Second
# 2 2
Or we could use aggregate
:
aggregate(as.numeric(Value) ~ Category, df, NROW, na.action = na.omit)
# Category as.numeric(Value)
#1 First 2
#2 Second 2
Tidyverse
First, we can convert the N/A
values to NA
, then use count
on the non-NA values.
library(tidyverse)
df %>%
mutate(across(everything(), na_if, "N/A")) %>%
count(Category, wt = !is.na(Value))
# Category n
#1 First 2
#2 Second 2
Note: If you are bringing in your data with read.table
or read.csv
, then there is na.strings
option (i.e., na.strings='N/A'
), which would change the N/A
to NA
.
Data.table
library(data.table)
dt <- as.data.table(df)
dt[, sum(Value != "N/A"), by = Category]
Data
df <- structure(list(Category = c("First", "Second", "First", "Second",
"First", "Second"), Value = c("100", "150", "N/A", "200", "75",
"N/A")), class = "data.frame", row.names = c(NA, -6L))