Is there a way to "CountIF" in R based on two conditions

Time:12-07

I know how to do this in Excel, but I am trying to translate it into R and create a new column. In R I have a data frame called CleanData. I want to see how many times the value in each row of column A shows up anywhere in column B. In Excel it would read like this:

=COUNTIF(B:B,A2)

The second portion would be a nested IF/AND statement. It would look like this in Excel:

=IF(AND(COUNTIF(B:B,A2)>0,C="Purple"),"Yes", "No") 

Anyone know where to start?

I have tried mutating and also this:

sum(CleanData$colA == CleanData$colB) 

but I am getting no values.


CodePudding user response:

I think this will capture your if/countif scenario:

library(dplyr)
CleanData %>%
  mutate(YesOrNo = case_when(
    Color != "Purple"               ~ "No",  # fails the C = "Purple" test
    is.na(LABEL1) | !nzchar(LABEL1) ~ "No",  # nothing to look up
    !LABEL1 %in% LABEL2             ~ "No",  # COUNTIF(...) would be 0
    TRUE                            ~ "Yes"
  ))
#    LABEL1   LABEL2  Color YesOrNo
# 1   HELLO     <NA> Purple     Yes
# 2    <NA> HELLO!!!   Blue      No
# 3 HELLO$$     <NA> Purple     Yes
# 4    <NA>    HELLO   Blue      No
# 5 HELLOOO     <NA> Purple     Yes
# 6    <NA>     <NA> Purple      No
# 7    <NA>  HELLOOO   Blue      No
# 8    <NA>  HELLO$$   Blue      No
# 9    <NA>    HELLO Yellow      No

Data

CleanData <- structure(list(LABEL1 = c("HELLO", NA, "HELLO$$", NA, "HELLOOO", NA, NA, NA, NA), LABEL2 = c(NA, "HELLO!!!", NA, "HELLO", NA, NA, "HELLOOO", "HELLO$$", "HELLO"), Color = c("Purple", "Blue", "Purple", "Blue", "Purple", "Purple", "Blue", "Blue", "Yellow")), class = "data.frame", row.names = c(NA, -9L))

or programmatically,

CleanData <- data.frame(LABEL1=c("HELLO",NA,"HELLO$$",NA,"HELLOOO",NA,NA,NA,NA), LABEL2=c(NA,"HELLO!!!",NA,"HELLO",NA,NA,"HELLOOO","HELLO$$","HELLO"),Color=c("Purple","Blue","Purple","Blue","Purple","Purple","Blue","Blue","Yellow"))
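For the first part of the question (the plain COUNTIF count), here is a possible base R sketch. Note that sum(CleanData$colA == CleanData$colB) compares the two columns row by row, so it only looks at matches on the same row, and a single NA makes the whole sum NA unless na.rm = TRUE is set. The column name CountInB is made up for illustration, and the sample data above is assumed.

```r
CleanData <- data.frame(
  LABEL1 = c("HELLO", NA, "HELLO$$", NA, "HELLOOO", NA, NA, NA, NA),
  LABEL2 = c(NA, "HELLO!!!", NA, "HELLO", NA, NA, "HELLOOO", "HELLO$$", "HELLO"),
  Color  = c("Purple", "Blue", "Purple", "Blue", "Purple", "Purple", "Blue", "Blue", "Yellow")
)

# Row-by-row comparison: every row has an NA on one side, so the sum is NA
rowwise <- sum(CleanData$LABEL1 == CleanData$LABEL2)

# COUNTIF(B:B, A2) equivalent: for each LABEL1 value, count its matches
# anywhere in LABEL2. An NA in LABEL1 gives a count of 0.
# CountInB is a hypothetical column name.
CleanData$CountInB <- vapply(
  CleanData$LABEL1,
  function(a) sum(CleanData$LABEL2 == a, na.rm = TRUE),
  integer(1),
  USE.NAMES = FALSE
)
```

CleanData$CountInB > 0 & CleanData$Color == "Purple" then gives the same condition as the AND(...) in the Excel formula.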

CodePudding user response:

You don't need any extra packages. Here is a solution with the base R function ifelse, which is frequently useful and worth learning. An example:

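set.seed(7*11*13)
# Toy data: a numeric column and a Yes/No column
DF <- data.frame(cond = rnorm(100), X = sample(c("Yes", "No"), 100, replace = TRUE))
# Count the rows where both conditions hold
with(DF, sum(ifelse((cond > 0) & (X == "Yes"), 1, 0)))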
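Applied to the data from the question, the same ifelse idea might look like the sketch below. It assumes the LABEL1/LABEL2/Color columns from the sample data in the other answer, and the helper name found is made up. Here %in% plays the role of COUNTIF(...)>0.

```r
CleanData <- data.frame(
  LABEL1 = c("HELLO", NA, "HELLO$$", NA, "HELLOOO", NA, NA, NA, NA),
  LABEL2 = c(NA, "HELLO!!!", NA, "HELLO", NA, NA, "HELLOOO", "HELLO$$", "HELLO"),
  Color  = c("Purple", "Blue", "Purple", "Blue", "Purple", "Purple", "Blue", "Blue", "Yellow")
)

# TRUE where the LABEL1 value occurs anywhere in LABEL2.
# The is.na() guard matters: NA %in% LABEL2 is TRUE when LABEL2 contains NA.
found <- !is.na(CleanData$LABEL1) & CleanData$LABEL1 %in% CleanData$LABEL2

# IF(AND(COUNTIF(B:B, A2) > 0, C = "Purple"), "Yes", "No")
CleanData$YesOrNo <- ifelse(found & CleanData$Color == "Purple", "Yes", "No")
```

Unlike ==, %in% never returns NA, so the result needs no na.rm handling.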