Background
Here's a dataset, d
:
d <- data.frame(ID = c("a","a","b","b"),
product_code = c("B78","X31","C12","C12"),
stringsAsFactors=FALSE)
It looks like this:
The Problem and desired output
I'm trying to make an indicator column multiple_products
that's marked 1
for ID
s which have more than one unique product_code
and 0
for those that don't. Here's what I'm looking for:
My attempts haven't worked yet, though.
What I've Tried
Here's my current code:
d <- d %>%
group_by(ID) %>%
mutate(multiple_products = if_else(length(unique(d$product_code)) > 1, 1, 0)) %>%
ungroup()
And this is the result:
Any thoughts?
CodePudding user response:
The d$
should be taken out as this will extract the whole column by removing the group attributes. Also, there is n_distinct
. In addition, there is no need for ifelse
or if_else
as logical values (TRUE/FALSE
) can be directly coerced to 1/0
as these are storage values by either using as.integer
or
library(dplyr)
d %>%
group_by(ID) %>%
mutate(multiple_products = (n_distinct(product_code) > 1)) %>%
ungroup()
-output
# A tibble: 4 x 3
ID product_code multiple_products
<chr> <chr> <int>
1 a B78 1
2 a X31 1
3 b C12 0
4 b C12 0
CodePudding user response:
solution with data.table
;
library(data.table)
setDT(d)
d[,multiple_products:=rleid(product_code),by=ID][
,multiple_products:=ifelse(max(multiple_products)>1,1,0),by=ID]
d
output;
ID product_code multiple_products
<chr> <chr> <int>
1 a B78 1
2 a X31 1
3 b C12 0
4 b C12 0
CodePudding user response:
A base R option using ave
transform(
d,
multiple_products = (ave(match(product_code, unique(product_code)), ID, FUN = var) > 0
)
)
gives
ID product_code multiple_products
1 a B78 1
2 a X31 1
3 b C12 0
4 b C12 0