df <- structure(list(id = c(1L, 1L, 2L, 3L, 3L, 4L), hire_year = c(2017L,
2017L, 2017L, 2017L, 2016L, 2016L)), class = "data.frame", row.names = c(NA,
-6L))
id hire_year
1 1 2017
2 1 2017
3 2 2017
4 3 2017
5 3 2016
6 4 2016
**Expected output**
id hire_year dummy
1 1 2017 0
2 1 2017 0
3 2 2017 1
4 3 2017 0
5 3 2016 0
6 4 2016 1
How can I create a dummy variable that equals 1 if an id appears only once, and 0 otherwise?
CodePudding user response:
With tidyverse, we can group by the id, then use the number of observations within an ifelse statement.
library(tidyverse)
df %>%
group_by(id) %>%
mutate(dummy = ifelse(n() == 1, 1, 0))
Or we could add the number of observations, then change the value based on the condition.
df %>%
add_count(id, name = "dummy") %>%
mutate(dummy = ifelse(dummy == 1, 1, 0))
Output
id hire_year dummy
1 1 2017 0
2 1 2017 0
3 2 2017 1
4 3 2017 0
5 3 2016 0
6 4 2016 1
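One thing to keep in mind with the grouped version is that the result stays grouped by id, and ifelse with numeric 1 and 0 gives a double column. The following is only a sketch of a variant that returns an ungrouped result with an integer dummy; the name result is just a placeholder chosen for this example.

```r
library(dplyr)

# Same data as in the question
df <- data.frame(
  id = c(1L, 1L, 2L, 3L, 3L, 4L),
  hire_year = c(2017L, 2017L, 2017L, 2017L, 2016L, 2016L)
)

result <- df %>%
  group_by(id) %>%
  # n() is the group size; the comparison is TRUE only for ids seen once
  mutate(dummy = as.integer(n() == 1)) %>%
  # drop the grouping so later steps operate on the whole table
  ungroup()

result
```

Dropping the grouping matters mostly when further summarise or mutate calls follow, since those would otherwise still run per id.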
CodePudding user response:
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
structure(list(id = c(1L, 1L, 2L, 3L, 3L, 4L), hire_year = c(2017L,
2017L, 2017L, 2017L, 2016L, 2016L)), class = "data.frame", row.names = c(NA,
-6L)
) %>%
add_count(id, name = 'dummy') %>%
mutate(
dummy = as.integer(dummy == 1)
)
#> id hire_year dummy
#> 1 1 2017 0
#> 2 1 2017 0
#> 3 2 2017 1
#> 4 3 2017 0
#> 5 3 2016 0
#> 6 4 2016 1
Created on 2022-03-04 by the reprex package (v2.0.0)
CodePudding user response:
We can use ave in base R like below (the leading + converts the logical result to 0/1)
> transform(df, dummy = +(ave(id, id, FUN = length) == 1))
id hire_year dummy
1 1 2017 0
2 1 2017 0
3 2 2017 1
4 3 2017 0
5 3 2016 0
6 4 2016 1
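As another base R possibility, not taken from the answers here, an id is unique exactly when it is not flagged by duplicated() scanning from either direction. A minimal sketch, assuming the same df as in the question (the helper name repeated is made up for illustration):

```r
# Same data as in the question
df <- data.frame(
  id = c(1L, 1L, 2L, 3L, 3L, 4L),
  hire_year = c(2017L, 2017L, 2017L, 2017L, 2016L, 2016L)
)

# duplicated() marks every occurrence after the first; scanning from the
# end as well marks every occurrence before the last. Together they flag
# all rows whose id occurs more than once.
repeated <- duplicated(df$id) | duplicated(df$id, fromLast = TRUE)

# The dummy is 1 where the id was never flagged, 0 otherwise
df$dummy <- as.integer(!repeated)

df
```

This avoids computing a per-group count at all, which can be convenient when only the unique/non-unique distinction is needed.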
CodePudding user response:
A data.table solution:
library(data.table)
DT <- structure(list(id = c(1L, 1L, 2L, 3L, 3L, 4L), hire_year = c(2017L,
2017L, 2017L, 2017L, 2016L, 2016L)), class = "data.frame", row.names = c(NA,
-6L))
# Convert into data.table
setDT(DT)
# Count number of times "id" shows up
DT[, count := .N, by = .(id)]
# Create a dummy variable that equals 1 if count == 1
DT[, dummy := fifelse(count == 1, 1, 0)]
# Print the result (:= updates by reference and does not print)
DT[]
id hire_year count dummy
<int> <int> <int> <num>
1: 1 2017 2 0
2: 1 2017 2 0
3: 2 2017 1 1
4: 3 2017 2 0
5: 3 2016 2 0
6: 4 2016 1 1
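If the intermediate count column is not needed, the two steps can probably be folded into one grouped assignment. This is a sketch rather than the answerer's code; it builds the table directly with data.table() instead of setDT() and stores the dummy as an integer.

```r
library(data.table)

# Same data as in the question, created directly as a data.table
DT <- data.table(
  id = c(1L, 1L, 2L, 3L, 3L, 4L),
  hire_year = c(2017L, 2017L, 2017L, 2017L, 2016L, 2016L)
)

# .N is the number of rows in the current group; the single value per
# group is recycled to every row of that group by :=
DT[, dummy := as.integer(.N == 1L), by = id]

DT[]
```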