I want to know whether an change of values for column type
after second row and number of changes in column type
after second row. For example in id ==1, type==1e,1e,2d,2h. then when we look after second row type changes from 1e to 2d then 2h. Therefore there is a change and the number of change is 2.
data<- data.frame(id= c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3,3 ,3,3,4,4,4, 5,5),
type=c("1e","1e","2d","2h","1c","1c","1e","2d","2h","2j","1e",
"1e","2e","1e","1e","2h","1c","1c","1c", "1j","1j"))
Desired Output:
id type change_of_type_after_2nd_row count
1 1e NA 2
1 1e NA 2
1 2d yes 2
1 2h yes 2
2 1c NA 4
2 1c NA 4
2 1e yes 4
2 2d yes 4
2 2h yes 4
2 2j yes 4
3 1e NA 3
3 1e NA 3
3 2e yes 3
3 1e yes 3
3 1e No 3
3 2h yes 3
4 1c NA 0
4 1c NA 0
4 1c No 0
5 1j NA 0
5 1j NA 0
Please help?
CodePudding user response:
Grouped by 'id', create a 'new' column with the run-length-id (rleid
) of 'type' and a sequence column ('rn'), then get the 'count' by taking the number of distinct elements of 'new' less than 1 ('count') and 'change_of_type..' with a condition created with case_when
based on the row number and duplicated
values in 'new'
library(dplyr)
library(data.table)
out <- data %>%
group_by(id) %>%
mutate(new = rleid(type), rn = row_number(),
count = n_distinct(new)-1,
change_of_type_after_2nd_row = case_when(rn >2 &
duplicated(new) ~ 'No', rn > 2 ~ 'Yes')) %>%
ungroup %>%
select(-new)
-output
as.data.frame(out)
id type rn count change_of_type_after_2nd_row
1 1 1e 1 2 <NA>
2 1 1e 2 2 <NA>
3 1 2d 3 2 Yes
4 1 2h 4 2 Yes
5 2 1c 1 4 <NA>
6 2 1c 2 4 <NA>
7 2 1e 3 4 Yes
8 2 2d 4 4 Yes
9 2 2h 5 4 Yes
10 2 2j 6 4 Yes
11 3 1e 1 3 <NA>
12 3 1e 2 3 <NA>
13 3 2e 3 3 Yes
14 3 1e 4 3 Yes
15 3 1e 5 3 No
16 3 2h 6 3 Yes
17 4 1c 1 0 <NA>
18 4 1c 2 0 <NA>
19 4 1c 3 0 No
20 5 1j 1 0 <NA>
21 5 1j 2 0 <NA>