Suppose an individual has several entries (rows) in a data frame. For example:
rm(list = ls())
set.seed(1234)
n <- 3
individualID <- rep(1:3, each = n)   # three rows per individual
X <- runif(n * 3, 1, 4)
Y <- rep(runif(n, 1, 4), each = n)   # one Y value per individual, repeated
df1 <- round(data.frame(individualID, X, Y), 3)
df1
individualID X Y
1 1 3.512 1.656
2 1 1.859 1.656
3 1 1.800 1.656
4 2 1.560 3.432
5 2 1.697 3.432
6 2 1.950 3.432
7 3 1.908 2.577
8 3 1.477 2.577
9 3 1.120 2.577
I would like to modify only the first row for each individual so that df1$X equals df1$Y in that row, while the remaining rows stay unchanged.
I should end up with:
individualID X Y
1 1 1.656 1.656
2 1 1.859 1.656
3 1 1.800 1.656
4 2 3.432 3.432
5 2 1.697 3.432
6 2 1.950 3.432
7 3 2.577 2.577
8 3 1.477 2.577
9 3 1.120 2.577
CodePudding user response:
You may try
library(dplyr)
df1 %>%
group_by(individualID) %>%
mutate(n = 1:n()) %>%
mutate(X = ifelse(n == 1, Y, X)) %>%
select(-n)
individualID X Y
<int> <dbl> <dbl>
1 1 1.66 1.66
2 1 1.86 1.66
3 1 1.8 1.66
4 2 3.43 3.43
5 2 1.70 3.43
6 2 1.95 3.43
7 3 2.58 2.58
8 3 1.48 2.58
9 3 1.12 2.58
CodePudding user response:
You can use row_number() to number rows by group.
library(dplyr)
df1 %>%
group_by(individualID) %>%
mutate(X = ifelse(row_number() == 1, Y, X)) %>%
ungroup()
I get different values to your df1 for some reason, but the result is:
# A tibble: 9 × 3
individualID X Y
<dbl> <dbl> <dbl>
1 1 2.54 2.54
2 1 2.87 2.54
3 1 2.83 2.54
4 2 3.08 3.08
5 2 3.58 3.08
6 2 2.92 3.08
7 3 2.64 2.64
8 3 1.70 2.64
9 3 3.00 2.64
CodePudding user response:
Another approach would be:
library(dplyr)
df1 %>%
group_by(individualID) %>%
mutate(X = c(Y[1], X[-1])) %>%
ungroup()
# individualID X Y
# <int> <dbl> <dbl>
#1 1 1.66 1.66
#2 1 1.86 1.66
#3 1 1.8 1.66
#4 2 3.43 3.43
#5 2 1.70 3.43
#6 2 1.95 3.43
#7 3 2.58 2.58
#8 3 1.48 2.58
#9 3 1.12 2.58
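For comparison, the same replacement can be sketched in base R without dplyr. This is only an illustration: it assumes the first row of each individual is simply the first occurrence of its individualID in df1, the data frame is rebuilt from the values printed in the question, and the helper name is_first is made up for this example.

```r
# Example data copied from the df1 printed in the question
df1 <- data.frame(
  individualID = rep(1:3, each = 3),
  X = c(3.512, 1.859, 1.800, 1.560, 1.697, 1.950, 1.908, 1.477, 1.120),
  Y = rep(c(1.656, 3.432, 2.577), each = 3)
)

# is_first (hypothetical name): TRUE on the first occurrence of each
# individualID, FALSE on every later row of that individual
is_first <- !duplicated(df1$individualID)

# Overwrite X with Y on those first rows only; other rows are untouched
df1$X[is_first] <- df1$Y[is_first]
df1
```

Because duplicated() marks first occurrences, this does not require the rows of an individual to be adjacent, but it does depend on the current row order to decide which row counts as "first".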