My question is best expressed with an example. Let's start with the data frame below:
> myData
Name Group Code
1 R 0 0
2 R 0 2
3 T 0 2
4 R 0 0
5 N 1 3
6 N 1 0
7 T 0 4
myData <-
data.frame(
Name = c("R","R","T","R","N","N","T"),
Group = c(0,0,0,0,1,1,0),
Code = c(0,2,2,0,3,0,4)
)
Now I'd like to add a column, CodeGrp: if a row's Group is > 0, assign the max Code for that Group to every member of the same Group; otherwise (Group = 0), copy the row's own Code. Note that only one member of a Group > 0 can have a Code > 0, and the remaining members of that Group have Code 0, so there may be something simpler than my proposed max. The results should look like this:
Name Group Code CodeGrp Explain
1 R 0 0 0 Copy over Code since Group = 0
2 R 0 2 2 Copy over Code since Group = 0
3 T 0 2 2 Copy over Code since Group = 0
4 R 0 0 0 Copy over Code since Group = 0
5 N 1 3 3 Group is > 0 so insert in CodeGrp column the max Code in this Group
6 N 1 0 3 Group is > 0 so insert in CodeGrp column the max Code in this Group
7 T 0 4 4 Copy over Code since Group = 0
Any recommendations for how to do this, in a simple manner, using base R or dplyr?
Answer:
Here is one approach using dplyr. Group the data by Group, then fill CodeGrp conditionally: where Group is 0, copy the row's own Code; otherwise, use the maximum Code within that group. Because max(Code) is evaluated per group inside mutate(), every member of a non-zero group receives the same value.
library(dplyr)  # provides group_by(), mutate(), if_else(), and %>%
group_by(myData, Group) %>%
  mutate(CodeGrp = if_else(Group == 0, Code, max(Code)))
Name Group Code CodeGrp
<chr> <dbl> <dbl> <dbl>
1 R 0 0 0
2 R 0 2 2
3 T 0 2 2
4 R 0 0 0
5 N 1 3 3
6 N 1 0 3
7 T 0 4 4
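Since the question also asks about base R, a rough equivalent might look like the sketch below. It is not part of the dplyr approach above. It assumes that ave() is an acceptable way to compute the per-group maximum, and the helper name groupMax is only illustrative.

```r
# Rebuild the example data from the question
myData <- data.frame(
  Name  = c("R", "R", "T", "R", "N", "N", "T"),
  Group = c(0, 0, 0, 0, 1, 1, 0),
  Code  = c(0, 2, 2, 0, 3, 0, 4)
)

# ave() returns, for every row, the max Code within that row's Group
# (groupMax is a hypothetical helper name used only for this sketch)
groupMax <- ave(myData$Code, myData$Group, FUN = max)

# Use the group maximum only where Group > 0; otherwise keep the row's own Code
myData$CodeGrp <- ifelse(myData$Group > 0, groupMax, myData$Code)

print(myData)
```

The group maximum is computed for Group 0 as well, but ifelse() discards it for those rows, so they keep their original Code.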