I am trying to find the closest number to zero amongst negative numbers only. When all columns accident1:accident3 are positive, return NA. When all number are negative, return max of the negative numbers in accident1:accident3. When both positive and negative numbers are present, only return the largest number below zero (including zero).
Data:
df <- data.frame(id=1:4, accident1=c(-1,-1,3, NA), accident2=c(-5,100, 2, NA), accident3=c(-4,-3,1,NA))
> df
id accident1 accident2 accident3
1 1 -1 -5 -4
2 2 -1 100 -3
3 3 3 2 1
4 4 NA NA NA
Attempt:
df %>%
rowwise() %>%
mutate(magic=
case_when(
accident1 < 0 & accident2 < 0 & accident3 < 0 ~ as.numeric(pmax(accident1, accident2, accident3, na.rm=T)),
accident1 > 0 & accident2 > 0 & accident3 > 0 ~ NA_real_,
(accident1 >0 |accident2<0 |accident3<0) & (accident1 >0 |accident2>0 | accident3>0) ~
# need max for cell <0
as.numeric(pmax(accident1, accident2, accident3, na.rm=T)), TRUE~NA_real_))
Result:
id accident1 accident2 accident3 magic
<int> <dbl> <dbl> <dbl> <dbl>
1 1 -1 -5 -4 -1
2 2 -1 100 -3 100
3 3 3 2 1 NA
4 4 NA NA NA NA
Desired:
id accident1 accident2 accident3 magic
<int> <dbl> <dbl> <dbl> <dbl>
1 1 -1 -5 -4 -1
2 2 -1 100 -3 -1
3 3 3 2 1 NA
4 4 NA NA NA NA
CodePudding user response:
If you convert the data frame to long format you can do this a little bit more compactly. (It also generalizes to any number of accident types, missing accident types within id
, etc. ...)
sfun <- function(x) {
x <- na.omit(x)
## if x has no non-NA values, all(x>0) will be TRUE
if (all(x>0)) NA_real_ else max(x[x<=0])
}
(df
## convert to long format
%>% pivot_longer(-id)
%>% group_by(id)
## apply summary function to values within id
%>% summarise(magic=sfun(value))
## add original columns back in
%>% full_join(df, by = "id")
)
The only difference is that the magic
column is before the rest of the data, not after it (you could add a call to relocate()
if you like)
CodePudding user response:
Using sfun
function from @Ben Bolker's answer you can also do this keeping the data in wide format. Using rowwise
and c_across
.
library(dplyr)
sfun <- function(x) {
x <- na.omit(x)
if (all(x>0)) NA_real_ else max(x[x<=0])
}
df %>%
rowwise() %>%
mutate(magic = sfun(c_across(starts_with('accident')))) %>%
ungroup
# id accident1 accident2 accident3 magic
# <int> <dbl> <dbl> <dbl> <dbl>
#1 1 -1 -5 -4 -1
#2 2 -1 100 -3 -1
#3 3 3 2 1 NA
#4 4 NA NA NA NA