Hi everyone, I have this problem: for each row, I want to subtract the starting value from the last observed value and divide the result by the number of cells that contain a value. For example:
Day1 day2 day3 day4 day5
3 2 1 1 1
3 4 NA NA NA
5 6 7 NA NA
7 8 9 10 12
The expected values are: first row (1-3)/5, second row (4-3)/2, third row (7-5)/3, fourth row (12-7)/5.
I then want to save all the values in a new column.
CodePudding user response:
1) dplyr. Define a stat function and then apply it by row.
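library(dplyr)
# (last value - first value) / number of values
stat <- function(x) (tail(x, 1) - head(x, 1)) / length(x)
DF %>%
  rowwise() %>%
  # c_across() collects the row's values; na.omit() drops the NAs
  mutate(stat = stat(na.omit(c_across(everything())))) %>%
  ungroup()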
giving:
# A tibble: 4 x 6
Day1 day2 day3 day4 day5 stat
<int> <int> <int> <int> <int> <dbl>
1 3 2 1 1 1 -0.4
2 3 4 NA NA NA 0.5
3 5 6 7 NA NA 0.667
4 7 8 9 10 12 1
2) Base R. Alternatively, using only base R and stat from above:
cbind(DF, stat = apply(DF, 1, function(x) stat(na.omit(x))))
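For a copy-paste check of the base R version, here is a minimal self-contained sketch. It assumes DF is built from the question's table with read.table (the answer does not show how DF was created) and repeats the stat definition so it runs on its own.

```r
# Assumption: DF is read from the table in the question
DF <- read.table(header = TRUE, text = "
Day1 day2 day3 day4 day5
3 2 1 1 1
3 4 NA NA NA
5 6 7 NA NA
7 8 9 10 12")

# (last value - first value) / number of values
stat <- function(x) (tail(x, 1) - head(x, 1)) / length(x)

# Drop the NAs in each row, then apply stat to what is left
res <- cbind(DF, stat = apply(DF, 1, function(x) stat(na.omit(x))))
res
```

The stat column should come out as -0.4, 0.5, 0.667 and 1, the same values as in the tibble above.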
CodePudding user response:
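One way to do it is to find the index of the last non-NA value in each row. Dividing by that index matches the number of observed values as long as the NAs only appear at the end of a row, as in the example.
Using apply: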
dtf = read.table(header = TRUE,
text = ' Day1 day2 day3 day4 day5
3 2 1 1 1
3 4 NA NA NA
5 6 7 NA NA
7 8 9 10 12')
dtf$ratio = apply(dtf, 1, function(x) {
  ind_last = max(which(!is.na(x)))  # index of the last non-NA value
  (x[ind_last] - x[1]) / ind_last
})
This gives:
Day1 day2 day3 day4 day5 ratio
1 3 2 1 1 1 -0.4000000
2 3 4 NA NA NA 0.5000000
3 5 6 7 NA NA 0.6666667
4 7 8 9 10 12 1.0000000
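If a row could have a gap in the middle, the last non-NA index and the number of observed values would no longer agree. The sketch below illustrates this with a hypothetical row that is not part of the original data; n_obs, by_index and by_count are names introduced here only for illustration.

```r
# Hypothetical row with a gap in the middle (not in the original data)
x <- c(Day1 = 3, day2 = NA, day3 = 5, day4 = NA, day5 = NA)

ind_last <- max(which(!is.na(x)))  # position of the last observed value
n_obs    <- sum(!is.na(x))         # number of observed values

# Same numerator, different denominators
by_index <- (x[[ind_last]] - x[[1]]) / ind_last  # divides by the position
by_count <- (x[[ind_last]] - x[[1]]) / n_obs     # divides by the count
```

Here by_index is 2/3 while by_count is 1. If "number of cells with values present" is what is wanted for such rows, dividing by sum(!is.na(x)) is probably the safer choice.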