I have a function that currently works, but I think there may be a better way to do it without having to manipulate the data so much beforehand. It returns TRUE if a value in my column is greater than both of the two values before it and both of the two values after it, and FALSE otherwise.
y1 # a single vector column of values
for (i in 3:length(y1)) { # start at 3, because for 1 and 2 you can't go back two
  if (y1[i] > y1[i-1] && y1[i] > y1[i-2] && y1[i] > y1[i+1] && y1[i] > y1[i+2]) { # greater than the 2 before and the 2 after
    y2[i] <- 'TRUE' # save the result in the blank vector y2
  } else {
    y2[i] <- 'FALSE' # opposite here
  }
  print(y2[i])
}
This works okay, but as you can see I have to start my for loop at 3, because otherwise I get an error: the first two values can't compute [i-1] and [i-2], and the last two can't compute [i+1] and [i+2]. If I use for (i in 1:length(y1)) it does not work, and I also have to add two zeros to the end of the dataset to avoid an error and be able to compute the last TRUE/FALSE value.
Is there any way to clean up the actual function so that I don't have to manipulate the data beforehand? Essentially have the function give me a null just for the first two and last two values in my data?
CodePudding user response:
Another approach is using lag and lead from dplyr:
library(dplyr)
v2 <- (lag(v1, 2) < v1 & lag(v1, 1) < v1 & lead(v1, 1) < v1 & lead(v1, 2) < v1)
Output & data:
v1 <- c(1,2,3,2,1,1,1,1,1,2,1,1)
v2
[1] FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE
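Note that with lag and lead the edge positions can come out as either FALSE or NA depending on the data, because in R `FALSE & NA` is FALSE while `TRUE & NA` is NA. If the first two and last two positions should always be NA, as the question asks, a base R sketch along these lines might work. The function name `is_local_peak` and its `k` argument are made up for illustration; they are not part of dplyr or base R.

```r
# Hypothetical helper (not from any package): TRUE where x[i] is greater than
# its k neighbours on each side, NA where fewer than k neighbours exist.
is_local_peak <- function(x, k = 2) {
  n <- length(x)
  out <- rep(NA, n)                 # edge positions stay NA
  if (n < 2 * k + 1) return(out)    # too short to have any interior position
  idx <- (k + 1):(n - k)            # positions with k neighbours on both sides
  res <- rep(TRUE, length(idx))
  for (off in seq_len(k)) {
    # compare each interior value with the neighbour `off` steps away on each side
    res <- res & x[idx] > x[idx - off] & x[idx] > x[idx + off]
  }
  out[idx] <- res
  out
}

v1 <- c(1, 2, 3, 2, 1, 1, 1, 1, 1, 2, 1, 1)
peaks <- is_local_peak(v1)
print(peaks)
```

No padding of the data is needed, since only the interior positions are ever indexed.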
CodePudding user response:
My first quick tip would be to look into the lead and lag functions of dplyr.
See, for example, this tutorial, the dplyr documentation, or Hadley Wickham's R for Data Science.
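To get a feel for what these two functions do, here is a rough base R sketch of the same idea: shift the vector by n positions and pad the gap with NA. The helpers `shift_back` and `shift_fwd` are hypothetical stand-ins written only for illustration (and assume n is smaller than the vector length); in practice you would call dplyr's lag() and lead() directly.

```r
# Hypothetical base R stand-ins, for illustration only.
# shift_back(x, n) mimics lag(x, n); shift_fwd(x, n) mimics lead(x, n).
shift_back <- function(x, n = 1) c(rep(NA, n), head(x, -n))  # drop last n, pad front
shift_fwd  <- function(x, n = 1) c(tail(x, -n), rep(NA, n))  # drop first n, pad end

x <- c(10, 20, 30, 40)
lagged <- shift_back(x, 1)   # NA 10 20 30
led    <- shift_fwd(x, 2)    # 30 40 NA NA
print(lagged)
print(led)
```

Because the shifted vectors have the same length as the original, they can be compared element by element without a loop and without padding the data by hand.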
Hope this helps!