I am coming to R from Python (pandas).
In my R data frame I am trying to calculate a column as follows:
s1$change_max1[s1['change_abs'] >0 ]= s1['high']-s1['close'] - s1['change_abs']
s1$change_max1[s1['change_abs']< 0 ]= s1['low'] -s1['close'] - s1['change_abs']
s1$change_max1[s1['change_abs']==0 ]= s1['change_abs']
It does not work as expected, whereas similar code in pandas works as expected:
s.loc[s1['change_abs']>0 ,'change_max']= s1['high']-(s1['close'] - s1['change_abs'])
s.loc[s1['change_abs']<0 ,'change_max']= s1['low'] -(s1['close'] - s1['change_abs'])
s.loc[s1['change_abs']==0,'change_max']= s1['change_abs']
It looks to me as if, in the first snippet, I cannot use the entire data frame on the right-hand side while the left-hand side is only a subset.
What am I missing?
Thanks
CodePudding user response:
The reason is probably the length mismatch between the two sides of the assignment: the right-hand side expression has the full column length, whereas the left-hand side is only a subset. Subset the right-hand side with the same index:
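s1$change_max1 <- NA_real_
# sign() returns 1, -1 or 0; %in% treats NA as FALSE, so NA rows stay NA
i1 <- sign(s1$change_abs)
# formulas parenthesised as in the pandas version of the question
s1$change_max1[i1 %in% 1] <- with(s1, high - (close - change_abs))[i1 %in% 1]
s1$change_max1[i1 %in% -1] <- with(s1, low - (close - change_abs))[i1 %in% -1]
s1$change_max1[i1 %in% 0] <- 0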
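To make the length mismatch concrete, here is a minimal sketch on a toy data frame. The values are invented for illustration and the formula is the one from the pandas version in the question. It compares assigning an unsubsetted right-hand side with assigning one that uses the same index on both sides.

```r
# Toy data; the values are made up purely for illustration.
s1 <- data.frame(
  high       = c(12, 11, 10),
  low        = c( 8,  7,  9),
  close      = c(10,  9, 10),
  change_abs = c( 0, -2,  1)
)

pos <- s1$change_abs > 0                          # FALSE FALSE TRUE
rhs <- with(s1, high - (close - change_abs))      # full-length vector: 12 0 1

# Unsubsetted right-hand side: R fills the selected positions with values
# taken from the start of rhs (and warns about the length mismatch), so
# row 3 receives the value computed for row 1.
wrong <- rep(NA_real_, nrow(s1))
suppressWarnings(wrong[pos] <- rhs)

# Same index on both sides: row 3 receives the value computed for row 3.
right <- rep(NA_real_, nrow(s1))
right[pos] <- rhs[pos]

print(wrong)
print(right)
```

Unlike pandas, base R does not align the two sides by row label here; it only matches values by position, which is why the index has to be repeated on the right.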
Alternatively, use a nested ifelse or case_when:
library(dplyr)
s1 <- s1 %>%
  mutate(change_max1 = case_when(
    change_abs > 0 ~ high - (close - change_abs),
    change_abs < 0 ~ low - (close - change_abs),
    change_abs == 0 ~ 0))
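The nested ifelse variant mentioned above is not spelled out, so here is a minimal sketch of how it might look in base R. The toy data frame is hypothetical, and the formulas are the same as in the case_when version.

```r
# Toy data; the values are made up purely for illustration.
s1 <- data.frame(
  high       = c(12, 11, 10),
  low        = c( 8,  7,  9),
  close      = c(10,  9, 10),
  change_abs = c( 1, -2,  0)
)

# ifelse() works element by element on full-length vectors, so no
# subsetting of either side is needed.
s1$change_max1 <- with(s1,
  ifelse(change_abs > 0, high - (close - change_abs),
  ifelse(change_abs < 0, low  - (close - change_abs),
         0)))

print(s1)
```

Because every branch is computed on the whole column and then picked per row, the lengths always agree. Rows where change_abs is NA come out as NA.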
CodePudding user response:
I believe that the code below answers the question. Untested, since there is no data.
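# Logical index instead of which(): x[-i_zero] would select nothing if i_zero were empty
is_zero <- s1$change_abs %in% 0
s1$change_max1 <- NA_real_
s1$change_max1[is_zero] <- s1$change_abs[is_zero]
s1$change_max1[!is_zero] <- (abs(s1$high - s1$close) - s1$change_abs)[!is_zero]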