I have a data.table
dt
in R like this:
ukey time orderType a edge h1 h2
1: 10600877 93003000 Buy 14.12 100 0.002 0.0002
2: 10600188 93003000 Buy 24.29 100 0.003 0.0003
3: 10600995 93003000 Buy 17.23 100 0.003 0.0003
4: 10600262 93003000 Sell 18.89 100 0.002 0.0004
5: 10600399 93003000 Sell 18.66 100 0.004 0.0002
6: 10603260 93003000 Sell 110.39 100 0.003 0.0002
7: 11000004 93003000 Sell 22.98 100 0.002 0.0003
...
I want to get a new column called check
by some conditions. My sample code looks like this:
for (i in 1:nrow(dt)) {
if (dt[i, a - 1.5 * h1 > 111]) {
dt[, check:= "A"]
break
}
else if (dt[i, a edge - 1.3 * h1- 0.3 * h2> 111]) {
dt[, check:= "B"]
break
}
else if (dt[i, a- 1.5 * h1> 80]) {
dt[, check:= "C"]
break
}
else {dt[, check:= "D"]}
}
However, my code doesn't work well and it is inefficient. Could you give me some better solutions?Many thanks!
CodePudding user response:
You can try to use case_when
in dplyr
package.
dt[, check := case_when((a - 1.5 * h1) > 111 ~ "A",
(a edge - 1.3 * h1- 0.3 * h2)> 111 ~ "B",
(a- 1.5 * h1) > 80 ~ "C",
TRUE ~ "D")]
Result looks like :
> dt
ukey time orderType a edge h1 h2 check
1: 10600877 93003000 Buy 14.12 100 0.002 2e-04 B
2: 10600188 93003000 Buy 24.29 100 0.003 3e-04 B
3: 10600995 93003000 Buy 17.23 100 0.003 3e-04 B
4: 10600262 93003000 Sell 18.89 100 0.002 4e-04 B
5: 10600399 93003000 Sell 18.66 100 0.004 2e-04 B
6: 10603260 93003000 Sell 110.39 100 0.003 2e-04 B
7: 11000004 93003000 Sell 22.98 100 0.002 3e-04 B
Hope it is useful.
CodePudding user response:
What you are looking for is a faster ifelse
equivalent
library(data.table)
setDT(dt)
# initialise
dt[, check := "D"]
# faster ifelse
dt[a- 1.5 * h1> 80, check := "C"
][a edge - 1.3 * h1- 0.3 * h2> 111, check := "B"
][a - 1.5 * h1 > 111, check := "A"
]
Notice the latter subset conditions take precedence. Thus, orders are reversed i.e. C, B, A. Anything not captured by our conditions are left with initialised value i.e. D