This is the sample dataframe I'm working with:
numbers <- data.frame(value=c(-5000,1,2,100,200,300,400,500, 1000, 2000, 10000, 12000))
I'm looking to create a new column in this data frame called "output" that contains values as follows:
-Same value as in column "value" if value between 1 and 10000
-10000 if the value in column "value" is more than 10000 and
-1 if the value in column "value" is less than 1
Desired output in the new column "output": 1,1,2,100,200,300,400,500, 1000,2000, 10000, 10000.
I would really like to learn how to use a for loop with if, else if and else statements to get this output, and have tried the following:
for (i in 1:nrow (numbers$value)){
if (numbers$value[i] >10000){
numbers$output <- 10000)
} else if (numbers$value[i] < 1){
numbers$output <- 1)
} else {
numbers$output <- numbers$value)
}
}
Unfortunately, this gives me an error: Error: unexpected '}' in "}"
Appreciate your help in fixing this code!
CodePudding user response:
I see why you are trying to solve this problem with a for loop (I have been there...). In R, it is usually more idiomatic to operate on a whole vector at once, an idea often called vectorization. The *apply family gets you close to that: it applies a function to each element of an input vector, so you give it an input and automatically get back an output of the same length.
sapply(numbers$value, function(x){
if (x >10000) return(10000)
else if (x < 1) return(1)
else return(x)
}) -> numbers$output
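For this particular task, a fully vectorized alternative is also possible. The sketch below is not part of the answer above; it assumes base R's pmin() and pmax(), which take element-wise minima and maxima, and uses them to clamp every value to the range 1 to 10000 in a single expression.

```r
numbers <- data.frame(value = c(-5000, 1, 2, 100, 200, 300, 400, 500, 1000, 2000, 10000, 12000))

# pmax() raises anything below 1 up to 1;
# pmin() then caps anything above 10000 at 10000.
numbers$output <- pmin(pmax(numbers$value, 1), 10000)
numbers
```

Because both functions work on the whole column at once, no explicit loop or per-element function call is needed.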
CodePudding user response:
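There are several errors in the original code: the output column is never initialized, each assignment ends with an unmatched and unneeded ")", the assignments do not index with [i] (so every iteration would overwrite the whole column), and nrow() is called on a vector, where it returns NULL. See the corrected code below.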
numbers <- data.frame(value=c(-5000,1,2,100,200,300,400,500, 1000, 2000, 10000, 12000))
numbers$output<-NA
for (i in 1:nrow(numbers)){
if (numbers$value[i] >10000){
numbers$output[i] <- 10000
} else if (numbers$value[i] < 1){
numbers$output[i] <- 1
} else {
numbers$output[i] <- numbers$value[i]
}
}
numbers
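One of the loop-range problems is easy to check in isolation. The short sketch below uses a smaller, made-up data frame and illustrative variable names (rows_of_column, rows_of_frame, len_of_column) to show why nrow() has to be called on the data frame rather than on the column, and why seq_len() is often suggested as a safer loop range than 1:n.

```r
numbers <- data.frame(value = c(-5000, 1, 2, 100))

rows_of_column <- nrow(numbers$value)    # NULL: a column is a plain vector with no dim attribute
rows_of_frame  <- nrow(numbers)          # number of rows in the data frame
len_of_column  <- length(numbers$value)  # length() is the right tool for a vector

# seq_len(n) yields an empty sequence when n is 0,
# whereas 1:n would count down through c(1, 0).
for (i in seq_len(rows_of_frame)) {
  print(numbers$value[i])
}
```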
Here is a more straightforward solution using the case_when function from the dplyr package:
numbers <- data.frame(value=c(-5000,1,2,100,200,300,400,500, 1000, 2000, 10000, 12000))
library(dplyr)
numbers$output <-case_when(
numbers$value >10000 ~ 10000,
numbers$value < 1 ~ 1,
TRUE ~ numbers$value # default case
)
numbers
CodePudding user response:
In the spirit of if/else, you could also use the vectorized ifelse() function.
numbers$output <- ifelse(numbers$value > 10000, 10000, ifelse(numbers$value < 1, 1, numbers$value))