The variable I am trying to create is called "high".
It will take a value of "1" if totexp (which is already in the dataframe) is above the median of totexp.
It will take a value of "0" if totexp (which is already in the dataframe) is below the median of totexp.
Below is the code I have used to create the variable.
high = rep(0, nrow(df))
high[df$totexp > median(df$totexp)] = 1
How do I add it to the dataframe?
Thank you!
CodePudding user response:
You can use an ifelse call:
Data:
df <- data.frame(totexp = c(1,2,3,4,1,2,35,6))
Solution:
df$high <- ifelse(df$totexp > median(df$totexp, na.rm = TRUE),
                  1,
                  0)
Result:
df
  totexp high
1      1    0
2      2    0
3      3    1
4      4    1
5      1    0
6      2    0
7     35    1
8      6    1
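As a possible alternative to ifelse, the comparison itself returns TRUE/FALSE, and as.integer() turns that into 1/0. This is only a sketch using the same example data as above. It assumes rows exactly equal to the median should get 0, which the question does not specify.

```r
# Same example data as in this answer
df <- data.frame(totexp = c(1, 2, 3, 4, 1, 2, 35, 6))

# The comparison yields TRUE/FALSE; as.integer() maps that to 1/0.
# Rows where totexp is NA would get NA in high.
df$high <- as.integer(df$totexp > median(df$totexp, na.rm = TRUE))

df$high
# [1] 0 0 1 1 0 0 1 1
```

The median of this sample is 2.5, so the values 3, 4, 35 and 6 are flagged with 1.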
CodePudding user response:
Here is a reproducible solution:
# load library
library(tidyverse)
# create some data (set a seed so the random draws are the same on every run)
set.seed(123)
dat <- tibble(year = seq(2000, 2020, by = 1),
              totexp = rnorm(n = 21, mean = 1000, sd = 300))
# check the median
median(dat$totexp)
# add our new variable to the dataset
dat$high <- ifelse(dat$totexp > median(dat$totexp), 1, 0)
In R you can add a variable to a dataframe by assigning to a new column name with $ and the <- operator, e.g. df$high <- <your code here>.
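If the vector has already been built separately, as in the question, it can be attached afterwards. The sketch below assumes a small made-up data frame. The column names high2 and high3 are hypothetical and only there to show equivalent ways of assigning.

```r
# Made-up example data (not from the question)
df <- data.frame(totexp = c(1, 2, 3, 4, 1, 2, 35, 6))

# Build the vector on its own, referring to the column as df$totexp
high <- rep(0, nrow(df))
high[df$totexp > median(df$totexp)] <- 1

# Attach it as a new column; its length must equal nrow(df)
df$high <- high

# Equivalent ways to add the same vector (hypothetical column names)
df[["high2"]] <- high
df <- cbind(df, high3 = high)

names(df)
```

All three forms add the vector as a new column. df$high <- high is the shortest when the column name is known in advance.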