I am struggling with a median split in RStudio. I want to add a new column to my data frame that is a median split of an existing column, but I do not know how to do this. Any help would be appreciated. This is the code I have run so far:
medianpcr <- median(honourswork$PCR.x)
highmedian <- filter(honourswork, PCR.x > medianpcr)
lowmedian <- filter(honourswork, PCR.x <= medianpcr)
CodePudding user response:
Let's first create some data:
set.seed(123)
honourswork <- data.frame(PCR.x = rnorm(100))
In dplyr, you might do:
library(tidyverse)
# Keep rows strictly above the median
highmedian <- honourswork %>% filter(PCR.x > median(PCR.x))
# Keep rows at or below the median
lowmedian <- honourswork %>% filter(PCR.x <= median(PCR.x))
Equivalently in base R:
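# drop = FALSE keeps the result a data frame even when there is only one column
highmedian <- honourswork[honourswork$PCR.x > median(honourswork$PCR.x), , drop = FALSE]
lowmedian <- honourswork[honourswork$PCR.x <= median(honourswork$PCR.x), , drop = FALSE]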
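If what you want is a new column rather than two separate data frames, a minimal base R sketch could look like the following. The column name PCR_split and the labels "high" and "low" are my own placeholders, and I am assuming that values equal to the median should count as "low", to match the filters above:

```r
set.seed(123)
honourswork <- data.frame(PCR.x = rnorm(100))

# Compute the median once; na.rm = TRUE guards against missing values
medianpcr <- median(honourswork$PCR.x, na.rm = TRUE)

# New column: "high" for values above the median, "low" otherwise
# (PCR_split is a placeholder name, not something from your data)
honourswork$PCR_split <- ifelse(honourswork$PCR.x > medianpcr, "high", "low")

# Count the rows in each group
table(honourswork$PCR_split)
```

Any rows where PCR.x is NA would get NA in the new column, since ifelse() passes missing values through.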
CodePudding user response:
When you post a question on SO, it's always a good idea to include an example dataframe so that the answerer doesn't have to create one themselves.
On to your question: if I understand you correctly, you can use mutate() and case_when() from the dplyr package:
# Load the dplyr library
library(dplyr)
# Create an example dataframe
data <- data.frame(
  rowID = 1:20,
  value = runif(20, 0, 50)
)
# Use case_when to mutate a new column 'category' with values based on
# the 'value' column
data2 <- data %>%
  dplyr::mutate(
    category = dplyr::case_when(
      value > median(value) ~ "Highmedian",
      value < median(value) ~ "Lowmedian",
      value == median(value) ~ "Median"
    )
  )
See the dplyr documentation for more about case_when().
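If you would rather avoid the dplyr dependency, base R's cut() can produce a similar two-level split. This is only a sketch under my own assumptions: the seed is arbitrary, the labels are reused from the example above, and values exactly equal to the median fall into "Lowmedian" because cut() closes intervals on the right by default:

```r
set.seed(42)
data <- data.frame(
  rowID = 1:20,
  value = runif(20, 0, 50)
)

# Compute the median once so both intervals share the same cut point
med <- median(data$value)

# Two intervals: (-Inf, med] and (med, Inf)
data$category <- cut(
  data$value,
  breaks = c(-Inf, med, Inf),
  labels = c("Lowmedian", "Highmedian")
)

# Check how many rows landed in each group
table(data$category)
```

Unlike case_when(), this returns a factor rather than a character column, which can be convenient if you go on to use the split as a grouping variable in a model or plot.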
Hope this helps!