For statistical analysis, I would like to regroup some rows of a data frame based on their values.
What I have:
number | latitude |
---|---|
30 | 57 |
12 | 59 |
01 | 68 |
12 | 66 |
101 | 55 |
47 | 61 |
05 | 60 |
288 | 67 |
For example, I would like to regroup every latitude of 66 or above (66, 67, 68) into a single category 66, so the desired output would look like this:
number | latitude | new |
---|---|---|
30 | 57 | 57 |
12 | 59 | 59 |
01 | 68 | 66 |
12 | 66 | 66 |
101 | 55 | 55 |
47 | 61 | 61 |
05 | 60 | 60 |
288 | 67 | 66 |
I do not want to use a loop with if statements because I feel that it is not really R-friendly. I would also like to keep the initial column so that I can try different combinations later on.
Thank you very much.
CodePudding user response:
An option with mutate and ifelse:
library(dplyr)
df %>%
  mutate(new = ifelse(latitude >= 66, "66", latitude))
Output:
number latitude new
1 30 57 57
2 12 59 59
3 01 68 66
4 12 66 66
5 101 55 55
6 47 61 61
7 05 60 60
8 288 67 66
Data
df <- data.frame(number = c("30","12","01","12","101","47","05","288"),
latitude = c(57,59,68,66,55,61,60,67))
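As a quick check of what the ifelse call produces, here is a minimal base R sketch (no dplyr needed). It assumes the same df as above; the names new_chr and new_num are only illustrative. Mixing a string with the numeric latitude values gives a character result, while using the number 66 keeps the result numeric.

```r
df <- data.frame(number = c("30", "12", "01", "12", "101", "47", "05", "288"),
                 latitude = c(57, 59, 68, 66, 55, 61, 60, 67))

# String replacement: the whole result is coerced to character
new_chr <- ifelse(df$latitude >= 66, "66", df$latitude)
class(new_chr)  # "character"

# Numeric replacement: the result stays numeric
new_num <- ifelse(df$latitude >= 66, 66, df$latitude)
class(new_num)  # "numeric"
```

Which variant is preferable probably depends on whether the new column is meant as a label or as a value to compute with.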
CodePudding user response:
library(tidyverse)
tribble(~"number", ~"latitude",
30, 57,
12, 59,
01, 68,
12, 66,
101,55,
47, 61,
05, 60,
288,67) %>%
  dplyr::mutate(
    # if_else requires both branches to have the same type, hence as.character
    new = if_else(latitude >= 66,
                  "66",
                  as.character(latitude)))
CodePudding user response:
We can use logical indexing
df1$new <- df1$latitude
df1$new[df1$latitude >= 66] <- "66"
or ifelse
df1$new <- with(df1, ifelse(latitude >= 66, "66", latitude))
Output:
> df1
number latitude new
1 30 57 57
2 12 59 59
3 1 68 66
4 12 66 66
5 101 55 55
6 47 61 61
7 5 60 60
8 288 67 66
Also, as @Mael commented about the type of the 'new' column: if we want to preserve the numeric type, we can use pmin instead
library(dplyr)
df1 %>%
mutate(new = pmin(latitude, 66))
number latitude new
1 30 57 57
2 12 59 59
3 1 68 66
4 12 66 66
5 101 55 55
6 47 61 61
7 5 60 60
8 288 67 66
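Since the question mentions trying different combinations later, the pmin idea can be wrapped in a small helper. This is only a sketch: cap_at and the column names new_66 and new_60 are hypothetical, and df1 is rebuilt here with numeric columns so the example runs on its own.

```r
df1 <- data.frame(number = c(30, 12, 1, 12, 101, 47, 5, 288),
                  latitude = c(57, 59, 68, 66, 55, 61, 60, 67))

# Hypothetical helper: every value above `upper` is replaced by `upper`
cap_at <- function(x, upper) pmin(x, upper)

# One new column per cut-off; the original latitude column is left untouched
df1$new_66 <- cap_at(df1$latitude, 66)
df1$new_60 <- cap_at(df1$latitude, 60)
df1
```

Because pmin is vectorised, no explicit loop or if statement is needed, and each new column keeps the numeric type of latitude.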