I'm struggling with how to split my data frame into 2 or more parts. I have a lot of columns and rows, but consider a toy example:
test = data.frame(car = c("A", "A", "B", "C", "D", "E", "B", "C", "D"), value = c(5,4,3,5, 6, 6, 7 ,8 ,10))
# desired result
# car value group
#1 A 5 1
#2 A 4 1
#3 B 3 2
#4 C 5 1
#5 D 6 2
#6 E 6 2
#7 B 7 2
#8 C 8 1
#9 D 10 2
The only restriction I need is this: the same car can never be in different groups. A given car, for example car A, appears in several rows of my real data frame, and every time it occurs it must get the same group, for example group = 1. A single group may contain several different cars.
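To make the restriction concrete, it could be written as a check like the sketch below. The helper name one_group_per_car and its arguments are made up for illustration, and the expected data frame simply copies the car and group columns from the desired result above.

```r
# Hypothetical helper (not from any package): TRUE when every car
# maps to exactly one group.
one_group_per_car <- function(df, key = "car", group = "group") {
  # Count the distinct groups seen for each car
  groups_per_car <- tapply(df[[group]], df[[key]], function(g) length(unique(g)))
  all(groups_per_car == 1)
}

# Car and group columns taken from the desired result
expected <- data.frame(
  car   = c("A", "A", "B", "C", "D", "E", "B", "C", "D"),
  group = c(1, 1, 2, 1, 2, 2, 2, 1, 2)
)

one_group_per_car(expected)
```

Any valid answer should make this check return TRUE, whatever group numbers it picks.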
Any hints? I tried test %>% mutate(group = ntile(car, 4)) without success.
CodePudding user response:
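gr <- function(df, groups){
  # Give each distinct value of the first column an integer code, then wrap the codes into `groups` buckets
  g <- as.integer(factor(df[[1]])) %% groups
  # Renumber the buckets so that the labels start at 1
  df$groups <- as.integer(factor(g))
  df
}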
# (the outputs below show only the first five rows)
gr(test, 1)
car value groups
1 A 5 1
2 A 4 1
3 B 3 1
4 C 5 1
5 D 6 1
gr(test, 2)
car value groups
1 A 5 2
2 A 4 2
3 B 3 1
4 C 5 2
5 D 6 1
gr(test, 3)
car value groups
1 A 5 2
2 A 4 2
3 B 3 3
4 C 5 1
5 D 6 2
gr(test, 4)
car value groups
1 A 5 2
2 A 4 2
3 B 3 3
4 C 5 4
5 D 6 1
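Once every car has a group, the data frame itself can be cut into parts with base R's split(). This is only a sketch, assuming two groups and the same factor-code-modulo mapping as in gr() above, written inline so that it runs on its own. The name parts is just illustrative.

```r
test <- data.frame(
  car   = c("A", "A", "B", "C", "D", "E", "B", "C", "D"),
  value = c(5, 4, 3, 5, 6, 6, 7, 8, 10)
)

# Same mapping as gr(test, 2): integer code per car, modulo 2, relabelled from 1
test$groups <- as.integer(factor(as.integer(factor(test$car)) %% 2))

# split() returns a named list with one data frame per group
parts <- split(test, test$groups)

sapply(parts, nrow)
# 1 2
# 4 5

# No car appears in both parts
intersect(parts[["1"]]$car, parts[["2"]]$car)
# character(0)
```

Note that this balances the number of distinct cars per group, not the number of rows, so the parts can differ in size when some cars occur more often than others.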
CodePudding user response:
Using a dplyr approach:
library(dplyr)
test = data.frame(car = c("A", "A", "B", "C", "D", "E", "B", "C", "D"), value = c(5,4,3,5, 6, 6, 7 ,8 ,10))
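test %>%
  # match(car, car) returns the row of each car's first occurrence, so repeated cars get the same group
  mutate(group = 1 + match(car, car) %% 4)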
#> car value group
#> 1 A 5 2
#> 2 A 4 2
#> 3 B 3 4
#> 4 C 5 1
#> 5 D 6 2
#> 6 E 6 3
#> 7 B 7 4
#> 8 C 8 1
#> 9 D 10 2
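As far as I can tell, the original ntile(car, 4) attempt fails because ntile() buckets rows rather than distinct values, so ties are broken and two rows of the same car can land in different buckets. A possible workaround, sketched below, is to bucket one row per car and then join the result back. The names car_groups and result are illustrative, and 4 groups is an assumption carried over from the question.

```r
library(dplyr)

test <- data.frame(
  car   = c("A", "A", "B", "C", "D", "E", "B", "C", "D"),
  value = c(5, 4, 3, 5, 6, 6, 7, 8, 10)
)

# One row per car, so ntile() cannot split a car across buckets
car_groups <- test %>%
  distinct(car) %>%
  mutate(group = ntile(row_number(), 4))

# Copy each car's group back onto every row of the original data
result <- test %>%
  left_join(car_groups, by = "car")

# Check: n_groups should be 1 for every car
result %>%
  group_by(car) %>%
  summarise(n_groups = n_distinct(group))
```

Compared with the modulo version, this keeps neighbouring cars together in the same bucket instead of cycling through the groups, which may or may not matter for your use case.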