Hi, sorry if this sounds confusing, but I need some help with R. I'll try to explain.
Say my dataset (posted as a picture) has a numeric variable called prestige.
I want to take ranges of the prestige variable and put the values 1-6 in a new column called "level".
So:
- 0-10 would be 1 in the level column
- 11-20 would be 2 in the level column
- 21-30 would be 3
- 31-40 would be 4
- 41-50 would be 5
- 51-60 would be 6
The new column "level" would then contain numbers from 1 to 6.
CodePudding user response:
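You can do this with cut_number(); change the number of levels with n.
Note that cut_number() makes groups with roughly equal numbers of observations rather than fixed ranges, so the levels follow the data (here 5 lands in level 2) and not the 0-10, 11-20, ... ranges from the question.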
library(tidyverse)

tibble(
  prestige = c(0, 22, 5, 55, 30, 2, 44, 21, 3, 19, 60, 59, 29, 37, 46)
) %>%
  mutate(
    level = cut_number(prestige, n = 6) %>% as.integer()
  )
output
# A tibble: 15 x 2
   prestige level
      <dbl> <int>
 1        0     1
 2       22     3
 3        5     2
 4       55     6
 5       30     4
 6        2     1
 7       44     5
 8       21     3
 9        3     1
10       19     2
11       60     6
12       59     6
13       29     3
14       37     4
15       46     5
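For context, here is a rough base-R sketch of what cut_number appears to be doing with this data, assuming it places its breaks at evenly spaced quantiles (the names qbreaks and level_by_count are only illustrative). It shows why the cut points follow the data rather than round numbers:

```r
prestige <- c(0, 22, 5, 55, 30, 2, 44, 21, 3, 19, 60, 59, 29, 37, 46)

# Seven evenly spaced quantiles give six bins with roughly equal counts
qbreaks <- quantile(prestige, probs = seq(0, 1, length.out = 7))
qbreaks

# Bin on those data-driven breaks and convert the factor to integer codes
level_by_count <- as.integer(cut(prestige, breaks = qbreaks, include.lowest = TRUE))
table(level_by_count)
```

Because the breaks come from the data, a different sample would move the boundaries between levels.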
CodePudding user response:
Never use pictures of your data. Always use dput(). Here is some sample data:
set.seed(42)
prestige <- sample.int(60, 15, replace=TRUE)
prestige
# [1] 49 37 1 25 10 36 18 58 49 47 24 7 36 25 37
Now create breaks and use cut:
breaks <- seq(0, 60, 10)
prestige.grp <- cut(prestige, breaks, include.lowest=TRUE)
table(prestige.grp)
#   [0,10]  (10,20]  (20,30]  (30,40]  (40,50]  (50,60]
#        3        1        3        4        3        1
data.frame(prestige, prestige.grp)
#    prestige prestige.grp
# 1        49      (40,50]
# 2        37      (30,40]
# 3         1       [0,10]
# 4        25      (20,30]
# 5        10       [0,10]
# 6        36      (30,40]
# 7        18      (10,20]
# 8        58      (50,60]
# 9        49      (40,50]
# 10       47      (40,50]
# 11       24      (20,30]
# 12        7       [0,10]
# 13       36      (30,40]
# 14       25      (20,30]
# 15       37      (30,40]
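If you need the numbers 1-6 rather than interval labels, one option is to ask cut for integer codes with labels = FALSE. This is a sketch that retypes the values printed above instead of relying on the seeded sample; the name level is just the column name from the question:

```r
prestige <- c(49, 37, 1, 25, 10, 36, 18, 58, 49, 47, 24, 7, 36, 25, 37)
breaks <- seq(0, 60, 10)

# labels = FALSE returns the bin number instead of a factor of interval labels
level <- cut(prestige, breaks, include.lowest = TRUE, labels = FALSE)
data.frame(prestige, level)
```

Calling as.integer() on the factor returned by cut gives the same codes.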
CodePudding user response:
Using dplyr with cut is quite nice:
library(dplyr)

df <- data.frame(prestige = c(0, 22, 5, 55, 30, 2, 44, 21, 3, 19, 60, 59, 29, 37, 46, 49, 50, 51, 60))
df %>%
  mutate(level = cut(prestige, breaks = c(0, 10, 20, 30, 40, 50, 60),
                     labels = c("1", "2", "3", "4", "5", "6"),
                     include.lowest = TRUE, right = TRUE))
Output
   prestige level
1         0     1
2        22     3
3         5     1
4        55     6
5        30     3
6         2     1
7        44     5
8        21     3
9         3     1
10       19     2
11       60     6
12       59     6
13       29     3
14       37     4
15       46     5
16       49     5
17       50     5
18       51     6
19       60     6
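One thing to watch with fixed breaks: values outside them come back as NA. A small sketch with made-up out-of-range values (-5 and 61 are only for illustration, they are not in the data above):

```r
x <- c(-5, 0, 60, 61)

# Values below 0 or above 60 fall outside every interval and become NA
lvl <- cut(x, breaks = c(0, 10, 20, 30, 40, 50, 60),
           labels = c("1", "2", "3", "4", "5", "6"),
           include.lowest = TRUE, right = TRUE)
lvl
is.na(lvl)
```

If that can happen in your data, widening the outer breaks (for example with -Inf and Inf) is one way to keep those rows in the first and last level.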
CodePudding user response:
Using the data from @cgvoller: divide each value by 10 and round it up. pmax is used so that the value 0 gives level 1.
df$level <- pmax(ceiling(df$prestige/10), 1)
df
#   prestige level
#1         0     1
#2        22     3
#3         5     1
#4        55     6
#5        30     3
#6         2     1
#7        44     5
#8        21     3
#9         3     1
#10       19     2
#11       60     6
#12       59     6
#13       29     3
#14       37     4
#15       46     5
#16       49     5
#17       50     5
#18       51     6
#19       60     6
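As a quick sanity check, the formula can be compared against cut for every whole number from 0 to 60. This is only a sketch and the names by_formula and by_cut are illustrative. Unlike cut, the formula does not stop at 6 for larger values, so pmin is shown as an optional cap:

```r
x <- 0:60

by_formula <- pmax(ceiling(x / 10), 1)
by_cut <- cut(x, breaks = seq(0, 60, 10), include.lowest = TRUE, labels = FALSE)

# TRUE if both approaches agree on every value in 0..60
all(by_formula == by_cut)

# Outside 0..60 the formula keeps counting (75 would give 8); pmin caps it at 6
pmin(pmax(ceiling(75 / 10), 1), 6)
```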