How to take a range of numbers in a variable and create a new column with numerical inputs

Time:10-17

Hi, I'm sorry if this sounds confusing, but I need some help with R. I'll try to explain:

Let's say this is my dataset: [image of the dataset]

I want to take ranges from the prestige variable and put the values 1 to 6 in a new column called "level".

So:

  • 0-10 would be 1 in the level column
  • 11-20 would be 2 in the level column
  • 21-30 would be 3
  • 31-40 would be 4
  • 41-50 would be 5
  • 51-60 would be 6

So the new column "level" would contain numbers from 1 to 6.

CodePudding user response:

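You can do this with cut_number() from ggplot2, which is loaded by tidyverse. You can change the number of levels with n.

Note that cut_number() makes n groups with roughly equal numbers of observations, so the bins follow the data rather than the fixed ranges 0-10, 11-20 and so on. In the output below, 5 lands in level 2 and 30 lands in level 4.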

library(tidyverse)

tibble(
  prestige = c(0, 22, 5, 55, 30, 2, 44, 21, 3, 19, 60, 59, 29, 37, 46)
) %>% 
  mutate(
    level = cut_number(prestige, n = 6) %>% as.integer()
  )

output

# A tibble: 15 x 2
   prestige level
      <dbl> <int>
 1        0     1
 2       22     3
 3        5     2
 4       55     6
 5       30     4
 6        2     1
 7       44     5
 8       21     3
 9        3     1
10       19     2
11       60     6
12       59     6
13       29     3
14       37     4
15       46     5
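
To see where the bins in the output above come from, here is a minimal base-R sketch. It assumes cut_number() takes its breaks from the sample quantiles of the data, which is consistent with its documented behavior of making groups with roughly equal counts. The names probs, breaks and level are only for illustration.

```r
# Same values as in the tibble above
prestige <- c(0, 22, 5, 55, 30, 2, 44, 21, 3, 19, 60, 59, 29, 37, 46)

# Assumption: six equal-count groups correspond to breaks at the
# 0/6, 1/6, ..., 6/6 sample quantiles
probs <- (0:6) / 6
breaks <- quantile(prestige, probs = probs)

# Bin on those data-driven breaks and keep the integer bin codes
level <- as.integer(cut(prestige, breaks = breaks, include.lowest = TRUE))

data.frame(prestige, level)
```

On this data the breaks are not multiples of 10, so the levels differ from the fixed ranges asked for in the question.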

CodePudding user response:

Don't post pictures of your data, since nobody can copy them into R. Paste the output of dput() instead. Here is some sample data:

set.seed(42)
prestige <- sample.int(60, 15, replace=TRUE)
prestige
#  [1] 49 37  1 25 10 36 18 58 49 47 24  7 36 25 37

Now create breaks and use cut:

breaks <- seq(0, 60, 10)    
prestige.grp <- cut(prestige, breaks, include.lowest=TRUE)
table(prestige.grp)
#  [0,10] (10,20] (20,30] (30,40] (40,50] (50,60] 
#       3       1       3       4       3       1 

data.frame(prestige, prestige.grp)
#    prestige prestige.grp
# 1        49      (40,50]
# 2        37      (30,40]
# 3         1       [0,10]
# 4        25      (20,30]
# 5        10       [0,10]
# 6        36      (30,40]
# 7        18      (10,20]
# 8        58      (50,60]
# 9        49      (40,50]
# 10       47      (40,50]
# 11       24      (20,30]
# 12        7       [0,10]
# 13       36      (30,40]
# 14       25      (20,30]
# 15       37      (30,40]
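
The factor above holds interval labels rather than the numbers 1 to 6 the question asks for. A minimal sketch of the extra step, with the sample values typed in directly so it does not depend on the random number generator (the column name level is taken from the question):

```r
# The sample values printed above, entered by hand
prestige <- c(49, 37, 1, 25, 10, 36, 18, 58, 49, 47, 24, 7, 36, 25, 37)

breaks <- seq(0, 60, 10)
prestige.grp <- cut(prestige, breaks, include.lowest = TRUE)

# The integer codes of the factor are the bin numbers 1..6
level <- as.integer(prestige.grp)

# Alternative: labels = FALSE makes cut() return the codes directly
level.direct <- cut(prestige, breaks, include.lowest = TRUE, labels = FALSE)

data.frame(prestige, prestige.grp, level)
```

Both routes should give the same integer vector; labels = FALSE just skips the intermediate factor.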

CodePudding user response:

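Using dplyr's mutate() with cut() works nicely:

library(dplyr)

df <- data.frame(prestige = c(0, 22, 5, 55, 30, 2, 44, 21, 3, 19, 60, 59, 29, 37, 46, 49, 50, 51, 60))

# include.lowest = TRUE puts 0 in the first bin; right = TRUE closes each bin on the right
df %>% mutate(level = cut(prestige, breaks = c(0, 10, 20, 30, 40, 50, 60), labels = c("1", "2", "3", "4", "5", "6"), include.lowest = TRUE, right = TRUE))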

Output

   prestige level
1         0     1
2        22     3
3         5     1
4        55     6
5        30     3
6         2     1
7        44     5
8        21     3
9         3     1
10       19     2
11       60     6
12       59     6
13       29     3
14       37     4
15       46     5
16       49     5
17       50     5
18       51     6
19       60     6
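
Two edge cases of cut() are worth checking before relying on fixed breaks: values outside the breaks become NA, and without include.lowest = TRUE the value 0 is also NA, because the first bin is then (0,10]. A small sketch with made-up test values (x is not from the original data):

```r
breaks <- seq(0, 60, 10)

# Hypothetical test values, including two outside the 0-60 range
x <- c(-1, 0, 10, 11, 60, 61)

# labels = FALSE returns the integer bin codes instead of a factor
with_lowest <- cut(x, breaks, include.lowest = TRUE, labels = FALSE)
without_lowest <- cut(x, breaks, labels = FALSE)

data.frame(x, with_lowest, without_lowest)
```

If prestige can exceed 60 in the real data, the breaks need to be extended, or those rows will end up with a missing level.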

CodePudding user response:

Using the data from @cgvoller (the 19-row df above).

Divide each value by 10 and round up. pmax() ensures that a prestige of 0 gets level 1 rather than 0.

df$level <- pmax(ceiling(df$prestige/10), 1)
df

#   prestige level
#1         0     1
#2        22     3
#3         5     1
#4        55     6
#5        30     3
#6         2     1
#7        44     5
#8        21     3
#9         3     1
#10       19     2
#11       60     6
#12       59     6
#13       29     3
#14       37     4
#15       46     5
#16       49     5
#17       50     5
#18       51     6
#19       60     6
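
The arithmetic above can be wrapped in a small function and checked against cut(). This is only a sketch: prestige_level and its width argument are hypothetical names, not part of any package.

```r
# Hypothetical helper: bin width is a parameter, and 0 is forced into level 1
prestige_level <- function(x, width = 10) {
  pmax(ceiling(x / width), 1)
}

# Compare with cut() on every whole number from 0 to 60
x <- 0:60
by_arithmetic <- prestige_level(x)
by_cut <- cut(x, breaks = seq(0, 60, 10), include.lowest = TRUE, labels = FALSE)

# TRUE when both approaches assign the same level everywhere
all(by_arithmetic == by_cut)

# Unlike cut(), the arithmetic version has no upper cap,
# so values above 60 keep getting higher levels instead of NA
prestige_level(75)
```

The one-liner is convenient when all bins have the same width; cut() is the better fit when the ranges are uneven or out-of-range values should be flagged as NA.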