Generate buckets based on a column data then create another column storing values assigned to corres

Time:03-22

I have a dataframe which includes the 2 columns below:

| Systolic blood pressure | Urea Nitrogen |
|-------------------------|---------------|
| 155.86667               | 50.000000     |
| 140.00000               | 20.33333      |
| 135.33333               | 33.857143     |
| 126.40000               | 15.285714     |
| ...                     | ...           |

I want to create 2 more columns called Sys_points and BUN_points based on the bucket criteria in the attached image. They should store the values from the image's Points column, which are not equally spaced. I have tried findInterval and cut, but I can't find a way to assign values that are not in a regular sequence to the buckets.

#findInterval
BUN_int <- seq(0,150,by=10)
data3$BUN <- findInterval(data3$`Urea Nitrogen`,BUN_int)

#cut
cut(data3$`Urea Nitrogen`,breaks = BUN_int, right=FALSE, dig.lab=c(0,2,4,6,8,9,11,13,15,17,19,21,23,25,27,28))

Is there any function that can help me with this?

[Image: table of bucket criteria with a Points column]

Answer:

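Here's how to do it using cut(). Note the use of -Inf and Inf as the outermost breaks to create the open-ended "< x" and ">= x" bins, and right = FALSE so that each bin is closed on the left, i.e. [a, b). In the Sys_points call the breaks start at 50 with no -Inf, so any value below 50 becomes NA.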

bun_data$Sys_points <- cut(
  bun_data$`Systolic blood pressure`,
  breaks = c(5:20 * 10, Inf),
  labels = c(28,26,24,23,21,19,17,15,13,11,9,8,6,4,2,0),
  right = FALSE
)
bun_data$BUN_points <- cut(
  bun_data$`Urea Nitrogen`,
  breaks = c(-Inf, 1:15 * 10, Inf),
  labels = c(0,2,4,6,8,9,11,13,15,17,19,21,23,25,27,28),
  right = FALSE
)

Result:

  Systolic blood pressure Urea Nitrogen Sys_points BUN_points
1                155.8667      50.00000          9          9
2                140.0000      20.33333         11          4
3                135.3333      33.85714         13          6
4                126.4000      15.28571         15          2
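
One caveat: cut() returns a factor, so the new columns hold labels rather than numbers, and arithmetic such as adding the two scores will not work directly. The sketch below shows two ways to get numeric points. It reuses the breaks and points from the code above, while the data frame `demo` and its column names `sbp` and `bun` are made up for illustration and only contain the four rows shown in the question.

```r
# Demo data: the four rows shown in the question (hypothetical column names).
demo <- data.frame(
  sbp = c(155.86667, 140.00000, 135.33333, 126.40000),
  bun = c(50.000000, 20.33333, 33.857143, 15.285714)
)

# Same breaks and points as in the cut() calls above.
sys_breaks <- c(5:20 * 10, Inf)
sys_points <- c(28, 26, 24, 23, 21, 19, 17, 15, 13, 11, 9, 8, 6, 4, 2, 0)
bun_breaks <- c(-Inf, 1:15 * 10, Inf)
bun_points <- c(0, 2, 4, 6, 8, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 28)

# Option 1: keep cut(), then convert the factor labels back to numbers.
# Going through as.character() matters: as.numeric() on a factor alone
# would return the internal level codes (1, 2, 3, ...), not the labels.
sys_factor <- cut(demo$sbp, breaks = sys_breaks,
                  labels = sys_points, right = FALSE)
demo$Sys_points <- as.numeric(as.character(sys_factor))

# Option 2: findInterval() returns the index of the bucket each value
# falls into (left-closed by default, like right = FALSE in cut()),
# and that index can be used to look up the points vector.
# This relies on -Inf being the first break, so the index is never 0.
bun_idx <- findInterval(demo$bun, bun_breaks)
demo$BUN_points <- bun_points[bun_idx]

# Both columns are now numeric and can be summed.
demo$total_points <- demo$Sys_points + demo$BUN_points
print(demo)
```

The findInterval() lookup is not used for the systolic column here because its breaks start at 50: a value below 50 would give index 0, and indexing with 0 silently drops that element, which would leave a vector of the wrong length.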