Binning a discrete variable (preferably in dplyr)

Time:09-24

I would like to "bin" a large discrete variable by combining every two consecutive rows into one bin. I would also like to label each bin with the value from its first row.

As an example:

x<-data.frame(x=c(1,2,3,4,5,6,7,8,9,10,11,12),
              y=c(1,1,3,3,5,5,7,7,9,9,11,11))
x

CodePudding user response:

Performing the steps exactly as you outlined them would look like this:

library(dplyr)

x %>%
  mutate(bins = rep(1:(n() / 2), each = 2)) %>%  # n() is the row count; length(x) on a data frame counts columns
  group_by(bins) %>%
  filter(row_number() == 1) %>%
  ungroup()

However, this gives you the same rows (without the bins column) in one line of base R:

x[seq(1, nrow(x), by = 2), ]
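
If the rows should be kept rather than dropped, the same position-based idea can label every row with the first value of its pair. This is only a sketch in base R: the data frame `df` and the column name `bin` are made up for illustration, and it assumes the rows are already in the order you want to pair them. Using `ceiling()` on the row position also copes with an odd number of rows, where the last bin holds a single row.

```r
# Hypothetical data with an odd number of rows
df <- data.frame(x = 1:7)

# Row positions 1,2 -> bin 1; 3,4 -> bin 2; and so on
bins <- ceiling(seq_len(nrow(df)) / 2)

# match(bins, bins) returns the position of the first row of each bin,
# so indexing x with it labels every row by its bin's first value
df$bin <- df$x[match(bins, bins)]

df
```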

CodePudding user response:

We may use gl() to create the grouping variable:

library(dplyr)
x %>%
  mutate(grp = as.integer(gl(n(), 2, n())))

    x  y grp
1   1  1   1
2   2  1   1
3   3  3   2
4   4  3   2
5   5  5   3
6   6  5   3
7   7  7   4
8   8  7   4
9   9  9   5
10 10  9   5
11 11 11   6
12 12 11   6
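
To go one step further and name each bin by its first row value, as the question asks, one option is to take the first `x` within each group. This is a minimal sketch rather than part of the original answer: the column name `bin` is hypothetical, and it assumes the rows are already sorted by `x`.

```r
library(dplyr)

x <- data.frame(x = 1:12)

res <- x %>%
  mutate(grp = as.integer(gl(n(), 2, n()))) %>%  # 1,1,2,2,...,6,6
  group_by(grp) %>%
  mutate(bin = first(x)) %>%                     # first x value within each pair
  ungroup()

res
```

Here `bin` should reproduce the `y` column from the example data in the question.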