Home > Enterprise >  Count number of filtered values and create new columns
Count number of filtered values and create new columns

Time:12-01

I have a data frame called 'data' that contains multiple columns:

Grade EMPID PayBand
A 12345 15001-20000
c 64859 30001-35000
A 61245 20001-25000
D 75134 45001-50000
D 78451 40001-45000
C 31645 30001-35000
A 62513 20001-25000
D 91843 25001-30000
D 91648 35001-40000

I need R code to create a data frame that counts the number of each Grade within each PayBand that looks like this. E.g:

PayBand A C D
15001-20000 1 0 0
20001-25000 2 0 0
25001-30000 0 0 1
30001-35000 0 2 0
35001-40000 0 0 1
40001-45000 0 0 1
45001-50000 0 0 1

I am unsure how to create the new dataframe and the new columns that are based off the first dataframe. Any help is much appreciated.

CodePudding user response:

With table:

table(df[c(3, 1)])

Or with pivot_wider:

tidyr::pivot_wider(df, -EMPID, 
                   names_from = "Grade", values_from = "Grade", 
                   values_fn = length, values_fill = 0)

#   PayBand         A     C     D
# 1 15001-20000     1     0     0
# 2 30001-35000     0     2     0
# 3 20001-25000     2     0     0
# 4 45001-50000     0     0     1
# 5 40001-45000     0     0     1
# 6 25001-30000     0     0     1
# 7 35001-40000     0     0     1

CodePudding user response:

Good simple answer from Maël. Here are some related alternatives, including some options for conversion of the result to a data frame. atm1984, please make it easier next time with a reproducible example.

> df <- data.frame(Grade = c('A', 'C', 'A', 'D'),
                   PayBand = c('15001-20000', '30001-35000', '20001-25000', '450
01-50000'))

> table(df$PayBand, df$Grade)

              A C D
  15001-20000 1 0 0
  20001-25000 1 0 0
  30001-35000 0 1 0
  45001-50000 0 0 1

> as.data.frame(table(df$PayBand, df$Grade))
          Var1 Var2 Freq
1  15001-20000    A    1
2  20001-25000    A    1
3  30001-35000    A    0
4  45001-50000    A    0
5  15001-20000    C    0
6  20001-25000    C    0
7  30001-35000    C    1
8  45001-50000    C    0
9  15001-20000    D    0
10 20001-25000    D    0
11 30001-35000    D    0
12 45001-50000    D    1

> as.data.frame.matrix(table(df$PayBand, df$Grade))
            A C D
15001-20000 1 0 0
20001-25000 1 0 0
30001-35000 0 1 0
45001-50000 0 0 1
  • Related