R: make column with a sequence of letters for all values in another column greater than threshold

Time:04-25

I have a multi-column data frame in which one of the columns is called number. I also have a variable called threshold that holds a single numeric value.

df <- data.frame(number = c(1,2,3,5,1,2,3,7,3,5,7,3,6,7))
threshold <- 5

The dataframe looks like this:

   number
1       1
2       2
3       3
4       5
5       1
6       2
7       3
8       7
9       3
10      5
11      7
12      3
13      6
14      7

I want to create a new column called passed that contains NA in rows where number < threshold and sequential letters of the alphabet in rows where number >= threshold (sequential meaning that the first matching row from the top of the data frame gets the letter a). It should look like this:

   number passed
1       1   <NA>
2       2   <NA>
3       3   <NA>
4       5      a
5       1   <NA>
6       2   <NA>
7       3   <NA>
8       7      b
9       3   <NA>
10      5      c
11      7      d
12      3   <NA>
13      6      e
14      7      f

I would like to not use a loop here, if possible.

CodePudding user response:

If you want to keep it in base R, and you know that no more than 26 rows (the length of the alphabet) will need a letter, you could consider something like this:

df <- data.frame(number = c(1,2,3,5,1,2,3,7,3,5,7,3,6,7))
threshold <- 5
# Start with a character NA in every row
df$passed <- NA_character_
# Count the rows that meet the threshold
N <- sum(df$number >= threshold)
# Fill only those rows, top to bottom, with the first N lower-case letters
df$passed[df$number >= threshold] <- letters[seq_len(N)]

Because the rows are never reordered, the letters follow the original order of the rows, as in the desired output.
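The desired column can also be built in a single indexing step, without sorting or counting the matches first. The following is a minimal sketch rather than part of the answer above: the helper name label_passed is hypothetical, and it assumes the input has no missing values and at most 26 matches. It relies on the fact that indexing a character vector with NA returns NA.

```r
# Hypothetical helper (not from the original answer): label each value that
# meets the threshold with the next lower-case letter, and use NA elsewhere.
# Assumes x has no missing values and at most 26 values meet the threshold.
label_passed <- function(x, threshold) {
  hit <- x >= threshold
  # Running count of matches: 1 at the first hit, 2 at the second, and so on
  idx <- cumsum(hit)
  # Blank out positions that did not pass; letters[NA] yields NA
  idx[!hit] <- NA
  letters[idx]
}

df <- data.frame(number = c(1, 2, 3, 5, 1, 2, 3, 7, 3, 5, 7, 3, 6, 7))
threshold <- 5
df$passed <- label_passed(df$number, threshold)
print(df)
```

Since the function only reads the number column, any other columns in the data frame are left untouched.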

CodePudding user response:

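With data.table, the column can be added by reference, filling only the rows that meet the threshold (all other rows get NA):

library(data.table)
# .N is the number of rows selected by i, so seq_len(.N) picks the first .N letters
setDT(df)[number >= threshold, passed := letters[seq_len(.N)]]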
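Both answers index into letters, which has only 26 entries, so a 27th matching row would receive NA. If more matches are possible, one option is to continue with spreadsheet-style labels (a, ..., z, aa, ab, ...). The sketch below is only an assumption about what a sensible continuation might look like, since the question does not specify one; the function name seq_labels is hypothetical.

```r
# Hypothetical helper: return the first n labels of the sequence
# a, b, ..., z, aa, ab, ..., az, ba, ..., zz (supports n up to 26 + 26^2 = 702).
seq_labels <- function(n) {
  stopifnot(n >= 0, n <= 702)
  # All two-letter combinations in order: aa, ab, ..., az, ba, ..., zz
  two <- paste0(rep(letters, each = 26), rep(letters, times = 26))
  c(letters, two)[seq_len(n)]
}

# Usage: 28 values that all meet the threshold, so 28 labels are needed
x <- rep(5, 28)
hit <- x >= 5
passed <- rep(NA_character_, length(x))
passed[hit] <- seq_labels(sum(hit))
print(tail(passed, 3))
```

Building the full label table up front keeps the helper vectorised and loop-free, at the cost of a fixed upper limit on the number of labels.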