Rename factor levels according to unique scheme

Time:10-13

I am trying to recode the factor levels in my data frame as numbers. Each column always has exactly two factor levels, drawn from “AA”, “GG”, “CC”, and “TT”:

x1x x2x x3x x4x x5x x6x
1 TT GG AA GG AA GG
2 CC AA CC GG TT CC
3 CC GG CC GG AA GG
4 TT AA CC GG TT CC
5 CC AA CC TT AA GG
6 TT GG AA TT AA CC

I need to recode them as 2 and 0. This would be easy if, for example, “AA” always appeared together with “GG” and “CC” with “TT”: I could simply assign 2 to every “AA” and “CC” and 0 to the others. However, any level can be paired with any other, and I always need both a 2 and a 0 in each column. With that fixed mapping, a column containing “AA” and “CC” would get only 2s.

I could still do this one column at a time, but the original data frame is very large, so that is not practical. I guess I need a loop with an “if” statement somewhere, where

“if” a column has the “AA CC” combination, give each “AA” cell a 2 and each “CC” cell a 0 in that column; but “if” a column has the “CC TT” combination, give each “CC” cell a 2 and each “TT” cell a 0.

Then we would have the following result:

x1x x2x x3x x4x x5x x6x
1 0 0 2 2 2 0
2 2 2 0 2 0 2
3 2 0 0 2 2 0
4 0 2 0 2 0 2
5 2 2 0 0 2 0
6 0 0 2 0 2 2

The coding scheme was shown in an image that is not preserved here; any consistent scheme would work.

Sorry for the super specific question, I swear it makes sense in what I am trying to do. I am open to any method/package etc. Thank you for any help!

CodePudding user response:

The default ordering of factor levels is alphabetical, and luckily you always want the alphabetically first string to be 2, and the alphabetically last to be 0. Thus we can use:

result = input
result[] = lapply(result, factor, labels = c("2", "0"))

This keeps the columns as factors. If you want them to be numeric instead, modify it as follows (the `\(x)` shorthand and the `|>` pipe require R 4.1 or later):

result = input
result[] = lapply(result, \(x) 
  factor(x, labels = c("2", "0")) |> as.character() |> as.integer()
)
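
To try the approach end to end, here is a minimal sketch that rebuilds the sample data from the question and recodes every column in one pass. The helper name `recode_two_levels` is hypothetical (it is not part of the answer above). It indexes into `c(2, 0)` by the factor's integer codes rather than relabeling, and it assumes each column has exactly two levels, stopping with an error otherwise. It also assumes the default alphabetical level order, which can depend on the locale for strings other than these.

```r
# Sample data from the question, stored as factors.
input <- data.frame(
  x1x = c("TT", "CC", "CC", "TT", "CC", "TT"),
  x2x = c("GG", "AA", "GG", "AA", "AA", "GG"),
  x3x = c("AA", "CC", "CC", "CC", "CC", "AA"),
  x4x = c("GG", "GG", "GG", "GG", "TT", "TT"),
  x5x = c("AA", "TT", "AA", "TT", "AA", "AA"),
  x6x = c("GG", "CC", "GG", "CC", "GG", "CC"),
  stringsAsFactors = TRUE
)

# Hypothetical helper: the alphabetically first level becomes 2,
# the second level becomes 0.
recode_two_levels <- function(x) {
  f <- factor(x)              # drops unused levels, keeps sorted order
  stopifnot(nlevels(f) == 2)  # assumption: exactly two levels per column
  c(2L, 0L)[as.integer(f)]    # integer code 1 -> 2, code 2 -> 0
}

result <- input
result[] <- lapply(result, recode_two_levels)  # result[] keeps the data frame shape
print(result)
```

Because the helper checks the number of levels, a column that happens to contain only one level (or three) fails loudly instead of being silently mislabeled.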