Make a column with values borrowed from another column and negative sign from another

Time:05-24

I would like to copy a column into a new column, replacing every '0' with '1E+308' and taking the '-' or '+' sign from another column. My data frame looks as follows:

Gene       log2FoldChange  pvalue
MAN_08191  2.237973215     6.78E-307
MAN_10213  -4.701066739    7.95E-306
MAN_02441  -5.084348578    0
MAN_02858  -4.200656711    5.33E-304
MAN_03363  -2.580385454    0
MAN_01699  3.003051841     4.72E-303
MAN_02586  5.200458256     0

I would like to get the following new column "Rank"

Gene       log2FoldChange  pvalue     Rank
MAN_08191  2.237973215     6.78E-307  6.78E-307
MAN_10213  -4.701066739    7.95E-306  -7.95E-306
MAN_02441  -5.084348578    0          -1E+308
MAN_02858  -4.200656711    5.33E-304  -5.33E-304
MAN_03363  -2.580385454    0          -1E+308
MAN_01699  3.003051841     4.72E-303  4.72E-303
MAN_02586  5.200458256     0          1E+308

CodePudding user response:

You can use sign()

library(dplyr)

dt %>% 
  mutate(rank = sign(log2FoldChange) * if_else(pvalue == 0, 1E+308, pvalue))

Output:


       Gene log2FoldChange    pvalue       rank
1 MAN_08191       2.237973 6.78e-307  6.78e-307
2 MAN_10213      -4.701067 7.95e-306 -7.95e-306
3 MAN_02441      -5.084349  0.00e+00 -1.00e+308
4 MAN_02858      -4.200657 5.33e-304 -5.33e-304
5 MAN_03363      -2.580385  0.00e+00 -1.00e+308
6 MAN_01699       3.003052 4.72e-303  4.72e-303
7 MAN_02586       5.200458  0.00e+00  1.00e+308

CodePudding user response:

If awk is an option

$ awk 'NR == 1 { $(NF+1) = "Rank" } NR > 1 { split($2, a, "[0-9]"); $(NF+1) = a[1] ($NF == 0 ? "1E+308" : $NF) } 1' input_file
Gene log2FoldChange pvalue Rank
MAN_08191 2.237973215 6.78E-307 6.78E-307
MAN_10213 -4.701066739 7.95E-306 -7.95E-306
MAN_02441 -5.084348578 0 -1E+308
MAN_02858 -4.200656711 5.33E-304 -5.33E-304
MAN_03363 -2.580385454 0 -1E+308
MAN_01699 3.003051841 4.72E-303 4.72E-303
MAN_02586 5.200458256 0 1E+308

CodePudding user response:

awk '
    NR==1 {$(NF+1) = "rank"}
    NR!=1 {
      $(NF+1)=($2 ~ /-/ ? "-" : "")($NF == 0 ? "1E+308" : $NF)
    }1' input_file|column -s ' ' -t

Gene       log2FoldChange  pvalue     rank
MAN_08191  2.237973215     6.78E-307  6.78E-307
MAN_10213  -4.701066739    7.95E-306  -7.95E-306
MAN_02441  -5.084348578    0          -1E+308
MAN_02858  -4.200656711    5.33E-304  -5.33E-304
MAN_03363  -2.580385454    0          -1E+308
MAN_01699  3.003051841     4.72E-303  4.72E-303
MAN_02586  5.200458256     0          1E+308
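
A variant of the awk approach, offered as a minimal sketch rather than a definitive solution: it assumes whitespace-separated input with log2FoldChange in column 2 and pvalue in column 3, and the function name add_rank is hypothetical. It decides the sign with a numeric comparison instead of matching "-" as text, so a fold change written in scientific notation (for example 1.5E-05) would not be mistaken for a negative number.

```shell
#!/bin/sh
# add_rank (hypothetical helper): reads rows on stdin and appends a Rank column.
# Assumes column 2 is log2FoldChange and column 3 is pvalue.
add_rank() {
  awk '
    NR == 1 { print $0, "Rank"; next }       # header row: just add the new name
    {
      sign = ($2 < 0) ? "-" : ""             # numeric test on log2FoldChange
      mag  = ($3 == 0) ? "1E+308" : $3       # zero p-values become 1E+308
      print $0, sign mag                     # append sign and magnitude together
    }'
}

# A few sample rows taken from the question
printf '%s\n' \
  'Gene log2FoldChange pvalue' \
  'MAN_08191 2.237973215 6.78E-307' \
  'MAN_02441 -5.084348578 0' \
  'MAN_02586 5.200458256 0' | add_rank
```

For a whole file, something like add_rank < input_file | column -t should give aligned output similar to the listing above.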