How to convert floating-point numbers to integers in Linux

I have a file that looks like this:

#[1]CHROM       [2]POS  [3]REF  [4]ALT  [5]GTEX-1117F_GTEX-1117F        [6]GTEX-111CU_GTEX-111CU        [7]GTEX-111FC_GTEX-111FC        [8]GTEX-111VG_GTEX-111VG        [9]GTEX-111YS_GTEX-111YS  [10]GTEX-ZZPU_GTEX-ZZPU

22      20012563        T       C       0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0
         0.0     0.0     0.0     0.0

I want to convert it to look like this:

#[1]CHROM       [2]POS  [3]REF  [4]ALT  [5]GTEX-1117F_GTEX-1117F        [6]GTEX-111CU_GTEX-111CU        [7]GTEX-111FC_GTEX-111FC        [8]GTEX-111VG_GTEX-111VG        [9]GTEX-111YS_GTEX-111YS  [10]GTEX-ZZPU_GTEX-ZZPU
    
22      20012563        T       C       0    0     0     0     0     0     0     0     0     0     0

I basically want to convert values such as 0.0, 1.0, or 2.0 to 0, 1, 2. I tried this command, but it doesn't give me the correct output:

cat dosage.txt | "%d\n" "$2" 2>/dev/null

Does anyone know how to do this using awk or sed? Thank you.

CodePudding user response：

how to convert floating number to integer in linux(...)using awk

You can use the int function of GNU AWK. Consider the following simple example. Let the content of file.csv be

name,x,y,z
A,1.0,2.1,3.5
B,4.7,5.9,7.0

then

awk 'BEGIN{FS=OFS=","}NR==1{print;next}{for(i=2;i<=NF;i+=1){$i=int($i)};print}' file.csv

gives output

name,x,y,z
A,1,2,3
B,4,5,7

Explanation: I tell GNU AWK that , is both the field separator (FS) and the output field separator (OFS). The first row is printed as-is, and next tells GNU AWK to move on to the next line, i.e. to do nothing else for that row. For every other line, a for loop applies int to each field from the 2nd to the last, and then the altered line is printed. Note that int truncates toward zero rather than rounding, which is why 5.9 becomes 5.

(tested in GNU Awk 5.0.1)
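
The same idea can be adapted to the file in the question. The following is only a sketch under some assumptions that the question does not confirm: header lines start with #, the first four columns (CHROM, POS, REF, ALT) should stay untouched, and tab-separated output is acceptable. The function name to_int_dosage is made up for illustration.

```shell
# Hypothetical helper: truncate the dosage columns (5 onward) to integers.
# Assumptions: header lines start with "#", columns 1-4 are left alone,
# input is whitespace-separated and output is written tab-separated.
to_int_dosage() {
    awk 'BEGIN { OFS = "\t" }
         /^#/ { print; next }    # header line: pass through unchanged
         {
             for (i = 5; i <= NF; i++) $i = int($i)   # 0.0 -> 0, 2.0 -> 2
             print               # record is rebuilt with OFS between fields
         }'
}

# Small demonstration with made-up sample columns S1 and S2
printf '#CHROM\tPOS\tREF\tALT\tS1\tS2\n22\t20012563\tT\tC\t0.0\t2.0\n' | to_int_dosage
```

On the real file this would be run as to_int_dosage < dosage.txt > dosage_int.txt (the output file name is arbitrary). Because int truncates, it is only a safe choice when the values really are whole numbers written with a trailing .0, as in the question.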

CodePudding user response：

Maybe this will help. This regex saves the integer part in a capture group and removes the rest. A regex can often be fooled by unexpected input, so make sure that you test it against all forms of your input data, as I did (partially) for this example.

# echo 1234.5  345  546.0 234. hi | sed 's/\([0-9]*\)\.[0-9]*/\1/gm'

outputs

1234  345  546 234 hi
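
For the file in the question, a slightly stricter variant of this substitution might be safer. This is a sketch, not a tested solution for the real data: it assumes header lines start with # and should be skipped, and it requires at least one digit on each side of the dot, so a lone "." or a trailing "234." is left alone. The function name strip_fraction is made up for illustration.

```shell
# Hypothetical helper: drop the fractional part of decimal numbers.
# Assumptions: lines starting with "#" are headers and must not be changed;
# a number only counts as decimal if digits appear on both sides of the dot.
strip_fraction() {
    sed -E '/^#/!s/([0-9]+)\.[0-9]+/\1/g'
}

# Small demonstration: the header keeps its "1.5", the data line does not
printf '#id\tv1.5\n22\tT\t0.0\t1.0\t2.0\n' | strip_fraction
```

Unlike the awk approach, this works on the text of the whole line, so it would also rewrite decimal numbers in the first four columns if any appeared there.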