I have a file that looks like this:
#[1]CHROM [2]POS [3]REF [4]ALT [5]GTEX-1117F_GTEX-1117F [6]GTEX-111CU_GTEX-111CU [7]GTEX-111FC_GTEX-111FC [8]GTEX-111VG_GTEX-111VG [9]GTEX-111YS_GTEX-111YS [10]GTEX-ZZPU_GTEX-ZZPU
22 20012563 T C 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0
I want to convert it to look like this:
#[1]CHROM [2]POS [3]REF [4]ALT [5]GTEX-1117F_GTEX-1117F [6]GTEX-111CU_GTEX-111CU [7]GTEX-111FC_GTEX-111FC [8]GTEX-111VG_GTEX-111VG [9]GTEX-111YS_GTEX-111YS [10]GTEX-ZZPU_GTEX-ZZPU
22 20012563 T C 0 0 0 0 0 0 0 0 0 0 0
I basically want to convert the values 0.0, 1.0, and 2.0 to 0, 1, and 2. I tried this command, but it doesn't give me the correct output:
cat dosage.txt | "%d\n" "$2" 2>/dev/null
Does anyone know how to do this using awk or sed? Thank you.
CodePudding user response:
how to convert floating number to integer in linux(...)using awk
You might use the int function of GNU AWK. Consider the following simple example. Let the content of file.csv be
name,x,y,z
A,1.0,2.1,3.5
B,4.7,5.9,7.0
then
awk 'BEGIN{FS=OFS=","}NR==1{print;next}{for(i=2;i<=NF;i+=1){$i=int($i)};print}' file.csv
gives output
name,x,y,z
A,1,2,3
B,4,5,7
Explanation: I inform GNU AWK that , is both the field separator (FS) and the output field separator (OFS). I print the first row as-is and instruct GNU AWK to go to the next line, i.e. do nothing else for that line. For all but the first line I use a for loop to apply int to the fields from the 2nd to the last; after that is done I print the altered line. Note that int truncates toward zero rather than rounding, so 3.5 becomes 3.
(tested in GNU Awk 5.0.1)
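For the file in the question, the same int approach might look like the sketch below. It assumes the fields are separated by whitespace (so FS is left at its default), that the first line is the header, and that the dosage values start at field 5. The sample header is shortened and the sample names S1 to S3 are hypothetical. Because awk rebuilds each modified line using OFS, tab-separated input would come out space-separated unless OFS is set to a tab as well.

```shell
# Hypothetical two-line sample in the question's layout (header shortened).
result=$(printf '%s\n' \
  '#[1]CHROM [2]POS [3]REF [4]ALT [5]S1 [6]S2 [7]S3' \
  '22 20012563 T C 0.0 1.0 2.0' |
  # Print the header unchanged; on data lines truncate fields 5..NF to integers.
  awk 'NR==1{print;next}{for(i=5;i<=NF;i+=1){$i=int($i)};print}')

printf '%s\n' "$result"
```

On a real file, the printf part would be dropped and the file name (dosage.txt in the question) passed to awk as its last argument.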
CodePudding user response:
Maybe this will help. This regex captures the integer part in a group (\1) and removes the decimal point and any digits after it. A regex can often be fooled by unexpected input, so make sure that you test it against all forms of your input data, as I did (partially) for this example.
echo 1234.5 345 546.0 234. hi | sed 's/\([0-9]*\)\.[0-9]*/\1/g'
outputs
1234 345 546 234 hi
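Since [0-9]* can match an empty string, the pattern above also deletes any standalone dot in the input. A stricter variant, sketched below for the question's data, only touches values that are whole numbers written with a trailing .0 and leaves the header line alone. It assumes space-separated fields and a single header line; the sample names S1 to S3 are hypothetical. A value such as 1.5 is left unchanged rather than truncated.

```shell
# Hypothetical two-line sample in the question's layout (header shortened).
result=$(printf '%s\n' \
  '#[1]CHROM [2]POS [3]REF [4]ALT [5]S1 [6]S2 [7]S3' \
  '22 20012563 T C 0.0 1.0 2.0' |
  # From line 2 on, drop ".0" (or ".00", ...) when followed by a space or end of line.
  sed -E '2,$ s/([0-9]+)\.0+( |$)/\1\2/g')

printf '%s\n' "$result"
```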