I would like to sample multiple columns from a 2D matrix using awk.
For instance,
awk -F " " '{print $900, $925, $950, $975, $1000}' [filename].txt > test.txt
I wrote only five columns in the command above as an example. In fact, there would be more than 40 columns. The column number increases in steps of 25, starting from $900.
Writing out every $(column number) would be painful.
How could I make the command simpler by using a for loop? Or is there any other suggestion?
Thank you for reading this question.
CodePudding user response:
I would harness GNU AWK for this task in the following way. Let the content of file.txt be
A B C D E F
AA BB CC DD EE FF
AAA BBB CCC DDD EEE FFF
and say I want to get the odd columns starting at 1, that is 1, 3, 5; then
awk 'BEGIN{pitch=2}{for(i=1;i<=NF;i+=pitch){printf "%s%s",$i,(i+pitch>NF?"\n":" ")}}' file.txt
gives output
A C E
AA CC EE
AAA CCC EEE
Explanation: I use a for loop that starts at 1, increments by pitch (2 in this example), and runs while i is less than or equal to the number of fields (NF). In each turn of the loop I call printf once: the first element is simply the value of the i-th column ($i); the second is a newline (\n) if that element is the last one in the given line, or a space in all other cases. I use the number of the current column (i), pitch, and the number of columns (NF) to calculate whether this is the last value to be included in the current line, and then the so-called ternary operator, condition ? valueiftrue : valueiffalse, to select the fitting character.
(tested in gawk 4.2.1)
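To connect the loop above to the case in the question (start at column 900, step 25, 40 columns), here is a minimal sketch that passes the start, step, and count in with -v. The helper name pick_columns and the variable names start, step, and count are made up for illustration and are not part of the answer above. The demonstration uses the three-line sample with small numbers so it can be run as-is.

```shell
# pick_columns START STEP COUNT [FILE...]
# Print COUNT columns, beginning at column START and advancing STEP columns
# each time. Reads stdin when no FILE is given.
# (Hypothetical helper name, used here for illustration only.)
pick_columns() {
  start=$1 step=$2 count=$3
  shift 3
  awk -v start="$start" -v step="$step" -v count="$count" '{
    line = ""; sep = ""
    for (n = 0; n < count; n++) {
      col = start + n * step
      if (col > NF) break          # the row has no such column: stop early
      line = line sep $col
      sep = OFS                    # separator goes before every later value
    }
    print line
  }' "$@"
}

# Demonstration on the sample data: columns 2, 4 and 6.
printf '%s\n' 'A B C D E F' 'AA BB CC DD EE FF' 'AAA BBB CCC DDD EEE FFF' |
  pick_columns 2 2 3
```

For the question's data, the call would presumably look like pick_columns 900 25 40 [filename].txt > test.txt. Building the line first and printing it once avoids having to work out which value is the last one on the line.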
CodePudding user response:
jot -s ' ' -w 'Col-%d' 2000 | mawk '{ print '"$( jot -s ', ' -w '$%d' 40 900 - 25 )"'"" }'
Col-900 Col-925 Col-950 Col-975 Col-1000 Col-1025 Col-1050 Col-1075 Col-1100 Col-1125 Col-1150 Col-1175 Col-1200 Col-1225 Col-1250 Col-1275 Col-1300 Col-1325 Col-1350 Col-1375 Col-1400 Col-1425 Col-1450 Col-1475 Col-1500 Col-1525 Col-1550 Col-1575 Col-1600 Col-1625 Col-1650 Col-1675 Col-1700 Col-1725 Col-1750 Col-1775 Col-1800 Col-1825 Col-1850 Col-1875
The trick is to use jot (or seq, or something similar) to dynamically generate code with hard-coded column numbers. For the example above, this code is generated on the fly:
mawk '{
print $900, $925, $950, $975, $1000, $1025, $1050, $1075,
$1100, $1125, $1150, $1175, $1200, $1225, $1250, $1275,
$1300, $1325, $1350, $1375, $1400, $1425, $1450, $1475,
$1500, $1525, $1550, $1575, $1600, $1625, $1650, $1675,
$1700, $1725, $1750, $1775, $1800, $1825, $1850, $1875, "" }'
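Since seq is mentioned as an alternative to jot, the same code-generation idea can be sketched with it. This is only an illustration under assumptions: the variable name fields is made up, and small column numbers (2, 4, 6) are used so the example runs on a short input. Some seq implementations may also print the separator after the last item, so the sketch strips a trailing ", " if one is present.

```shell
# Generate the field list "$2, $4, $6" with seq (-f sets the per-item format,
# -s sets the separator), then splice it into the awk program text.
fields=$(seq -s ', ' -f '$%g' 2 2 6)

# Drop a trailing separator if this seq emitted one; a no-op otherwise.
fields=${fields%, }

# The shell expands $fields once, so awk sees: { print $2, $4, $6 }
printf '%s\n' 'A B C D E F' 'AA BB CC DD EE FF' | awk "{ print $fields }"
```

For the question's columns, the seq arguments would be 900 25 1875 instead of 2 2 6.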