I want to extract a value using "awk substring", which should also count the spaces instead of treating them as a separator.
For example, below is the input, and I want to extract the "29611", including the space before it:
201903011232101029 2961104E3021 223 0 12113 5 15 8288 298233 0 45 0 39 4
I used this method, but it used space as a separator:
more abbas.dat | awk '{print substr($1,1,16),substr($1,17,25)}'
Expected output should be :
201903011232101029 2961
But it prints only
201903011232101029
My question is: how can we print using "substr" so that it counts spaces?
I know I can use this command to get the desired output, but it does not help with my objective:
more abbas.dat | awk '{print substr($1,1,16),substr($2,1,5)}'
CodePudding user response:
1st solution: With your shown samples, please try the following awk code, written and tested in GNU awk. It uses awk's match function to get the required output.
To print the 1st field, followed by a varying number of spaces, followed by 5 digits from the 2nd field, use the following:
awk 'match($0,/^[0-9]+[[:space:]]+[0-9]{5}/){print substr($0,RSTART,RLENGTH)}' Input_file
OR, to print the first 16 characters of the 1st field and 5 from the 2nd field, including the varying-length run of spaces between the 1st and 2nd fields (the 3-argument form of match is specific to GNU awk):
awk 'match($0,/^([0-9]{16})[^[:space:]]+([[:space:]]+)([0-9]{5})/,arr){print arr[1] arr[2] arr[3]}' Input_file
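If GNU awk is not available, the same idea can be sketched with the 2-argument match, which only sets RSTART and RLENGTH. This is a minimal sketch, not the answerer's code: the record below is a hypothetical, shortened variant of the sample with three spaces after the first field, and the regex assumes plain spaces as the separator.

```shell
# Hypothetical record: shortened sample line with three spaces after field 1.
line='201903011232101029   2961104E3021 223 0'

out=$(printf '%s\n' "$line" | awk '
  # match() sets RLENGTH to the length of the first field plus the spaces after it,
  # so RLENGTH + 5 also covers the first 5 characters of the second field.
  match($0, /^[^ ]+ +/) { print substr($0, 1, RLENGTH + 5) }
')

# Brackets make the preserved spaces visible.
printf '[%s]\n' "$out"
```

Because substr works on $0 here, the spaces are copied exactly as they appear in the record.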
2nd solution: Using GNU grep, please try the following, considering that the first 5 needed values of your 2nd column can be anything (e.g. digits, letters, etc.).
grep -oP '^\S+\s+.{5}' Input_file
OR, to match only 5 digits in the 2nd field, make a minor change to the above grep.
grep -oP '^\S+\s+\d{5}' Input_file
CodePudding user response:
I think the simplest way is to add the "-Fs" option to your command.
more abbas.dat | awk -Fs '{print substr($1,1,16),substr($1,17,25)}'
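As a quick check of why this works: -Fs sets the field separator to the letter "s". That letter does not occur in the sample data, so awk sees a single field and $1 holds the whole line, spaces included. The sketch below assumes a shortened copy of the record from the question; the variable names are only illustrative.

```shell
# Shortened copy of the record from the question (assumed single space after field 1).
line='201903011232101029 2961104E3021 223 0'

# With "s" as the separator and no "s" in the data, there is exactly one field.
nf=$(printf '%s\n' "$line" | awk -Fs '{print NF}')
first=$(printf '%s\n' "$line" | awk -Fs '{print $1}')

printf 'fields: %s\nfirst:  [%s]\n' "$nf" "$first"
```

Note that this depends on the data never containing an "s", and that the third argument of substr is a length, not an end position.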
CodePudding user response:
If there is always exactly one space, you can use the following command, which prints the first group plus the first 5 characters of the second group.
N.B. It's not clear in the question whether you want 4 or 5 characters but that can be adjusted easily.
more abbas.dat | awk '{print $1" "substr($2,1,5) }'
CodePudding user response:
$ awk '{print substr($0,1,24)}' file
201903011232101029 29611
If that's not all you need then edit your question to clarify your requirements.
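The difference between the two approaches can be seen side by side. This is a small sketch using the sample line from the question, assuming it contains a single space after the first field: substr on $1 can never reach the space, while substr on $0 counts every character of the record.

```shell
# Sample record from the question (assumed to have one space after the first field).
line='201903011232101029 2961104E3021 223 0 12113 5 15 8288 298233 0 45 0 39 4'

# $1 ends where the first field ends (18 characters), so the space is never included.
from_field=$(printf '%s\n' "$line" | awk '{print substr($1,1,24)}')

# $0 is the whole record, so positions 19 to 24 cover the space and "29611".
from_record=$(printf '%s\n' "$line" | awk '{print substr($0,1,24)}')

printf '[%s]\n[%s]\n' "$from_field" "$from_record"
```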