AWK match string using regex and combine with previous string

Time:07-06

I have been reviewing articles and posts on how to match and compare strings, but I am struggling to put the two together. Unfortunately, I do not have a working awk command to show, because I can't even get that far. Below is what I have been trying to work with; I found it in "comparing strings in consecutive lines with awk". My hope was that if I changed the match condition from the previous line to anything up to 32, I'd start to get some output I could work with. I also modified the NR condition to start on the 4th string, which would be the first subnet mask.

awk '$0<=32 && NR>3 {print (NR)/f} {f=$0} END {print NR,$0}'

My current input looks like this:

hostname1           hostname2           127.0.0.1             27              127.0.0.2              24              127.0.0.3             28              hostname3           127.0.0.4               27              127.0.0.5              24              127.0.0.6            28              127.0.0.7             27              127.0.0.8              24       127.0.0.9             28  

The output I am looking to have would be:

hostname1           hostname2           127.0.0.1/27              127.0.0.2/24              127.0.0.3/28              hostname3           127.0.0.4/27              127.0.0.5/24              127.0.0.6/28              127.0.0.7/27              127.0.0.8/24       127.0.0.9/28          

These are IP addresses and subnet masks. My thinking was to look for 16-32 using a regex, match the previous string (which would always be an IP address), and combine the two. Does anyone have any examples of this being done? I have to use variables, as the number of IP address and subnet mask combinations in the input varies.

CodePudding user response:

Using GNU or BSD sed for -E to enable EREs:

$ sed -E 's:(\.[0-9]+)\t\t([0-9]+):\1/\2:g' file
hostname1               hostname2               127.0.0.1/27            127.0.0.2/24            127.0.0.3/28           hostname3                127.0.0.4/27            127.0.0.5/24            127.0.0.6/28            127.0.0.7/27           127.0.0.8/24             127.0.0.9/28

CodePudding user response:

Using sed

$ sed 's#\(\<[[:digit:].]\+\)[^[:digit:]]*\([[:digit:]]\+\)#\1/\2#g' input_file
hostname1           hostname2           127.0.0.1/27              127.0.0.2/24              127.0.0.3/28              hostname3           127.0.0.4/27              127.0.0.5/24              127.0.0.6/28              127.0.0.7/27              127.0.0.8/24       127.0.0.9/28

\(\<[[:digit:].]\+\) - This is the first capture group, as it is enclosed within capturing parentheses. It retains one or more digits and periods. The word boundary \< anchors the match at the start of the number.

[^[:digit:]]* - This part is matched but not kept, as it is not within parentheses. It consumes everything up to the next occurrence of a digit.

\([[:digit:]]\+\) - Second capture group, which retains one or more digits.

\1/\2 - This is the replacement. As we captured two groups, they can be returned with the back references \1 and \2 respectively.

The default delimiter / for sed has been changed to # to avoid conflicting with your data which will also contain / after the replacement.
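As a hedged variation on the command explained above: \+ and \< are GNU extensions in basic regular expressions, so they may not work in every sed. The sketch below expresses the same idea with -E and a stricter dotted-quad pattern. The function name cidr_sed is made up for illustration, and it assumes every IP address is followed by a mask (an address directly followed by another address would be joined incorrectly).

```shell
# cidr_sed: hypothetical helper name, reads lines on stdin.
# Group 1 = dotted quad, group 2 = inner repeat (unused), group 3 = mask.
cidr_sed() {
    sed -E 's#([0-9]+(\.[0-9]+){3})[[:space:]]+([0-9]+)#\1/\3#g'
}

# Sample line in the shape of the question's input (assumed, shortened).
printf 'hostname1  127.0.0.1  27  127.0.0.2  24  hostname3\n' | cidr_sed
```

Because the whitespace between address and mask is matched with [[:space:]]+, this works whether the columns are separated by tabs or by runs of spaces; the spacing between other fields is left untouched.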

CodePudding user response:

With awk, it's a longer program. This one specifically uses GNU awk (gawk), for the join library that it loads with -i join.

gawk -i join '{
    n = 0
    delete result
    for (i=1; i<=NF; i++)
        if ($i ~ /^[0-9.]+$/ && $(i+1) ~ /^[0-9]+/)
            result[++n] = $i "/" $(++i)
        else
            result[++n] = $i
    print join(result, 1, n, "\t")
}' input.file

outputs

hostname1   hostname2   127.0.0.1/27    127.0.0.2/24    127.0.0.3/28    hostname3   127.0.0.4/27    127.0.0.5/24    127.0.0.6/28    127.0.0.7/27    127.0.0.8/24    127.0.0.9/28
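If gawk is not available, the same loop can be sketched in plain awk by building the output string directly instead of calling join. This is only an illustration under some assumptions: the function name join_cidr is made up, an address is taken to be four dot-separated numbers, and the check that the mask is at most 32 is an addition borrowed from the question's idea rather than part of the answer above.

```shell
# join_cidr: hypothetical helper name, reads lines on stdin,
# writes tab-separated fields with each address/mask pair merged.
join_cidr() {
    awk '{
        out = ""
        for (i = 1; i <= NF; i++) {
            field = $i
            # Dotted quad followed by an all-digit field no larger than 32:
            # glue the two together with "/".
            if ($i ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ && i < NF &&
                $(i + 1) ~ /^[0-9]+$/ && $(i + 1) + 0 <= 32) {
                field = $i "/" $(i + 1)
                i++    # the mask has been consumed, skip it
            }
            out = out (out == "" ? "" : "\t") field
        }
        print out
    }'
}

# Sample line in the shape of the question's input (assumed, shortened).
printf 'hostname1  hostname2  127.0.0.1  27  127.0.0.2  24  hostname3  127.0.0.4  28\n' | join_cidr
```

Since the number of fields is taken from NF on every line, it does not matter how many address and mask pairs each line contains.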