How to replace newlines between brackets

I have a log file whose format is similar to the output of the following echo command:

$ echo -e "test {\nseq-cont {\n\t\t\t0,\n\t\t\t67,\n\t\t\t266\n\t\t\t},\n\t\tgrp-id 505\n\t}\n}\ntest{\n\ttest1{\n\t\tval\n\t}\n}\n"

The question is how to remove all whitespace between seq-cont { and the next }. Such a block may occur multiple times in the file.

I want the output to look like this, preferably produced with sed.

test{seq-cont{0,67,266},
                       grp-id 505
        }
}
test{
        test1{
                val
        }
}

Efforts by OP: Here is an attempt that somewhat worked, but not exactly the way I wanted:

sed ':a;N;/{/s/[[:space:]]\+//;/}/s/}/}/;ta;P;D' logfile

CodePudding user response：

It can be done using GNU awk with a custom RS regex that matches from an opening { through the next closing }:

awk -v RS='{[^}]+}' 'NR==1 {gsub(/[[:space:]]+/, "", RT)} {ORS=RT} 1' file

test {seq-cont{0,67,266},
                grp-id 505
        }
}
test{
        test1{
                val
        }
}

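Here:

NR==1 {gsub(/[[:space:]]+/, "", RT)}: For the first record, remove all whitespace (including line breaks) from RT, the text that matched RS.
{ORS=RT}: Set ORS to RT, so each record is printed followed by the (possibly edited) text that RS matched.

PS: Remove NR==1 if you want to do this for the entire file. Note that RS matches any { ... } span, not only one that follows seq-cont, so every such span is then collapsed.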

CodePudding user response：

With your shown samples, please try the following awk program, written and tested in GNU awk.

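awk -v RS= '
# An empty RS enables paragraph mode: records are separated by blank lines
match($0,/{\nseq-cont {\n[^}]*/){
  val=substr($0,RSTART,RLENGTH)      # matched text, up to but excluding the next }
  gsub(/[[:space:]]+/,"",val)        # strip every run of whitespace
  print substr($0,1,RSTART-1) val substr($0,RSTART+RLENGTH)
}
'  Input_file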

Explanation: Setting RS to the empty string puts awk in paragraph mode, so each block separated by blank lines (here, the whole sample) is read as one record. The match function then finds the text from the { before seq-cont up to, but not including, the next }. All whitespace, including newlines, is removed from the matched text. Finally, the record is printed as the text before the match, the edited match, and the text after it, which gives the output expected by the OP.
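The program above rewrites only the first match in each record and prints only records that match. Since the question says the block may occur several times, here is a rough sketch of the same match/substr/gsub idea applied in a loop. It is an illustration under assumptions, not the answerer's code: collapse_all_seq_cont and the two-block sample are made up for the example, it reads the whole input into memory, and it uses [ \t\n] instead of [[:space:]] so that it does not rely on GNU awk.

```shell
# Sketch: collapse whitespace in every "seq-cont { ... }" block (plain awk assumed).
# collapse_all_seq_cont is a hypothetical helper name, not part of the answers.
collapse_all_seq_cont() {
  awk '
    { buf = buf $0 "\n" }                 # read the whole input into one string
    END {
      # find each block together with the whitespace that precedes it
      while (match(buf, /[ \t\n]+seq-cont[ \t\n]*[{][^}]*[}]/)) {
        blk = substr(buf, RSTART, RLENGTH)
        gsub(/[ \t\n]+/, "", blk)         # drop spaces, tabs and newlines
        out = out substr(buf, 1, RSTART - 1) blk
        buf = substr(buf, RSTART + RLENGTH)
      }
      printf "%s%s", out, buf             # processed text plus the untouched rest
    }
  '
}

# Made-up sample with two seq-cont blocks
sample=$(printf 'test {\nseq-cont {\n\t0,\n\t67\n\t},\n\tgrp-id 505\n}\ntest2 {\nseq-cont {\n\t1,\n\t2\n\t}\n}')
printf '%s\n' "$sample" | collapse_all_seq_cont
```

On this sample, both blocks should come out as seq-cont{0,67} and seq-cont{1,2}, each joined to the line before it, while the space in grp-id 505 is left alone because it lies outside the matched span.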

CodePudding user response：

You can do that much more easily with perl:

perl -0777 -i -pe 's/\s+(seq-cont\s*\{[^}]*\})/$1=~s|\s+||gr/ge' logfilepath

The -0777 option tells perl to slurp the file into a single string, and -i saves the changes in place. The regex \s+(seq-cont\s*\{[^}]*\}) matches one or more whitespace characters and then captures into Group 1 ($1) the text seq-cont, zero or more whitespace characters, and the substring from the leftmost { to the next } ([^}]* matches zero or more characters other than }). The replacement, $1=~s|\s+||gr, is evaluated as code because of the e flag: it removes every run of one or more whitespace characters (\s+) from the Group 1 value and returns the result (the inner r flag). All occurrences are handled due to the outer g flag (next to e).

See the online demo:

#!/bin/bash
s=$(echo -e "test {\nseq-cont {\n\t\t\t0,\n\t\t\t67,\n\t\t\t266\n\t\t\t},\n\t\tgrp-id 505\n\t}\n}\ntest{\n\ttest1{\n\t\tval\n\t}\n}\n")
perl -0777 -pe 's/\s+(seq-cont\s*\{[^}]*\})/$1=~s|\s+||gr/ge' <<< "$s"

Output:

test {seq-cont{0,67,266},
        grp-id 505
    }
}
test{
    test1{
        val
    }
}
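
Since the question asked for sed if possible, the same idea can be sketched with GNU sed. This is only an illustration under assumptions: it relies on the GNU-specific -z option (which, for input without NUL bytes, puts the whole file into the pattern space) and on the \+ quantifier, and collapse_with_sed and the two-block sample are made up for the example. sed has no replacement callback like perl's e flag, so the sketch loops instead, removing at least one whitespace character per pass until none is left between seq-cont and the next }.

```shell
# Sketch with GNU sed; -z and \+ are GNU extensions (an assumption about the environment).
# collapse_with_sed is a hypothetical helper name, not part of the answers.
# 1st expression: drop the whitespace in front of every seq-cont.
# ':a' ... 'ta':  repeat the 2nd substitution while it still changes something;
#                 [^}]* cannot cross a }, so only whitespace located between
#                 seq-cont and the next } is removed.
collapse_with_sed() {
  sed -z \
    -e 's/[[:space:]]\+\(seq-cont\)/\1/g' \
    -e ':a' \
    -e 's/\(seq-cont[^}]*\)[[:space:]]\+/\1/' \
    -e 'ta'
}

# Made-up sample with two seq-cont blocks
sample=$(printf 'test {\nseq-cont {\n\t0,\n\t67\n\t},\n\tgrp-id 505\n}\ntest2 {\nseq-cont {\n\t1,\n\t2\n\t}\n}')
printf '%s\n' "$sample" | collapse_with_sed
```

The loop is simple rather than fast: each pass rescans the pattern space, so for very large files the perl or awk approaches are likely the better fit.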