How to replace newlines between brackets

Time:10-06

I have a log file in a format similar to this:

test {
seq-cont {
                        0,
                        67,
                        266
                        },
                grp-id 505
        }
}
test{
        test1{
                val
        }
}

Here is the echo command to produce that output

$ echo -e "test {\nseq-cont {\n\t\t\t0,\n\t\t\t67,\n\t\t\t266\n\t\t\t},\n\t\tgrp-id 505\n\t}\n}\ntest{\n\ttest1{\n\t\tval\n\t}\n}\n"

The question is how to remove all whitespace between seq-cont { and the next }. This pattern may occur multiple times in the file.

I want the output to be like this. Preferably use sed to produce the output.

test{seq-cont{0,67,266},
                       grp-id 505
        }
}
test{
        test1{
                val
        }
}

Efforts by OP: Here is an attempt that somewhat worked, but is not exactly what I wanted:

sed ':a;N;/{/s/[[:space:]]\+//;/}/s/}/}/;ta;P;D' logfile

CodePudding user response:

This can be done in GNU awk with a custom RS regex that matches from { through the closing }:

awk -v RS='{[^}]+}' 'NR==1 {gsub(/[[:space:]]+/, "", RT)} {ORS=RT} 1' file

test {seq-cont{0,67,266},
                grp-id 505
        }
}
test{
        test1{
                val
        }
}

Here:

  • NR==1 {gsub(/[[:space:]]+/, "", RT)}: For the first record, remove all whitespace (including line breaks) from RT.
  • {ORS=RT}: Set ORS to RT, the text that matched RS, so each record is printed followed by its (possibly edited) terminator.

PS: Remove NR==1 if you want to do this for every { ... } block in the file.

CodePudding user response:

With your shown samples, please try the following awk program. Written and tested in GNU awk.

awk -v RS= '
match($0,/{\nseq-cont {\n[^}]*/){
  val=substr($0,RSTART,RLENGTH)
  gsub(/[[:space:]]+/,"",val)
  print substr($0,1,RSTART-1) val substr($0,RSTART+RLENGTH)
}
'  Input_file

Explanation: Setting RS to the empty string puts awk in paragraph mode, so records are separated by blank lines and the whole sample is read as one record. The match function then finds the text from the { preceding seq-cont up to (but not including) the next }. All whitespace, including newlines, is removed from the matched value. Finally, the record is printed with the edited value in place of the original, which gives the output the OP asked for.
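The match/substr/gsub steps described above can be tried on a one-line string first. This is only a minimal sketch: the sample text and the variable names sample and result are made up for illustration, and [ \t\n]+ is used instead of [[:space:]]+ on the assumption that some older awk builds lack POSIX character classes.

```shell
# Hypothetical sample text, not taken from the log file above.
sample='abc{ x y }def'

result=$(printf '%s\n' "$sample" | awk '
match($0, /\{[^}]*\}/) {
  val = substr($0, RSTART, RLENGTH)   # the matched text "{ x y }"
  gsub(/[ \t\n]+/, "", val)           # strip whitespace inside the match only
  # text before the match, edited match, text after the match
  print substr($0, 1, RSTART - 1) val substr($0, RSTART + RLENGTH)
}')

printf '%s\n' "$result"   # prints: abc{xy}def
```

Note that the tail of the record starts at RSTART + RLENGTH, the first character after the match.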

CodePudding user response:

You can do that much more easily with Perl:

perl -0777 -i -pe 's/\s+(seq-cont\s*\{[^}]*\})/$1=~s|\s+||gr/ge' logfilepath

The -0777 option tells perl to slurp the whole file into a single string, and -i saves the changes in place. The regex \s+(seq-cont\s*\{[^}]*\}) matches one or more whitespace characters, then captures into Group 1 ($1) the text seq-cont, zero or more whitespace characters, and the substring from the leftmost { to the next } ([^}]* matches zero or more characters other than }). The e flag evaluates the replacement as Perl code, so the inner substitution $1=~s|\s+||gr removes every run of whitespace (matched with \s+) from the Group 1 value and, because of the r flag, returns the result instead of modifying $1. All occurrences are handled due to the outer g flag (next to e).

See the online demo:

#!/bin/bash
s=$(echo -e "test {\nseq-cont {\n\t\t\t0,\n\t\t\t67,\n\t\t\t266\n\t\t\t},\n\t\tgrp-id 505\n\t}\n}\ntest{\n\ttest1{\n\t\tval\n\t}\n}\n")
perl -0777 -pe 's/\s+(seq-cont\s*\{[^}]*\})/$1=~s|\s+||gr/ge' <<< "$s"

Output:

test {seq-cont{0,67,266},
        grp-id 505
    }
}
test{
    test1{
        val
    }
}