I have a log file in a format similar to this:
test {
seq-cont {
0,
67,
266
},
grp-id 505
}
}
test{
test1{
val
}
}
Here is the echo command that produces that content:
$ echo -e "test {\nseq-cont {\n\t\t\t0,\n\t\t\t67,\n\t\t\t266\n\t\t\t},\n\t\tgrp-id 505\n\t}\n}\ntest{\n\ttest1{\n\t\tval\n\t}\n}\n"
The question is how to remove all whitespace between seq-cont { and the next }. There may be multiple such blocks in the file.
I want the output to look like this, preferably produced with sed:
test{seq-cont{0,67,266},
grp-id 505
}
}
test{
test1{
val
}
}
My attempt so far: this one somewhat worked, but it is not exactly what I wanted:
sed ':a;N;/{/s/[[:space:]]\+//;/}/s/}/}/;ta;P;D' logfile
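For comparison, here is a minimal sed-only sketch of the kind of loop the attempt above is reaching for. It assumes a sed that accepts -E (GNU sed does) and that seq-cont { and its closing } sit on separate lines, as in the sample. The names sample, collapse_seq_cont and result are only illustrative. Note that it strips whitespace strictly between seq-cont { and the next }, so seq-cont stays on its own line instead of being joined to the preceding test { line.

```shell
# Recreate the sample input from the question (printf instead of echo -e).
sample=$(printf 'test {\nseq-cont {\n\t\t\t0,\n\t\t\t67,\n\t\t\t266\n\t\t\t},\n\t\tgrp-id 505\n\t}\n}\ntest{\n\ttest1{\n\t\tval\n\t}\n}\n')

# Hypothetical helper: on a line containing "seq-cont {", keep appending
# lines (N) until the pattern space contains a "}", then delete every run
# of whitespace in the collected text. All other lines pass through as-is.
collapse_seq_cont() {
  sed -E '
/seq-cont[[:space:]]*\{/{
:a
/\}/!{
N
ba
}
s/[[:space:]]+//g
}'
}

result=$(printf '%s\n' "$sample" | collapse_seq_cont)
printf '%s\n' "$result"
```

Because the loop restarts at every line that contains seq-cont {, each such block in the file is handled, not only the first one.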
CodePudding user response:
It can be done using gnu-awk with a custom RS regex that matches { and closing }:
awk -v RS='{[^}]+}' 'NR==1 {gsub(/[[:space:]]+/, "", RT)} {ORS=RT} 1' file
test {seq-cont{0,67,266},
grp-id 505
}
}
test{
test1{
val
}
}
Here:
NR==1 {gsub(/[[:space:]]+/, "", RT)}: For the first record, replace all whitespace (including line breaks) in RT with an empty string.
{ORS=RT}: Set ORS to whatever text was matched by RS.
PS: Remove NR==1 if you want to do this for the entire file.
CodePudding user response:
With your shown samples, please try the following awk program. Written and tested in GNU awk.
awk -v RS= '
match($0,/{\nseq-cont {\n[^}]*/){
val=substr($0,RSTART,RLENGTH)
gsub(/[[:space:]]+/,"",val)
print substr($0,1,RSTART-1) val substr($0,RSTART+RLENGTH)
}
' Input_file
Explanation: RS is set to the null string, which puts awk in paragraph mode, so the whole sample (which contains no blank lines) is read as a single record. The match function of awk then matches everything from the { before seq-cont { up to the next occurrence of }. All spaces and newlines are removed from the matched value. Finally, the text before the match, the edited value, and the text after the match are printed to get the expected output mentioned by the OP.
CodePudding user response:
You can do that much more easily with perl:
perl -0777 -i -pe 's/\s+(seq-cont\s*\{[^}]*\})/$1=~s|\s+||gr/ge' logfilepath
The -0777 option tells perl to slurp the file into a single string, and -i saves the changes in place. The \s+(seq-cont\s*\{[^}]*\}) regex matches one or more whitespace characters, then captures into Group 1 ($1) seq-cont, zero or more whitespace characters, and then the substring between the leftmost { and the next } char ([^}]* matches zero or more chars other than }). Then all chunks of one or more whitespace characters (matched with \s+) are removed from the whole Group 1 value ($1); this second, inner replacement is enabled by the e flag. All occurrences are handled due to the g flag (next to e).
See the online demo:
#!/bin/bash
s=$(echo -e "test {\nseq-cont {\n\t\t\t0,\n\t\t\t67,\n\t\t\t266\n\t\t\t},\n\t\tgrp-id 505\n\t}\n}\ntest{\n\ttest1{\n\t\tval\n\t}\n}\n")
perl -0777 -pe 's/\s+(seq-cont\s*\{[^}]*\})/$1=~s|\s+||gr/ge' <<< "$s"
Output:
test {seq-cont{0,67,266},
grp-id 505
}
}
test{
test1{
val
}
}