awk remove \n if next line doesn't match

Time:12-10

awk 'tolower($0) ~ /\.[log(message|event)|trace(error)?c?|infoc?|warnc?|debugc?|errorc?]/,/)/{gsub(/^\t /, "", $0);print NR","$0}' example_file

I wrote this script to search a file for patterns like:

log.Info("hello world")
log.Error()

And outputs something like:

4,log.Info("hello world")
7,log.Error()

The line number and the text itself.

The problem is that if the file contains something like this:

log.Info("hello world")
log.Warn(
    "hello world")
log.Error()

It will output something like this:

4,log.Info("hello world")
5,log.Warn(
6,"hello world")
7,log.Error()

I want "hello world") to end up on the same line as log.Warn(.

In other words, if the next line doesn't start with the pattern /\.[log(message|event)|trace(error)?c?|infoc?|warnc?|debugc?|errorc?]/, it should be appended to the line before it.

The desired output would be something like:

4,log.Info("hello world")
5,log.Warn("hello world")
7,log.Error()

Thank you very much.

CodePudding user response:

Here's a best-effort script (i.e. will fail in various rainy-day cases), using this input file:

$ cat file
foo
log.Info("hello
        world")
log.Warn(
    "hello
                some other
        world")
log.Error()
bar

and any POSIX awk:

$ cat tst.awk
BEGIN {
    begRe = "log[.](Info|Warn|Error)[(]"
    regexp = begRe "[^)]*[)]"
    OFS = ","
}
$0 ~ begRe {
    begNr = NR
    buf = ""
}
begNr {
    buf = buf $0
    if ( match(buf,regexp) ) {
        buf = substr(buf,RSTART,RLENGTH)
        gsub(/[[:space:]]*"[[:space:]]*/,"\"",buf)
        print begNr, buf
        begNr = 0
    }
}

$ awk -f tst.awk file
2,log.Info("hello       world")
4,log.Warn("hello               some other      world")
8,log.Error()

If you want to collapse all the white space within quotes and remove any leading white space, just add gsub(/[[:space:]]+/," ",buf); gsub(/^ | $/,"",buf) before the print statement:

$ cat tst.awk
BEGIN {
    begRe = "log[.](Info|Warn|Error)[(]"
    regexp = begRe "[^)]*[)]"
    OFS = ","
}
$0 ~ begRe {
    begNr = NR
    buf = ""
}
begNr {
    buf = buf $0
    if ( match(buf,regexp) ) {
        buf = substr(buf,RSTART,RLENGTH)
        gsub(/[[:space:]]*"[[:space:]]*/,"\"",buf)
        gsub(/[[:space:]]+/," ",buf); gsub(/^ | $/,"",buf)
        print begNr, buf
        begNr = 0
    }
}

$ awk -f tst.awk file
2,log.Info("hello world")
4,log.Warn("hello some other world")
8,log.Error()
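
As an illustration of one of the rainy-day cases: regexp stops at the first ")" it finds, so a closing parenthesis inside the quoted message cuts the match short. The sketch below reuses the same begRe and regexp on a made-up one-line input; the sample message and the function name demo_truncation are hypothetical, not from the question.

```shell
# Hypothetical demo: the logged message itself contains a ")".
demo_truncation() {
    printf '%s\n' 'log.Info("done :)")' |
    awk '
    BEGIN {
        begRe  = "log[.](Info|Warn|Error)[(]"
        regexp = begRe "[^)]*[)]"      # same regexp as in tst.awk
        OFS = ","
    }
    match($0, regexp) {
        # The match ends at the first ")", so the tail of the call is lost.
        print NR, substr($0, RSTART, RLENGTH)
    }'
}

demo_truncation
# prints: 1,log.Info("done :)
```

Handling this properly would mean tracking whether the ")" is inside a quoted string, which the script does not attempt.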

CodePudding user response:

"Like if the next line found doesn't start with the pattern /\.[log(message|event)|trace(error)?c?|infoc?|warnc?|debugc?|errorc?]/ it will put this line on the line before that."

awk can't act on the next line, only on the current one. That basically means you have to:

  • buffer one line (previous line)
  • if the current line does start with the pattern /\.[log(message|event)|trace(error)?c?|infoc?|warnc?|debugc?|errorc?]/, output the previous line. The current line becomes the new previous line.
  • otherwise, output the previous line and the current line joined together. The previous line becomes empty.
  • END { output previous line }

Something along these lines:

awk -v OFS=, '
    /^log\./ {                     # the pattern here
        if (last != "") {
            print NR - 1, last     # output previous line
        }
        last = $0                  # current line becomes previous line
        next
    }
    last != "" {                   # otherwise (because of next above): a continuation line
        sub(/^[[:space:]]+/, "")   # drop the indentation of the continuation
        print NR - 1, last $0      # output previous line and current line joined
        last = ""                  # previous line becomes empty
    }
    END {
        if (last != "") {
            print NR, last         # handle the buffered line at the end
        }
    }
' example_file

Or change your condition so that it depends on the current line only. For example: if the current line does not end with ), consume the next line.

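awk '/[^)]$/{
   n=NR
   a=$0
   # getline replaces $0 with the next line; it returns 0 at end of file
   if ((getline) > 0) a = a $0
   print n " " a
}' example_file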
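
The getline approach above joins only one following line. A minimal sketch that keeps reading until the call is closed could look like the following. It assumes the simplified pattern log.(Info|Warn|Error)( from the other answer rather than the full pattern in the question, and the function name join_calls is made up for illustration.

```shell
# join_calls [FILE...]: print "line,text" for each log call, joining
# continuation lines until one ends with ")". Reads stdin if no file is given.
join_calls() {
    awk '
    /log[.](Info|Warn|Error)[(]/ {
        n = NR
        buf = $0
        # Keep pulling lines while the call is still open.
        while (buf !~ /[)][[:space:]]*$/ && (getline line) > 0) {
            sub(/^[[:space:]]+/, "", line)   # drop continuation indent
            buf = buf line
        }
        print n "," buf
    }' "$@"
}

# Usage with the sample from the question:
printf '%s\n' 'log.Info("hello world")' 'log.Warn(' '    "hello world")' 'log.Error()' |
    join_calls
# prints:
# 1,log.Info("hello world")
# 2,log.Warn("hello world")
# 4,log.Error()
```

Because the indentation of each continuation line is dropped, a message that is itself split across lines would lose the white space at the break. Like the other scripts here, it is also fooled by a ")" at the end of a line inside a quoted string.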