Remove a middle-level field in Logstash

Time:06-11

What I need

I'm using Logstash and trying to use the ruby filter to remove a field that sits between the top-level field and the last field. The top-level field name is always the same; only its subfields change.

The fields look something like this:

[topLevelField][fieldToRemove][fieldToKeep]

And I want it to look like this:

[topLevelField][fieldToKeep]

Also, not all values of the middle field will be removed, just a few specific cases, but I think I can solve this with a simple condition.


What I've tried

All of the snippets below were tested inside this filter:

filter {
  ruby {
    code => "
      # my code here
    "
  }
}

I tried removing the field with the following snippets, all of which were simply ignored:

if event.include? 'topLevelField.fieldToRemove'
  event.get('topLevelField.fieldToRemove').each { |key, value|
    event.set('topLevelField.#{key}', value)
  }
  event.remove('topLevelField.fieldToRemove')
end

baseField = event.get('topLevelField')
beRemoved = baseField.keys.select{ |key| key.to_s.match('^(fieldToRemove)$') }

beRemoved.each { |key, value|
    event.set('topLevelField.#{key}', value)
}

event.get('topLevelField').keys.each { |keyToRemove|
  if keyToRemove.to_s == 'fieldToRemove'
    event.get(keyToRemove).each { |keyToKeep, valueToKeep|
      event.set('topLevelField.#{keyToKeep}', valueToKeep)
    }
  end
}

I tried .each do ... end instead of .each { ... }, but the result was the same.

I also tried the '[fieldName]' syntax, but it seems to be ignored too.

Does anyone have any ideas?

CodePudding user response:

For the first one, in Logstash a field nested inside another field is referred to as [foo][bar], unlike Elasticsearch and Kibana, where it would be foo.bar. For all three approaches, Ruby only performs string interpolation inside double quotes. If you use

        code => '
            if event.include? "[topLevelField][fieldToRemove]"
                event.get("[topLevelField][fieldToRemove]").each { |key, value|
                    event.set("[topLevelField][#{key}]", value)
                }
                event.remove("[topLevelField][fieldToRemove]")
            end
        '

then you will get

"topLevelField" => {
    "fieldToKeep" => "Foo"
},
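
The same promote-then-remove logic can be tried outside Logstash. What follows is only a minimal sketch in plain Ruby, assuming a nested Hash as a stand-in for the event; the variable names event_data and top are illustrative and are not part of the Logstash event API.

```ruby
# Stand-in for the Logstash event (assumption: the real event holds
# the same nested structure; "Foo" is the sample value shown above).
event_data = {
  "topLevelField" => {
    "fieldToRemove" => { "fieldToKeep" => "Foo" }
  }
}

top = event_data["topLevelField"]

if top.key?("fieldToRemove")
  # Copy every child of the middle field one level up,
  # which is what event.set does in the filter code.
  top["fieldToRemove"].each { |key, value| top[key] = value }

  # Drop the now redundant middle level (event.remove in the filter).
  top.delete("fieldToRemove")
end

p event_data
```

Running this prints a hash in which fieldToKeep sits directly under topLevelField, matching the output above.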

CodePudding user response:

With the help of a friend, I figured out what was wrong with my tests. We arrived at the following code:

filter {
  if [topLevelField][fieldToRemove] {
    ruby {
      code => '
        event.get("[topLevelField][fieldToRemove]").each { |key, value|
          event.set("[topLevelField][#{key}]", value)
        }
        event.remove("[topLevelField][fieldToRemove]")
      '
    }
  }
}

This results in [topLevelField][fieldToKeep].
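
The question mentions that only a few specific middle fields should be removed. One possible way to handle that is sketched below in plain Ruby, again using a Hash in place of the event. The constant MIDDLE_FIELDS_TO_REMOVE, the name anotherWrapper, and the sample data are hypothetical and exist only for illustration. Inside the ruby filter, the same loop would presumably use event.get, event.set and event.remove with "[topLevelField][#{middle_name}]".

```ruby
# Hypothetical list of middle-level names that should be flattened.
MIDDLE_FIELDS_TO_REMOVE = ["fieldToRemove", "anotherWrapper"].freeze

# Stand-in for the contents of [topLevelField]; sample data is invented.
top = {
  "fieldToRemove"  => { "fieldToKeep" => "Foo" },
  "untouchedField" => { "nested" => "Bar" }
}

MIDDLE_FIELDS_TO_REMOVE.each do |middle_name|
  middle = top[middle_name]
  # Skip names that are absent or that do not hold nested fields.
  next unless middle.is_a?(Hash)

  # Promote the children, then remove the middle level.
  middle.each { |key, value| top[key] = value }
  top.delete(middle_name)
end

p top
```

Middle fields that are not in the list, such as untouchedField here, are left exactly as they were.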


Explanation of the code

Since I don't know how much the ruby plugin affects performance, I decided to check that the field exists before calling the plugin.

if [topLevelField][fieldToRemove] { ... }

After a few more searches, I found that double quotes behave differently from single quotes in Ruby. So, to be able to use double quotes inside the Ruby code, I wrapped the plugin's code option in single quotes:

ruby {
  code => ' ... '
}

The correct syntax for accessing fields from Ruby appears to be bracketed names rather than names separated by dots, so I changed the arguments to follow this syntax. Additionally, Ruby string interpolation doesn't work inside single quotes, so the field references are written in double quotes:

event.get("[topLevelField][fieldToRemove]").each { |key, value|
  event.set("[topLevelField][#{key}]", value)
}
event.remove("[topLevelField][fieldToRemove]")

With single quotes, string interpolation does not happen; the #{...} expression is treated as literal text. Had the code above used single quotes, the result would be "[topLevelField][#{key}]" instead of "[topLevelField][fieldToKeep]".
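
The difference is easy to check in plain Ruby, outside Logstash. A minimal sketch, where the variable names single and double are only illustrative:

```ruby
key = "fieldToKeep"

# Single quotes: the text #{key} is kept literally.
single = '[topLevelField][#{key}]'

# Double quotes: #{key} is replaced by the value of the variable.
double = "[topLevelField][#{key}]"

puts single  # prints [topLevelField][#{key}]
puts double  # prints [topLevelField][fieldToKeep]
```

This is why the original attempts ended up writing to a field whose name literally contained #{key}.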

And that's it. I hope the explanation helps someone else with a similar problem.
