xml filter on nested object using ruby


I have a log file in the XML format below:

            <objectMessage>Internal> message shown here in multiple lines</objectMessage>
                <objectMessage>Internal> message shown here in multiple lines</objectMessage>

I want to mutate the XML tag <objectMessage>message in multiple lines</objectMessage> into <objectMessage>MESSAGE HAS BEEN REMOVED</objectMessage>, but ONLY when the <objectMessage> tag is inside an <Item> tag.

Below is the part of my config that looks through the XML and mutates it into the message that I want:

<objectMessage>Internal> message shown here in multiple lines</objectMessage>


filter {
 mutate {
  # gsub takes triplets: field name, pattern, replacement
  gsub => [
    "message", "some regex pattern can do the xml tag filtering", "MESSAGE HAS BEEN REMOVED"
  ]
 }
}


However, this changes every <objectMessage>message shown here in multiple lines</objectMessage>, including the ones outside of the <Item> element.

I know the ruby plugin can do a better job and that I shouldn't be using regex for XML parsing at all, but this is the closest I have got so far.
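
One way to keep the regex approach but limit it to <Item> is to match each <Item>...</Item> block first and rewrite only the <objectMessage> inside it. The following is a minimal Ruby sketch, not a robust XML parser: it assumes <Item> elements are never nested and that the tags carry no attributes, and the helper name redact_item_messages is made up for illustration.

```ruby
# Hypothetical helper (not part of Logstash): scopes the replacement to <Item> blocks.
# Assumes <Item> elements are not nested and the tags have no attributes.
REPLACEMENT = "MESSAGE HAS BEEN REMOVED"

def redact_item_messages(xml, replacement = REPLACEMENT)
  # The m flag lets "." match newlines, so multi-line messages are covered.
  xml.gsub(%r{<Item>.*?</Item>}m) do |item_block|
    # Only text inside the matched <Item> block is rewritten.
    item_block.gsub(%r{<objectMessage>.*?</objectMessage>}m) do
      "<objectMessage>#{replacement}</objectMessage>"
    end
  end
end

sample = <<~XML
  <Site>
    <Item>
      <objectMessage>Internal> message shown here
      in multiple lines</objectMessage>
    </Item>
  </Site>
  <Location>
    <objectMessage>Internal> message shown here in multiple lines</objectMessage>
  </Location>
XML

puts redact_item_messages(sample)
```

Inside a Logstash ruby filter, the same substitution could be applied to the raw field by reading it with event.get and writing the result back with event.set.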

CodePudding user response:

Ideally you want to use the built-in xml filter plugin; it is way more reliable and maintainable.


The following conf file will parse the XML and replace each <Item> entry under <Site>, leaving the <objectMessage> outside <Item> untouched:

input {
    generator {
        lines => [
            <objectMessage>Internal> message shown here in multiple lines</objectMessage>
            <objectMessage>Internal> message shown here in multiple lines</objectMessage>
                <objectMessage>Internal> message shown here in multiple lines</objectMessage>
        ]
        count => 1
    }
}

filter {
    xml {
        source => "message"
        target => "xml"
        store_xml => true
        remove_field => ["message"]
    }
}

filter {
  ruby {
    code => '
      event.get("[xml][Site][0][Item]").each_with_index do |item, index|
        event.set("[xml][Site][0][Item][#{index}]", "REMOVED MESSAGE")
      end
    '
  }
}

output {
    stdout {
        codec => rubydebug
    }
}


{
          "host" => {
        "name" => "Mac-Studio.local"
    },
      "@version" => "1",
    "@timestamp" => 2022-11-28T13:47:31.352282Z,
         "event" => {
        "original" => "<QuerySiteInformation>\n            xmlns=\"http://www.example.com\"\n            <Site>\n            <id>abc-cde-fvvvv</id>\n            <Item>\n            <id>e5753ead-d202-451e-92cc-ea49d0a6bdf5</id>\n            <code>67448833344443</code>\n            <objectMessage>Internal> message shown here in multiple lines</objectMessage>\n            <reference>/</reference>\n            </Item>\n            <Item>\n            <id>e5753ead-d202-451e-92cc-ea49d0a6bdf5</id>\n            <code>67448833344443</code>\n            <objectMessage>Internal> message shown here in multiple lines</objectMessage>\n            <reference>/</reference>\n            </Item>\n            </Site>\n            <SiteInteraction>\n            <InteractionItem>\n            <Location>\n                <id>8496940--2842047577555</id>\n                <objectMessage>Internal> message shown here in multiple lines</objectMessage>\n            </Location>\n            </InteractionItem>\n            </SiteInteraction>\n        </QuerySiteInformation>",
        "sequence" => 0
    },
           "xml" => {
                "content" => [
            [0] "\n            xmlns=\"http://www.example.com\"\n            ",
            [1] "\n            ",
            [2] "\n        "
        ],
                   "Site" => [
            [0] {
                  "id" => [
                    [0] "abc-cde-fvvvv"
                ],
                "Item" => [
                    [0] "REMOVED MESSAGE",
                    [1] "REMOVED MESSAGE"
                ]
            }
        ],
        "SiteInteraction" => [
            [0] {
                "InteractionItem" => [
                    [0] {
                        "Location" => [
                            [0] {
                                           "id" => [
                                    [0] "8496940--2842047577555"
                                ],
                                "objectMessage" => [
                                    [0] "Internal> message shown here in multiple lines"
                                ]
                            }
                        ]
                    }
                ]
            }
        ]
    }
}
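
Note that the ruby filter in this answer overwrites each whole Item entry, which is why the output shows "Item" with two "REMOVED MESSAGE" strings and no id, code or reference. If only the objectMessage inside each Item should change, as the question asks, the parsed structure can be walked instead. Below is a minimal sketch on a plain Ruby hash, assuming the array-wrapped layout seen in the output; redact_items! is a hypothetical helper name, not a Logstash API.

```ruby
# Hypothetical helper: replaces only objectMessage inside Site -> Item entries.
# Assumes the layout shown in the rubydebug output, where every element is wrapped in an array.
def redact_items!(xml, replacement = "MESSAGE HAS BEEN REMOVED")
  (xml["Site"] || []).each do |site|
    (site["Item"] || []).each do |item|
      # Skip entries that are not hashes or carry no objectMessage.
      next unless item.is_a?(Hash) && item.key?("objectMessage")
      item["objectMessage"] = [replacement]
    end
  end
  xml
end

parsed = {
  "Site" => [
    {
      "id" => ["abc-cde-fvvvv"],
      "Item" => [
        {
          "id" => ["e5753ead-d202-451e-92cc-ea49d0a6bdf5"],
          "code" => ["67448833344443"],
          "objectMessage" => ["Internal> message shown here in multiple lines"],
          "reference" => ["/"]
        }
      ]
    }
  ],
  "SiteInteraction" => [
    {
      "InteractionItem" => [
        {
          "Location" => [
            {
              "id" => ["8496940--2842047577555"],
              "objectMessage" => ["Internal> message shown here in multiple lines"]
            }
          ]
        }
      ]
    }
  ]
}

redact_items!(parsed)
puts parsed["Site"][0]["Item"][0]["objectMessage"].inspect
```

In a ruby filter this would amount to reading the hash with event.get("xml"), passing it through the helper, and writing it back with event.set("xml", ...). The other Item fields (id, code, reference) are kept, and the objectMessage under Location is not touched.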