I have the following XML-format log file:
<QuerySiteInformation>
xmlns="http://www.example.com"
<Site>
<id>abc-cde-fvvvv</id>
<Item>
<id>e5753ead-d202-451e-92cc-ea49d0a6bdf5</id>
<code>67448833344443</code>
<objectMessage>Internal> message shown here in multiple lines</objectMessage>
<reference>/</reference>
</Item>
</Site>
<SiteInteraction>
<InteractionItem>
<Location>
<id>8496940--2842047577555</id>
<objectMessage>Internal> message shown here in multiple lines</objectMessage>
</Location>
</InteractionItem>
</SiteInteraction>
</QuerySiteInformation>
I want to change <objectMessage>message in multiple lines</objectMessage>
into <objectMessage>MESSAGE HAS BEEN REMOVED</objectMessage>,
but ONLY when the <objectMessage> tag is inside an <Item> tag.
The part of my config shown below can find a tag such as
<objectMessage>Internal> message shown here in multiple lines</objectMessage>
and replace its content with the message I want.
Config:
filter {
  mutate {
    gsub => [
      "some regex pattern can do the xml tag filtering", "MESSAGE HAS BEEN REMOVED"
    ]
  }
}
However, this changes every <objectMessage>message shown here in multiple lines</objectMessage>,
including the one outside of the <Item> element.
I know that the ruby plugin could do a better job and that I shouldn't be using regex for XML parsing at all, but this is the closest I have gotten so far.
CodePudding user response:
Ideally you want to use the built-in xml filter plugin; it is far more reliable and maintainable:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-xml.html
The following conf file parses the XML and replaces each <Item> entry under the first <Site> with a placeholder string:
input {
  generator {
    lines => [
'<QuerySiteInformation>
xmlns="http://www.example.com"
<Site>
<id>abc-cde-fvvvv</id>
<Item>
<id>e5753ead-d202-451e-92cc-ea49d0a6bdf5</id>
<code>67448833344443</code>
<objectMessage>Internal> message shown here in multiple lines</objectMessage>
<reference>/</reference>
</Item>
<Item>
<id>e5753ead-d202-451e-92cc-ea49d0a6bdf5</id>
<code>67448833344443</code>
<objectMessage>Internal> message shown here in multiple lines</objectMessage>
<reference>/</reference>
</Item>
</Site>
<SiteInteraction>
<InteractionItem>
<Location>
<id>8496940--2842047577555</id>
<objectMessage>Internal> message shown here in multiple lines</objectMessage>
</Location>
</InteractionItem>
</SiteInteraction>
</QuerySiteInformation>'
    ]
    count => 1
  }
}
filter {
  xml {
    source => "message"
    target => "xml"
    store_xml => true
    remove_field => ["message"]
  }
}
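filter {
  ruby {
    code => '
      # Item entries under the first Site; fall back to an empty list if none were parsed
      items = event.get("[xml][Site][0][Item]") || []
      # Overwrite each Item entry with the placeholder string
      items.each_with_index do |item, index|
        event.set("[xml][Site][0][Item][#{index}]", "REMOVED MESSAGE")
      end
    '
  }
}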
output {
  stdout {
    codec => rubydebug
  }
}
Output:
{
    "host" => {
        "name" => "Mac-Studio.local"
    },
    "@version" => "1",
    "@timestamp" => 2022-11-28T13:47:31.352282Z,
    "event" => {
        "original" => "<QuerySiteInformation>\n xmlns=\"http://www.example.com\"\n <Site>\n <id>abc-cde-fvvvv</id>\n <Item>\n <id>e5753ead-d202-451e-92cc-ea49d0a6bdf5</id>\n <code>67448833344443</code>\n <objectMessage>Internal> message shown here in multiple lines</objectMessage>\n <reference>/</reference>\n </Item>\n <Item>\n <id>e5753ead-d202-451e-92cc-ea49d0a6bdf5</id>\n <code>67448833344443</code>\n <objectMessage>Internal> message shown here in multiple lines</objectMessage>\n <reference>/</reference>\n </Item>\n </Site>\n <SiteInteraction>\n <InteractionItem>\n <Location>\n <id>8496940--2842047577555</id>\n <objectMessage>Internal> message shown here in multiple lines</objectMessage>\n </Location>\n </InteractionItem>\n </SiteInteraction>\n </QuerySiteInformation>",
        "sequence" => 0
    },
    "xml" => {
        "content" => [
            [0] "\n xmlns=\"http://www.example.com\"\n ",
            [1] "\n ",
            [2] "\n "
        ],
        "Site" => [
            [0] {
                "id" => [
                    [0] "abc-cde-fvvvv"
                ],
                "Item" => [
                    [0] "REMOVED MESSAGE",
                    [1] "REMOVED MESSAGE"
                ]
            }
        ],
        "SiteInteraction" => [
            [0] {
                "InteractionItem" => [
                    [0] {
                        "Location" => [
                            [0] {
                                "id" => [
                                    [0] "8496940--2842047577555"
                                ],
                                "objectMessage" => [
                                    [0] "Internal> message shown here in multiple lines"
                                ]
                            }
                        ]
                    }
                ]
            }
        ]
    }
}
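Note that the ruby filter above overwrites each whole Item entry, so id, code and reference are lost as well. If only the objectMessage inside each Item should change, the loop could target that key instead. Below is a minimal standalone sketch of that idea in plain Ruby, so it can be tried outside Logstash. The helper name redact_item_messages and the PLACEHOLDER constant are hypothetical, and the sample hash assumes each parsed Item has the same array-of-values shape that Location shows in the output above.

```ruby
# Hypothetical helper, not part of Logstash: redact objectMessage only
# inside Item entries, leaving every other objectMessage untouched.
PLACEHOLDER = "MESSAGE HAS BEEN REMOVED"

def redact_item_messages(xml)
  # Walk every Site, not just the first one. Array(nil) yields an empty
  # list, so a document without Site or Item entries is left as it is.
  Array(xml["Site"]).each do |site|
    next unless site.is_a?(Hash)
    Array(site["Item"]).each do |item|
      next unless item.is_a?(Hash) && item.key?("objectMessage")
      # Assumption: values are stored as one-element arrays,
      # as in the parsed Location entry shown in the output.
      item["objectMessage"] = [PLACEHOLDER]
    end
  end
  xml
end

# Sample data mirroring the assumed shape of the parsed "xml" field.
sample = {
  "Site" => [
    {
      "id" => ["abc-cde-fvvvv"],
      "Item" => [
        {
          "id" => ["e5753ead-d202-451e-92cc-ea49d0a6bdf5"],
          "code" => ["67448833344443"],
          "objectMessage" => ["Internal> message shown here in multiple lines"],
          "reference" => ["/"]
        }
      ]
    }
  ],
  "SiteInteraction" => [
    {
      "InteractionItem" => [
        {
          "Location" => [
            {
              "id" => ["8496940--2842047577555"],
              "objectMessage" => ["Internal> message shown here in multiple lines"]
            }
          ]
        }
      ]
    }
  ]
}

result = redact_item_messages(sample)
puts result["Site"][0]["Item"][0]["objectMessage"].inspect
puts result["SiteInteraction"][0]["InteractionItem"][0]["Location"][0]["objectMessage"].inspect
```

Inside a ruby filter, the same loop could probably be applied by reading the parsed structure with event.get("[xml]"), running the loop body on it, and writing the result back with event.set("[xml]", ...). This is untested against a live pipeline, so treat it as a starting point.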