Regex Fortinet Log PHP

Time:06-11

I'm trying to parse a Fortinet log in PHP. I took a sample log entry from the Fortinet cookbook.

This is my code with the regex. I want to build an array that uses each field name as the key and the field's content as the value. For example: [date]=>2019-05-10 [time]=>11:50:48 ... [srcip]=>172.16.200.254

$regex = '/[a-zA-Z]+=[0-9]{4}-[0-9]{2}-[0-9]{2} [a-zA-Z]+=[0-9]{2}:[0-9]{2}:[0-9]{2}(\\.[0-9]{1,3})? [a-zA-Z]+="[^"]*" [a-zA-Z]+="[a-zA-Z]+" [a-zA-Z]+="[^"]*" [a-zA-Z]+="[^"]*" [a-zA-Z]+="[^"]*" [a-zA-Z]+=[0-9]+ [a-zA-Z]+=\\b(?:(?:2(?:[0-4][0-9]|5[0-5])|[0-1]?[0-9]?[0-9])\\.){3}(?:(?:2([0-4][0-9]|5[0-5])|[0-1]?[0-9]?[0-9]))\\b [a-zA-Z]+=[0-9]+ [a-zA-Z]+="[^"]*" [a-zA-Z]+="[^"]*" [a-zA-Z]+=\\b(?:(?:2(?:[0-4][0-9]|5[0-5])|[0-1]?[0-9]?[0-9])\\.){3}(?:(?:2([0-4][0-9]|5[0-5])|[0-1]?[0-9]?[0-9]))\\b [a-zA-Z]+=[0-9]+ [a-zA-Z]+="[^"]*" [a-zA-Z]+="[^"]*" [a-zA-Z]+=[0-9]+ [a-zA-Z]+=[0-9]+ [a-zA-Z]+="[^"]*" [a-zA-Z]+=[0-9]+ [a-zA-Z]+="[^"]*" [a-zA-Z]+="[^"]*" [a-zA-Z]+="[^"]*" [a-zA-Z]+="[^"]*" [a-zA-Z]+="[^"]*" [a-zA-Z]+="[^"]*" [a-zA-Z]+=[0-9]+ [a-zA-Z]+=[0-9]+ [a-zA-Z]+=[0-9]+ [a-zA-Z]+=[0-9]+ [a-zA-Z]+=[0-9]+ [a-zA-Z]+="[^"]*"/i';

$str = 'date=2019-05-10 time=11:50:48 logid="0001000014" type="traffic" subtype="local" level="notice" vd="vdom1" eventtime=1557514248379911176 srcip=172.16.200.254 srcport=62024 srcintf="port11" srcintfrole="undefined" dstip=172.16.200.2 dstport=443 dstintf="vdom1" dstintfrole="undefined" sessionid=107478 proto=6 action="server-rst" policyid=0 policytype="local-in-policy" service="HTTPS" dstcountry="Reserved" srccountry="Reserved" trandisp="noop" app="Web Management(HTTPS)" duration=5 sentbyte=1247 rcvdbyte=1719 sentpkt=5 rcvdpkt=6 appcat="unscanned"';

preg_match_all($regex, $str, $matches, PREG_SET_ORDER, 0);

var_dump($matches);

CodePudding user response:

Perhaps using a small pattern with a branch reset group will be sufficient, where group 1 contains the key and group 2 contains the value:

([^\s=]+)=(?|"([^"]*)"|(\S+))

Example

$regex = '/([^\s=]+)=(?|"([^"]*)"|(\S+))/';

$str = 'date=2019-05-10 time=11:50:48 logid="0001000014" type="traffic" subtype="local" level="notice" vd="vdom1" eventtime=1557514248379911176 srcip=172.16.200.254 srcport=62024 srcintf="port11" srcintfrole="undefined" dstip=172.16.200.2 dstport=443 dstintf="vdom1" dstintfrole="undefined" sessionid=107478 proto=6 action="server-rst" policyid=0 policytype="local-in-policy" service="HTTPS" dstcountry="Reserved" srccountry="Reserved" trandisp="noop" app="Web Management(HTTPS)" duration=5 sentbyte=1247 rcvdbyte=1719 sentpkt=5 rcvdpkt=6 appcat="unscanned"';

preg_match_all($regex, $str, $matches, PREG_SET_ORDER, 0);

$result = array_reduce($matches, function($carry, $item) {
    $carry[$item[1]] = $item[2];
    return $carry;
}, []);

print_r($result);

Output

Array
(
    [date] => 2019-05-10
    [time] => 11:50:48
    [logid] => 0001000014
    [type] => traffic
    [subtype] => local
    [level] => notice
    [vd] => vdom1
    [eventtime] => 1557514248379911176
    [srcip] => 172.16.200.254
    [srcport] => 62024
    [srcintf] => port11
    [srcintfrole] => undefined
    [dstip] => 172.16.200.2
    [dstport] => 443
    [dstintf] => vdom1
    [dstintfrole] => undefined
    [sessionid] => 107478
    [proto] => 6
    [action] => server-rst
    [policyid] => 0
    [policytype] => local-in-policy
    [service] => HTTPS
    [dstcountry] => Reserved
    [srccountry] => Reserved
    [trandisp] => noop
    [app] => Web Management(HTTPS)
    [duration] => 5
    [sentbyte] => 1247
    [rcvdbyte] => 1719
    [sentpkt] => 5
    [rcvdpkt] => 6
    [appcat] => unscanned
)
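
As a possible shortcut, assuming `$matches` comes from `preg_match_all` with `PREG_SET_ORDER` as above, `array_column` can build the same key/value array without `array_reduce`. A minimal sketch on a shortened sample line (the shortened `$str` is only for illustration):

```php
<?php
// Same pattern as above: group 1 is the key, group 2 the value (quoted or bare).
$regex = '/([^\s=]+)=(?|"([^"]*)"|(\S+))/';

// Shortened sample line, for illustration only.
$str = 'date=2019-05-10 time=11:50:48 logid="0001000014" app="Web Management(HTTPS)" duration=5';

preg_match_all($regex, $str, $matches, PREG_SET_ORDER);

// Take column 2 (the value) of every match and index it by column 1 (the key).
$result = array_column($matches, 2, 1);

print_r($result);
```

Because the branch reset group `(?|...)` gives both alternatives the same group number, the value is always at index 2, whether or not it was quoted.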

CodePudding user response:

It may sound absurd, but your log lines look like HTML attributes, so wrapping each line in an HTML tag and parsing its attributes works fine.

<?php

$str = '
date=2019-05-10 time=11:50:48 logid="0001000014" type="traffic" subtype="local" level="notice" vd="vdom1" eventtime=1557514248379911176 srcip=172.16.200.254 srcport=62024 srcintf="port11" srcintfrole="undefined" dstip=172.16.200.2 dstport=443 dstintf="vdom1" dstintfrole="undefined" sessionid=107478 proto=6 action="server-rst" policyid=0 policytype="local-in-policy" service="HTTPS" dstcountry="Reserved" srccountry="Reserved" trandisp="noop" app="Web Management(HTTPS)" duration=5 sentbyte=1247 rcvdbyte=1719 sentpkt=5 rcvdpkt=6 appcat="unscanned"
date=2020-05-10 time=11:50:48 logid="0001000015" type="traffic2" subtype="local2" level="notice2" vd="vdom12" eventtime=15575142483799111762 srcip=172.16.200.2542 srcport=620242 srcintf="port112" srcintfrole="undefined2" dstip=172.16.200.22 dstport=4432 dstintf="vdom12" dstintfrole="undefined2" sessionid=1074782 proto=62 action="server-rst2" policyid=02 policytype="local-in-policy2" service="HTTPS2" dstcountry="Reserved2" srccountry="Reserved2" trandisp="noop2" app="Web Management(HTTPS)2" duration=52 sentbyte=12472 rcvdbyte=17192 sentpkt=52 rcvdpkt=62 appcat="unscanned2"
';

$lines = preg_split("/\n/", $str);
$lines = array_filter($lines);

$html = "<div>\n";
foreach($lines as $line)
    $html.= "\t<tag {$line}></tag>\n";
$html.= "</div>\n";

$html = load_html($html);
$xpath = new DOMXpath($html);

$tags = $xpath->query("//tag");

$result = [];
$i = 0;
foreach($tags as $tag)
{
    if ($tag->hasAttributes())
    {
        foreach ($tag->attributes as $attr)
        {
            $name = $attr->nodeName;
            $value = $attr->nodeValue;
            $result[$i][$name] = $value;
        }
        $i++;
    }
}
print_r($result);

function load_html($str)
{
        // Load the HTML into a DOM document
        $dom = new DOMDocument('1.0', 'utf-8');
        $dom->preserveWhiteSpace = false;
        //@$dom->loadHTML("<?xml encoding=\"UTF-8\">".utf8_decode($str));
        @$dom->loadHTML("<?xml encoding=\"UTF-8\">".$str);
        $dom->formatOutput = true;
        // dirty fix
        foreach ($dom->childNodes as $item)
        {
            if ($item->nodeType == XML_PI_NODE)
                $dom->removeChild($item); // remove hack
        }
        return $dom;
}

Output:

Array
(
    [0] => Array
        (
            [date] => 2019-05-10
            [time] => 11:50:48
            [logid] => 0001000014
            [type] => traffic
            [subtype] => local
            [level] => notice
            [vd] => vdom1
            [eventtime] => 1557514248379911176
            [srcip] => 172.16.200.254
            [srcport] => 62024
            [srcintf] => port11
            [srcintfrole] => undefined
            [dstip] => 172.16.200.2
            [dstport] => 443
            [dstintf] => vdom1
            [dstintfrole] => undefined
            [sessionid] => 107478
            [proto] => 6
            [action] => server-rst
            [policyid] => 0
            [policytype] => local-in-policy
            [service] => HTTPS
            [dstcountry] => Reserved
            [srccountry] => Reserved
            [trandisp] => noop
            [app] => Web Management(HTTPS)
            [duration] => 5
            [sentbyte] => 1247
            [rcvdbyte] => 1719
            [sentpkt] => 5
            [rcvdpkt] => 6
            [appcat] => unscanned
        )

    [1] => Array
        (
            [date] => 2020-05-10
            [time] => 11:50:48
            [logid] => 0001000015
            [type] => traffic2
            [subtype] => local2
            [level] => notice2
            [vd] => vdom12
            [eventtime] => 15575142483799111762
            [srcip] => 172.16.200.2542
            [srcport] => 620242
            [srcintf] => port112
            [srcintfrole] => undefined2
            [dstip] => 172.16.200.22
            [dstport] => 4432
            [dstintf] => vdom12
            [dstintfrole] => undefined2
            [sessionid] => 1074782
            [proto] => 62
            [action] => server-rst2
            [policyid] => 02
            [policytype] => local-in-policy2
            [service] => HTTPS2
            [dstcountry] => Reserved2
            [srccountry] => Reserved2
            [trandisp] => noop2
            [app] => Web Management(HTTPS)2
            [duration] => 52
            [sentbyte] => 12472
            [rcvdbyte] => 17192
            [sentpkt] => 52
            [rcvdpkt] => 62
            [appcat] => unscanned2
        )

)
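
For comparison, a similar per-line result could probably be produced without the DOM by applying a key/value pattern to each line. A minimal sketch, assuming one log entry per line; the function name `parse_log_lines` and the shortened sample lines are made up for illustration:

```php
<?php
// Hypothetical helper: turns a multi-line log string into one key/value array per line.
function parse_log_lines(string $log): array
{
    // Group 1 is the key, group 2 the value (quoted or bare), via a branch reset group.
    $regex = '/([^\s=]+)=(?|"([^"]*)"|(\S+))/';
    $result = [];

    // Split on any newline style; lines without key=value pairs are skipped.
    foreach (preg_split('/\R/', $log) as $line) {
        if (preg_match_all($regex, $line, $matches, PREG_SET_ORDER)) {
            // Index each value (column 2) by its key (column 1).
            $result[] = array_column($matches, 2, 1);
        }
    }
    return $result;
}

// Shortened sample lines, for illustration only.
$str = '
date=2019-05-10 time=11:50:48 logid="0001000014" app="Web Management(HTTPS)" duration=5
date=2020-05-10 time=11:50:48 logid="0001000015" app="Web Management(HTTPS)2" duration=52
';

print_r(parse_log_lines($str));
```

Unlike the HTML route, this keeps the keys exactly as they appear in the log and does not depend on the DOM extension.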