Regex for extracting KEY=VALUE pairs from a log string in Java

Time:11-24

I have a log string like this:

String s0 = "DC696,\"/xi/ajax/remoting/call/plaincall/adhocReportBuilderControllerProxy.getRortList.dwr\",\"2222-11-10 08:32:22,351               PLV=REQ CIP=9.9.9.7 CMID=syairp CMN=\"\"Dub Airport Corporation Limited\"\" SN=sfv4_APM180885. DPN=dbPool66HFT01 UID=3862D04108 UN=91F6025D47F01D IUID=1931 LOC=en_GB EID=\"\"EVENT-UNKNOWN-UNKNOWN-ob55abe0118-201110083217-396080\"\" AGN=\"\"[Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36 Edg/107.0.1418.35]\"\" RID=REQ-[7274545]  MTD=POST URL=\"\"/xi/ajax/remoting/call/plaincall/adhocRrtBuilderCoollerProxy.getRtList.dwr\"\" RQT=2835 MID=ADIN PID=ADMIN PQ=ADIN_PAGE SUB=0 MEM=2331036 CPU=2410 UCPU=2300 SCPU=110 FRE=10 FWR=0 NRE=2281 NWR=218 SQLC=43 SQLT=142 RPS=200 SID=60826A3FAB005A8A9B930177C5******.pc6bc1029 GID=e262dde6d0e040070b58afd4c8 HSID=ddc665538db779508d3213c0bb63bcb1c49fe8236d5f0884ae975915728e61 CSL=CRITICAL CCON=0 CSUP=0 CLOC=0 CEXT=0 CREM=0 STK={\"\"n\"\":\"\"/xi/ajax/remoting/call/plaincall/adhocReportBuilderControllerProxy.getrtList.dwr\"\",\"\"i\"\":1,\"\"t\"\":2835,\"\"slft\"\":2679,\"\"sub\"\":[{\"\"n\"\":\"\"SQL:select * from sfv4_HOUA180885.REPORT_DEF WHERE REPORT_DEF_ID IN (SELECT REPORT_DEF_ID FROM sfv4_HA80885.REPORT_DTASET WHERE REPORT_ID=?) AND DELETED=? ORDER BY REPORT_DEF_ID asc NULLS LAST\"\",\"\"i\"\":17,\"\"t\"\":40,\"\"slft\"\":40,\"\"st\"\":337,\"\"m\"\":220958,\"\"nr\"\":154,\"\"rt\"\":0,\"\"rn\"\":22,\"\"fs\"\":0}]}   \",\"2022-11-09T21:32:22.351 0000\",p66cf1029,\"dc606_ss_application\",1,\"/app/tomcat/logs/pef.log\",\"perf_log_yxx\",swsskix13";

I want to extract the KEY=VALUE pairs, such as {PLV=REQ, CIP=9.9.9.7, CMN="Dub Airport Corporation Limited", STK={...}}, into a Map<String,String>.

I tried the following, but it does not work:

String[] str1 = s0.split("\\s(?=(([^\"]*\"))*[^\"]*$)\\s*");
System.out.println("Value of split string is " + Arrays.toString(str1));

Any input would be a great help.

CodePudding user response:

You can use this solution:

String s0 = "DC696,\"/xi/ajax/remoting/call/plaincall/adhocReportBuilderControllerProxy.getRortList.dwr\",\"2222-11-10 08:32:22,351               PLV=REQ CIP=9.9.9.7 CMID=syairp CMN=\"\"Dub Airport Corporation Limited\"\" SN=sfv4_APM180885. DPN=dbPool66HFT01 UID=3862D04108 UN=91F6025D47F01D IUID=1931 LOC=en_GB EID=\"\"EVENT-UNKNOWN-UNKNOWN-ob55abe0118-201110083217-396080\"\" AGN=\"\"[Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36 Edg/107.0.1418.35]\"\" RID=REQ-[7274545]  MTD=POST URL=\"\"/xi/ajax/remoting/call/plaincall/adhocRrtBuilderCoollerProxy.getRtList.dwr\"\" RQT=2835 MID=ADIN PID=ADMIN PQ=ADIN_PAGE SUB=0 MEM=2331036 CPU=2410 UCPU=2300 SCPU=110 FRE=10 FWR=0 NRE=2281 NWR=218 SQLC=43 SQLT=142 RPS=200 SID=60826A3FAB005A8A9B930177C5******.pc6bc1029 GID=e262dde6d0e040070b58afd4c8 HSID=ddc665538db779508d3213c0bb63bcb1c49fe8236d5f0884ae975915728e61 CSL=CRITICAL CCON=0 CSUP=0 CLOC=0 CEXT=0 CREM=0 STK={\"\"n\"\":\"\"/xi/ajax/remoting/call/plaincall/adhocReportBuilderControllerProxy.getrtList.dwr\"\",\"\"i\"\":1,\"\"t\"\":2835,\"\"slft\"\":2679,\"\"sub\"\":[{\"\"n\"\":\"\"SQL:select * from sfv4_HOUA180885.REPORT_DEF WHERE REPORT_DEF_ID IN (SELECT REPORT_DEF_ID FROM sfv4_HA80885.REPORT_DTASET WHERE REPORT_ID=?) AND DELETED=? ORDER BY REPORT_DEF_ID asc NULLS LAST\"\",\"\"i\"\":17,\"\"t\"\":40,\"\"slft\"\":40,\"\"st\"\":337,\"\"m\"\":220958,\"\"nr\"\":154,\"\"rt\"\":0,\"\"rn\"\":22,\"\"fs\"\":0}]}   \",\"2022-11-09T21:32:22.351 0000\",p66cf1029,\"dc606_ss_application\",1,\"/app/tomcat/logs/pef.log\",\"perf_log_yxx\",swsskix13";
// Needs java.util.HashMap, java.util.Map, java.util.regex.Matcher and java.util.regex.Pattern.
String regex = "(\\w+)=((?=\\{)(?:(?=.*?\\{(?!.*?\\3)(.*\\}(?!.*\\4).*))(?=.*?\\}(?!.*?\\4)(.*)).)+?.*?(?=\\3)[^{]*(?=\\4$)|\"{2}(.*?)\"{2}|(\\S+))";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(s0);
Map<String, String> res = new HashMap<String, String>();
while (m.find()) {
    String val = m.group(2);      // whole value; kept as-is for {...} values
    if (m.group(5) != null) {
        val = m.group(5);         // text between the doubled quotes
    }
    if (m.group(6) != null) {
        val = m.group(6);         // plain run of non-whitespace chars
    }
    res.put(m.group(1), val);
    System.out.println(m.group(1) + " => " + val + "\n----");
}

Output:

PLV => REQ
----
CIP => 9.9.9.7
----
CMID => syairp
----
CMN => Dub Airport Corporation Limited
----
SN => sfv4_APM180885.
----
DPN => dbPool66HFT01
----
UID => 3862D04108
----
UN => 91F6025D47F01D
----
IUID => 1931
----
LOC => en_GB
----
EID => EVENT-UNKNOWN-UNKNOWN-ob55abe0118-201110083217-396080
----
AGN => [Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36 Edg/107.0.1418.35]
----
RID => REQ-[7274545]
----
MTD => POST
----
URL => /xi/ajax/remoting/call/plaincall/adhocRrtBuilderCoollerProxy.getRtList.dwr
----
RQT => 2835
----
MID => ADIN
----
PID => ADMIN
----
PQ => ADIN_PAGE
----
SUB => 0
----
MEM => 2331036
----
CPU => 2410
----
UCPU => 2300
----
SCPU => 110
----
FRE => 10
----
FWR => 0
----
NRE => 2281
----
NWR => 218
----
SQLC => 43
----
SQLT => 142
----
RPS => 200
----
SID => 60826A3FAB005A8A9B930177C5******.pc6bc1029
----
GID => e262dde6d0e040070b58afd4c8
----
HSID => ddc665538db779508d3213c0bb63bcb1c49fe8236d5f0884ae975915728e61
----
CSL => CRITICAL
----
CCON => 0
----
CSUP => 0
----
CLOC => 0
----
CEXT => 0
----
CREM => 0
----
STK => {""n"":""/xi/ajax/remoting/call/plaincall/adhocReportBuilderControllerProxy.getrtList.dwr"",""i"":1,""t"":2835,""slft"":2679,""sub"":[{""n"":""SQL:select * from sfv4_HOUA180885.REPORT_DEF WHERE REPORT_DEF_ID IN (SELECT REPORT_DEF_ID FROM sfv4_HA80885.REPORT_DTASET WHERE REPORT_ID=?) AND DELETED=? ORDER BY REPORT_DEF_ID asc NULLS LAST"",""i"":17,""t"":40,""slft"":40,""st"":337,""m"":220958,""nr"":154,""rt"":0,""rn"":22,""fs"":0}]}
----
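
For comparison, the same three value shapes (a {...} block, a ""...""-wrapped string, and a plain token) can also be collected with a hand-written scan instead of a regex. The following is only a sketch and is not part of the answer above: the class name KeyValueScanner and its parse method are hypothetical, and it assumes that braces inside the {...} value are balanced and that a doubled quote never occurs inside a ""...""-wrapped value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class KeyValueScanner {

    // ASCII equivalent of the regex class \w.
    private static boolean isWordChar(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9') || c == '_';
    }

    // Hypothetical helper: scans left to right and collects KEY=VALUE pairs.
    static Map<String, String> parse(String text) {
        Map<String, String> result = new LinkedHashMap<>();
        int n = text.length();
        int i = 0;
        while (i < n) {
            int eq = text.indexOf('=', i);
            if (eq < 0) {
                break;
            }
            // Walk backwards from '=' to find where the key starts.
            int keyStart = eq;
            while (keyStart > i && isWordChar(text.charAt(keyStart - 1))) {
                keyStart--;
            }
            int valStart = eq + 1;
            // Skip a '=' that has no key before it or no value after it.
            if (keyStart == eq || valStart >= n
                    || Character.isWhitespace(text.charAt(valStart))) {
                i = eq + 1;
                continue;
            }
            String key = text.substring(keyStart, eq);
            String value;
            int valEnd;
            if (text.charAt(valStart) == '{') {
                // Count brace depth until the opening '{' is closed again.
                int depth = 0;
                valEnd = valStart;
                while (valEnd < n) {
                    char c = text.charAt(valEnd++);
                    if (c == '{') {
                        depth++;
                    } else if (c == '}') {
                        depth--;
                        if (depth == 0) {
                            break;
                        }
                    }
                }
                value = text.substring(valStart, valEnd);
            } else if (text.startsWith("\"\"", valStart)) {
                // Value wrapped in doubled quotes: keep only the text between them.
                int close = text.indexOf("\"\"", valStart + 2);
                if (close < 0) {
                    close = n;
                }
                value = text.substring(valStart + 2, close);
                valEnd = Math.min(n, close + 2);
            } else {
                // Plain token: runs until the next whitespace character.
                valEnd = valStart;
                while (valEnd < n && !Character.isWhitespace(text.charAt(valEnd))) {
                    valEnd++;
                }
                value = text.substring(valStart, valEnd);
            }
            result.put(key, value);
            i = valEnd;
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = "PLV=REQ CMN=\"\"Dub Airport\"\" STK={\"\"sub\"\":[{\"\"i\"\":1}]}  \",tail";
        parse(sample).forEach((k, v) -> System.out.println(k + " => " + v));
    }
}
```

Calling KeyValueScanner.parse(s0) on the full log string should yield the same kind of map as the regex loop. The scan is longer than the pattern, but each branch can be stepped through in a debugger, and a LinkedHashMap keeps the keys in log order.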


Regex details:

  • (\w+) - Group 1: one or more word chars
  • = - a = char
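  • ((?=\{)(?:(?=.*?\{(?!.*?\3)(.*\}(?!.*\4).*))(?=.*?\}(?!.*?\4)(.*)).)+?.*?(?=\3)[^{]*(?=\4$)|\"{2}(.*?)\"{2}|(\S+)) - Group 2: the value, matched by one of three alternatives tried in order:
      • (?=\{)(?:(?=.*?\{(?!.*?\3)(.*\}(?!.*\4).*))(?=.*?\}(?!.*?\4)(.*)).)+?.*?(?=\3)[^{]*(?=\4$) - a value that starts with {, matched up to the } that balances it, using lookaheads and backreferences to Groups 3 and 4
      • \"{2}(.*?)\"{2} - a value wrapped in doubled double quotes; Group 5 captures the text between them
      • (\S+) - Group 6: one or more non-whitespace chars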
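
To see how the quoted-value and plain-token alternatives at the end of the pattern interact, here is a reduced sketch that leaves out the brace-balancing branch entirely. It is an illustration rather than the answer's code: the class name SimplePairs is hypothetical, and because a capturing group was dropped, the quoted text is Group 3 and the plain token is Group 4 in this shorter pattern. It does not handle {...} values such as STK.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SimplePairs {

    // Key, '=', then either a ""...""-wrapped value (group 3)
    // or a run of non-whitespace characters (group 4).
    private static final Pattern PAIR =
            Pattern.compile("(\\w+)=(\"{2}(.*?)\"{2}|(\\S+))");

    static Map<String, String> parse(String line) {
        Map<String, String> pairs = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(line);
        while (m.find()) {
            // The quoted alternative comes first, so group 3 is set
            // whenever the value starts with a doubled quote.
            String value = m.group(3) != null ? m.group(3) : m.group(4);
            pairs.put(m.group(1), value);
        }
        return pairs;
    }

    public static void main(String[] args) {
        String line = "PLV=REQ CMN=\"\"Dub Airport Corporation Limited\"\" RQT=2835";
        parse(line).forEach((k, v) -> System.out.println(k + " => " + v));
    }
}
```

The order of the alternatives matters: if (\S+) came first, CMN would match only ""Dub and stop at the first space, so the quoted branch has to be tried before the plain-token branch.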