I have a log string like this:
String s0 = "DC696,\"/xi/ajax/remoting/call/plaincall/adhocReportBuilderControllerProxy.getRortList.dwr\",\"2222-11-10 08:32:22,351 PLV=REQ CIP=9.9.9.7 CMID=syairp CMN=\"\"Dub Airport Corporation Limited\"\" SN=sfv4_APM180885. DPN=dbPool66HFT01 UID=3862D04108 UN=91F6025D47F01D IUID=1931 LOC=en_GB EID=\"\"EVENT-UNKNOWN-UNKNOWN-ob55abe0118-201110083217-396080\"\" AGN=\"\"[Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36 Edg/107.0.1418.35]\"\" RID=REQ-[7274545] MTD=POST URL=\"\"/xi/ajax/remoting/call/plaincall/adhocRrtBuilderCoollerProxy.getRtList.dwr\"\" RQT=2835 MID=ADIN PID=ADMIN PQ=ADIN_PAGE SUB=0 MEM=2331036 CPU=2410 UCPU=2300 SCPU=110 FRE=10 FWR=0 NRE=2281 NWR=218 SQLC=43 SQLT=142 RPS=200 SID=60826A3FAB005A8A9B930177C5******.pc6bc1029 GID=e262dde6d0e040070b58afd4c8 HSID=ddc665538db779508d3213c0bb63bcb1c49fe8236d5f0884ae975915728e61 CSL=CRITICAL CCON=0 CSUP=0 CLOC=0 CEXT=0 CREM=0 STK={\"\"n\"\":\"\"/xi/ajax/remoting/call/plaincall/adhocReportBuilderControllerProxy.getrtList.dwr\"\",\"\"i\"\":1,\"\"t\"\":2835,\"\"slft\"\":2679,\"\"sub\"\":[{\"\"n\"\":\"\"SQL:select * from sfv4_HOUA180885.REPORT_DEF WHERE REPORT_DEF_ID IN (SELECT REPORT_DEF_ID FROM sfv4_HA80885.REPORT_DTASET WHERE REPORT_ID=?) AND DELETED=? ORDER BY REPORT_DEF_ID asc NULLS LAST\"\",\"\"i\"\":17,\"\"t\"\":40,\"\"slft\"\":40,\"\"st\"\":337,\"\"m\"\":220958,\"\"nr\"\":154,\"\"rt\"\":0,\"\"rn\"\":22,\"\"fs\"\":0}]} \",\"2022-11-09T21:32:22.351 0000\",p66cf1029,\"dc606_ss_application\",1,\"/app/tomcat/logs/pef.log\",\"perf_log_yxx\",swsskix13";
I want to extract the KEY=VALUE pairs, for example {PLV=REQ, CIP=9.9.9.7, CMN="Dub Airport Corporation Limited", STK={...}}, into a Map<String,String>.
I tried the following, which does not work:
String[] str1 = str.split("\\s(?=(([^\"]*\"))*[^\"]*$)\\s*");
System.out.println("Value of split string is " + Arrays.toString(str1));
Any input would be a great help.
CodePudding user response:
You can use this solution:
// Imports needed by the snippet below
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
String s0 = "DC696,\"/xi/ajax/remoting/call/plaincall/adhocReportBuilderControllerProxy.getRortList.dwr\",\"2222-11-10 08:32:22,351 PLV=REQ CIP=9.9.9.7 CMID=syairp CMN=\"\"Dub Airport Corporation Limited\"\" SN=sfv4_APM180885. DPN=dbPool66HFT01 UID=3862D04108 UN=91F6025D47F01D IUID=1931 LOC=en_GB EID=\"\"EVENT-UNKNOWN-UNKNOWN-ob55abe0118-201110083217-396080\"\" AGN=\"\"[Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36 Edg/107.0.1418.35]\"\" RID=REQ-[7274545] MTD=POST URL=\"\"/xi/ajax/remoting/call/plaincall/adhocRrtBuilderCoollerProxy.getRtList.dwr\"\" RQT=2835 MID=ADIN PID=ADMIN PQ=ADIN_PAGE SUB=0 MEM=2331036 CPU=2410 UCPU=2300 SCPU=110 FRE=10 FWR=0 NRE=2281 NWR=218 SQLC=43 SQLT=142 RPS=200 SID=60826A3FAB005A8A9B930177C5******.pc6bc1029 GID=e262dde6d0e040070b58afd4c8 HSID=ddc665538db779508d3213c0bb63bcb1c49fe8236d5f0884ae975915728e61 CSL=CRITICAL CCON=0 CSUP=0 CLOC=0 CEXT=0 CREM=0 STK={\"\"n\"\":\"\"/xi/ajax/remoting/call/plaincall/adhocReportBuilderControllerProxy.getrtList.dwr\"\",\"\"i\"\":1,\"\"t\"\":2835,\"\"slft\"\":2679,\"\"sub\"\":[{\"\"n\"\":\"\"SQL:select * from sfv4_HOUA180885.REPORT_DEF WHERE REPORT_DEF_ID IN (SELECT REPORT_DEF_ID FROM sfv4_HA80885.REPORT_DTASET WHERE REPORT_ID=?) AND DELETED=? ORDER BY REPORT_DEF_ID asc NULLS LAST\"\",\"\"i\"\":17,\"\"t\"\":40,\"\"slft\"\":40,\"\"st\"\":337,\"\"m\"\":220958,\"\"nr\"\":154,\"\"rt\"\":0,\"\"rn\"\":22,\"\"fs\"\":0}]} \",\"2022-11-09T21:32:22.351 0000\",p66cf1029,\"dc606_ss_application\",1,\"/app/tomcat/logs/pef.log\",\"perf_log_yxx\",swsskix13";
String regex = "(\\w+)=((?=\\{)(?:(?=.*?\\{(?!.*?\\3)(.*\\}(?!.*\\4).*))(?=.*?\\}(?!.*?\\4)(.*)).)+?.*?(?=\\3)[^{]*(?=\\4$)|\"{2}(.*?)\"{2}|(\\S+))";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(s0);
Map<String, String> res = new HashMap<String, String>();
while(m.find()) {
String val = m.group(2);
if (m.group(5) != null) {
val = m.group(5);
}
if (m.group(6) != null) {
val = m.group(6);
}
res.put(m.group(1), val);
System.out.println(m.group(1) + " => " + val + "\n----");
}
Output:
PLV => REQ
----
CIP => 9.9.9.7
----
CMID => syairp
----
CMN => Dub Airport Corporation Limited
----
SN => sfv4_APM180885.
----
DPN => dbPool66HFT01
----
UID => 3862D04108
----
UN => 91F6025D47F01D
----
IUID => 1931
----
LOC => en_GB
----
EID => EVENT-UNKNOWN-UNKNOWN-ob55abe0118-201110083217-396080
----
AGN => [Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36 Edg/107.0.1418.35]
----
RID => REQ-[7274545]
----
MTD => POST
----
URL => /xi/ajax/remoting/call/plaincall/adhocRrtBuilderCoollerProxy.getRtList.dwr
----
RQT => 2835
----
MID => ADIN
----
PID => ADMIN
----
PQ => ADIN_PAGE
----
SUB => 0
----
MEM => 2331036
----
CPU => 2410
----
UCPU => 2300
----
SCPU => 110
----
FRE => 10
----
FWR => 0
----
NRE => 2281
----
NWR => 218
----
SQLC => 43
----
SQLT => 142
----
RPS => 200
----
SID => 60826A3FAB005A8A9B930177C5******.pc6bc1029
----
GID => e262dde6d0e040070b58afd4c8
----
HSID => ddc665538db779508d3213c0bb63bcb1c49fe8236d5f0884ae975915728e61
----
CSL => CRITICAL
----
CCON => 0
----
CSUP => 0
----
CLOC => 0
----
CEXT => 0
----
CREM => 0
----
STK => {""n"":""/xi/ajax/remoting/call/plaincall/adhocReportBuilderControllerProxy.getrtList.dwr"",""i"":1,""t"":2835,""slft"":2679,""sub"":[{""n"":""SQL:select * from sfv4_HOUA180885.REPORT_DEF WHERE REPORT_DEF_ID IN (SELECT REPORT_DEF_ID FROM sfv4_HA80885.REPORT_DTASET WHERE REPORT_ID=?) AND DELETED=? ORDER BY REPORT_DEF_ID asc NULLS LAST"",""i"":17,""t"":40,""slft"":40,""st"":337,""m"":220958,""nr"":154,""rt"":0,""rn"":22,""fs"":0}]}
----
See the regex demo.
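To try the pattern without the full log line, here is a minimal, self-contained sketch. The class name KvRegexDemo and the shortened SAMPLE line are assumptions made for illustration (the sample reuses a few values from the log above); the regex and the group handling are the same as in the snippet above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only the regex and group logic come from the answer.
final class KvRegexDemo {
    // Shortened, made-up sample in the same CSV-escaped style as the log line.
    static final String SAMPLE =
        "PLV=REQ CIP=9.9.9.7 CMN=\"\"Dub Airport Corporation Limited\"\" RQT=2835 "
        + "STK={\"\"n\"\":\"\"a\"\",\"\"sub\"\":[{\"\"i\"\":17}]} \",tail";

    private static final Pattern PAIR = Pattern.compile(
        "(\\w+)=((?=\\{)(?:(?=.*?\\{(?!.*?\\3)(.*\\}(?!.*\\4).*))"
        + "(?=.*?\\}(?!.*?\\4)(.*)).)+?.*?(?=\\3)[^{]*(?=\\4$)"
        + "|\"{2}(.*?)\"{2}|(\\S+))");

    static Map<String, String> parse(String line) {
        // LinkedHashMap keeps the keys in the order they appear in the line.
        Map<String, String> res = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(line);
        while (m.find()) {
            String val = m.group(2);                   // brace-delimited value, or fallback
            if (m.group(5) != null) val = m.group(5);  // text between doubled quotes
            if (m.group(6) != null) val = m.group(6);  // bare non-whitespace token
            res.put(m.group(1), val);
        }
        return res;
    }

    public static void main(String[] args) {
        parse(SAMPLE).forEach((k, v) -> System.out.println(k + " => " + v));
    }
}
```

Because the brace alternative ends with (?=\4$), it relies on the input being a single line; a multi-line log would presumably need to be split into lines first.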
Regex details:
- (\w+) - Group 1: one or more word chars
- = - a = char
- ((?=\{)(?:(?=.*?\{(?!.*?\3)(.*\}(?!.*\4).*))(?=.*?\}(?!.*?\4)(.*)).)+?.*?(?=\3)[^{]*(?=\4$)|\"{2}(.*?)\"{2}|(\S+)) - Group 2, one of three alternatives:
  - (?=\{)(?:(?=.*?\{(?!.*?\3)(.*\}(?!.*\4).*))(?=.*?\}(?!.*?\4)(.*)).)+?.*?(?=\3)[^{]*(?=\4$) - a substring between two paired curly braces (adapted from "Is it possible to match nested brackets with a regex without using recursion or balancing groups?")
  - | - or
  - \"{2}(.*?)\"{2} - two " chars, then any zero or more chars other than line break chars, as few as possible (captured into Group 5), and then two " chars
  - | - or
  - (\S+) - one or more non-whitespace chars (captured into Group 6)
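The three value shapes listed above (brace-delimited, doubled-quote, bare token) can also be recognized without backreferences, by walking the string and counting braces. The following is only a sketch of that alternative technique, not the regex from the answer; the class name KvScanDemo, the helper isWord, and the shortened SAMPLE line are assumptions made for illustration. Its behavior is assumed to differ from the regex in edge cases: an unbalanced { consumes the rest of the line, and an empty value after = is accepted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical hand-written scanner; an alternative to the regex, not a copy of it.
final class KvScanDemo {
    // Shortened, made-up sample in the same CSV-escaped style as the log line.
    static final String SAMPLE =
        "PLV=REQ CIP=9.9.9.7 CMN=\"\"Dub Airport Corporation Limited\"\" RQT=2835 "
        + "STK={\"\"n\"\":\"\"a\"\",\"\"sub\"\":[{\"\"i\"\":17}]} \",tail";

    // Mirrors the ASCII meaning of \w in Java regexes: [a-zA-Z_0-9].
    private static boolean isWord(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
            || (c >= '0' && c <= '9') || c == '_';
    }

    static Map<String, String> parse(String s) {
        Map<String, String> out = new LinkedHashMap<>();
        int i = 0, n = s.length();
        while (i < n) {
            int start = i;
            while (i < n && isWord(s.charAt(i))) i++;   // read a candidate key
            if (i == start) { i++; continue; }          // not a word char: skip it
            if (i >= n || s.charAt(i) != '=') continue; // word not followed by '='
            String key = s.substring(start, i);
            i++;                                        // skip '='
            int vStart = i;
            String val;
            if (i < n && s.charAt(i) == '{') {
                // Brace-delimited value: count depth until the braces balance.
                int depth = 0;
                do {
                    char c = s.charAt(i++);
                    if (c == '{') depth++;
                    else if (c == '}') depth--;
                } while (i < n && depth > 0);
                val = s.substring(vStart, i);
            } else if (s.startsWith("\"\"", i)) {
                // Doubled-quote value: take the text up to the next pair of quotes.
                int end = s.indexOf("\"\"", i + 2);
                if (end < 0) end = n;
                val = s.substring(i + 2, end);
                i = Math.min(n, end + 2);
            } else {
                // Bare value: everything up to the next whitespace.
                while (i < n && !Character.isWhitespace(s.charAt(i))) i++;
                val = s.substring(vStart, i);
            }
            out.put(key, val);
        }
        return out;
    }

    public static void main(String[] args) {
        parse(SAMPLE).forEach((k, v) -> System.out.println(k + " => " + v));
    }
}
```

The scanner runs in a single pass over the line, which may matter for long STK values; the regex stays the shorter option when the input is known to be well formed.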