Parse XML with Python using xml.etree

Time:10-06

I have XML such as:

<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Microsoft-Windows-Security-Auditing" Guid="{54849625-5478-4994-A5BA-3E3B0328C30D}" /> 
 <EventID>4771</EventID> 
 <Version>0</Version> 
 <Level>0</Level> 
 <Task>14339</Task> 
 <Opcode>0</Opcode> 
 <Keywords>0x8010000000000000</Keywords> 
 <TimeCreated SystemTime="2015-08-07T18:10:21.495462300Z" /> 
 <EventRecordID>166708</EventRecordID> 
 <Correlation /> 
 <Execution ProcessID="520" ThreadID="1084" /> 
 <Channel>Security</Channel> 
 <Computer>DC01.contoso.local</Computer> 
 <Security /> 
 </System>
 <EventData>
 <Data Name="TargetUserName">dadmin</Data> 
 <Data Name="TargetSid">S-1-5-21-3457937927-2839227994-823803824-1104</Data> 
 <Data Name="ServiceName">krbtgt/CONTOSO.LOCAL</Data> 
 <Data Name="TicketOptions">0x40810010</Data> 
 <Data Name="Status">0x10</Data> 
 <Data Name="PreAuthType">15</Data> 
 <Data Name="IpAddress">::ffff:10.0.0.12</Data> 
 <Data Name="IpPort">49254</Data> 
 <Data Name="CertIssuerName" /> 
 <Data Name="CertSerialNumber" /> 
 <Data Name="CertThumbprint" /> 
 </EventData>
 </Event>

I want to read data from this XML using a list of tag names supplied via input(), such as EventID, Computer, TargetUserName, IpAddress, etc.

I have some code that can parse System, but how do I parse the EventData child nodes and their values?

import xml.etree.ElementTree as ET

tree = ET.parse('C:/tmp/test.xml')
root = tree.getroot()

# Namespace prefix in ElementTree's "{uri}" form, taken from the root tag
namespace = "{" + root.tag.split('}')[0].strip('{') + "}"

for child in root:
  print(child.tag, child.attrib)
  print((''.join([child.tag for child in child.iter()])).replace(namespace,"| "))

print("\n\n\n\n")
testlist = ["EventID", "Computer"]
for name in testlist:
  for elem in root.iter(namespace + name):  # e.g. 'EventID'
    print(name, ":", elem.text)

The result I need to get:

System: Provider Name | EventID | Version | Level | Task | Opcode | Keywords | TimeCreated | EventRecordID | ProcessID | ThreadID | Channel | Computer
EventData: TargetUserName | TargetSid | ServiceName | TicketOptions | Status | PreAuthType | IpAddress | IpPort | CertIssuerName | CertSerialNumber | CertThumbprint
    
EventID : 4771
Status : 0x10
PreAuthType : 15
Computer : DC01.contoso.local
TimeCreated : 2015-08-07T18:10:21.495462300Z
TargetUserName : dadmin
ServiceName : krbtgt/CONTOSO.LOCAL
IpAddress : ::ffff:10.0.0.12
IpPort : 49254

CodePudding user response:

My current solution:

import xml.etree.ElementTree as ET
from datetime import datetime

tree = ET.parse('C:/tmp/test.xml')
root = tree.getroot()

# Namespace prefix in ElementTree's "{uri}" form, taken from the root tag
namespace = "{" + root.tag.split('}')[0].strip('{') + "}"

# List the child tags of <System> and the Name attributes of <Data>
for system in root.iter(namespace + 'System'):
  print('System: ', ', '.join(child.tag.replace(namespace, "") for child in system))

data_names = [data.get('Name') for data in root.iter(namespace + 'Data')]
print('EventData: ', ', '.join(data_names))

print("\nEnter all tags to search separated by commas or leave blank to use default list: EventID, TimeCreated, Computer, TargetUserName, IpAddress, IpPort")
tags = [name.strip() for name in input().split(',')]
if tags == ['']:
  tags = ["EventID", "TimeCreated", "Computer", "TargetUserName", "IpAddress", "IpPort"]

# Tags under <System>: the value is the element text, except TimeCreated,
# which stores it in the SystemTime attribute
for searchtag in tags:
  for tag in root.iter(namespace + searchtag):
    if searchtag == "TimeCreated":
      time = tag.get('SystemTime')
      # Drop the fractional seconds before parsing
      print(searchtag, datetime.strptime(time.split(".")[0], '%Y-%m-%dT%H:%M:%S'))
    else:
      print(searchtag, ":", tag.text)

# Tags under <EventData>: match <Data> elements by their Name attribute
for searchtag in tags:
  for tag in root.iter(namespace + 'Data'):
    if tag.get('Name') == searchtag:
      print(searchtag, ":", tag.text)

print("\n\n")
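A possible alternative, as a minimal sketch: instead of building the "{uri}" prefix by hand and string-replacing the attrib dict, a namespace map can be passed to find()/findall() and attributes read with get(). The prefix "e", the helper names event_to_dict and lookup, and the shortened SAMPLE event are assumptions made for illustration only; the real file would be loaded with ET.parse() as above.

```python
import xml.etree.ElementTree as ET

# Shortened, illustrative copy of the event from the question.
SAMPLE = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <EventID>4771</EventID>
    <TimeCreated SystemTime="2015-08-07T18:10:21.495462300Z" />
    <Computer>DC01.contoso.local</Computer>
  </System>
  <EventData>
    <Data Name="TargetUserName">dadmin</Data>
    <Data Name="IpAddress">::ffff:10.0.0.12</Data>
    <Data Name="CertIssuerName" />
  </EventData>
</Event>"""

# The prefix "e" is arbitrary; only the URI has to match the document.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}


def event_to_dict(root):
    """Flatten <System> children and <EventData>/<Data> entries into one dict."""
    values = {}
    for child in root.find("e:System", NS):
        name = child.tag.split("}", 1)[-1]  # drop the "{namespace}" prefix
        # Elements such as TimeCreated carry their value in attributes,
        # so fall back to the attribute dict when there is no text.
        values[name] = child.text if child.text else dict(child.attrib)
    for data in root.findall("e:EventData/e:Data", NS):
        # <Data> elements are identified by their Name attribute.
        values[data.get("Name")] = data.text
    return values


def lookup(values, names):
    """Return (name, value) pairs; names that are not present map to None."""
    return [(name, values.get(name)) for name in names]


root = ET.fromstring(SAMPLE)
values = event_to_dict(root)
for name, value in lookup(values, ["EventID", "Computer", "TargetUserName", "IpAddress"]):
    print(name, ":", value)
```

Collecting everything into one dict first means the System tags and the Data names can be searched with the same loop, at the cost of assuming that no Data name collides with a System tag name.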