I'm having an issue with a basic socket server. I have no control over the client for this server, and the client sends XML messages of unknown length, each terminated by a known set of characters. The issue can be reproduced with the following:
import socket

server_address = ('192.168.2.47', 10000)

# client
def client():
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect(server_address)
    # sockets carry bytes, so the payloads are bytes literals
    sock.send(b'<messageBody><ew32f/><dwadwa/></messageBody>')
    sock.send(b'<messageBody><dwaaw/><fewwfe/></messageBody>')
    sock.send(b'<messageBody><ewqf3x/><awdwad2/></messageBody>')
    # the socket will stay connected so long as the client continues sending data, which could be days or more

# server
def server():
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(server_address)
    sock.listen(1)  # listen() must be called before accept(); accept() takes no arguments
    client, addr = sock.accept()
    # I need to find a way to receive the client data such that it stops receiving at the </messageBody> tag
I'm trying to find the most efficient way of going about this, as the server may be receiving several hundred messages per second from various clients. The size of these messages could be anywhere between a few bytes and several kilobytes.
CodePudding user response:
I think Python's expat parser can help you; it allows streaming parsing, and the chunks can be fragments of XML (like the unclosed <bar element in my example below).
I'm pretty sure I comprehend your issue and its context. Here's my attempt to show your server receiving this sample XML:
<root>
    <foo />
    <bar />
    <messageBody>
        <ewqf3x />
        <awdwad2 />
    </messageBody>
    <baz />
</root>
but in chunks, as if the client were feeding you the entire XML body over many calls. Each chunk is parsed, and when the </messageBody> end-tag is read, an exception is raised, which is your signal that you have everything you need and can stop processing (listening?).
#!/usr/bin/env python3
import sys
from xml.parsers.expat import ParserCreate


class FoundMessageBodyEnd(Exception):
    pass


def end_element(name):
    print(f'Processing end-tag for {name}')
    if name == 'messageBody':
        # This may not be the right way to do this
        raise FoundMessageBodyEnd


p = ParserCreate()
p.EndElementHandler = end_element

streaming_chunks = [
    '''<root>
<foo />
<bar ''',  # notice that bar is not closed till the first line of the next chunk
    '''/>
<messageBody>
<ewqf3x />
<awdwad2 />''',
    ''' </messageBody>''',
    ''' <baz />
</root>''',
]

parsed = 0  # number of chunks parsed without hitting the delimiter
for chunk in streaming_chunks:
    try:
        p.Parse(chunk)
        parsed += 1
    except FoundMessageBodyEnd:
        print(f'After parsing {parsed + 1} chunks, found messageBody delimiter, done.')
        sys.exit(1)
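One caveat with the expat approach: a parser expects a single root element, so back-to-back <messageBody> documents on one long-lived connection would, as far as I can tell, need a fresh parser for each message. If the closing tag is a reliable delimiter, a simpler alternative may be to buffer whatever recv() returns and cut the buffer at each </messageBody>. Below is a minimal sketch of that idea, not a definitive implementation. The names DELIMITER, split_messages and receive_messages are made up for illustration, and socket.socketpair() only stands in for the real client connection. It assumes the delimiter never appears inside a message body (for example in CDATA or a comment).

```python
import socket

DELIMITER = b'</messageBody>'  # end-of-message marker taken from the question


def split_messages(buffer: bytes):
    """Return (complete_messages, leftover) for the bytes buffered so far."""
    messages = []
    while True:
        end = buffer.find(DELIMITER)
        if end == -1:
            break  # no complete message left in the buffer
        end += len(DELIMITER)
        messages.append(buffer[:end])
        buffer = buffer[end:]
    return messages, buffer


def receive_messages(conn, bufsize=4096):
    """Yield one complete message at a time from a connected socket."""
    buffer = b''
    while True:
        data = conn.recv(bufsize)
        if not data:  # peer closed the connection
            break
        buffer += data
        messages, buffer = split_messages(buffer)
        yield from messages


if __name__ == '__main__':
    # socketpair() stands in for a real client/server connection.
    left, right = socket.socketpair()
    # The first message is deliberately split across two sends.
    left.sendall(b'<messageBody><ew32f/></messa')
    left.sendall(b'geBody><messageBody><dwaaw/></messageBody>')
    left.close()
    for message in receive_messages(right):
        print(message.decode())
    right.close()
```

In a real server, receive_messages(client) would be called on the socket returned by accept(). Because TCP is a byte stream, one recv() may return half a message or several messages at once, which is why the leftover bytes are carried over to the next iteration instead of being discarded.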