I'm using the etree parser to edit an XML-based config file. I can read, find, and edit the text I want to change, but when I write the data to a new file, "°C" is written as "&#176;C". I would like it to remain as is, i.e. "°C". Could somebody explain why the parser replaces it like this?
Example:
Original line: <parameter name="Temperature" Units="°C">30</parameter>
(Run the Python script, find "30", set it to "200", and write the line to a new file.)
Edited line: <parameter name="Temperature" Units="&#176;C">200</parameter>
Could somebody help me understand this?
CodePudding user response:
Per the documentation, etree.tostring() outputs ASCII-encoded bytes by default, and the ° symbol cannot be represented in ASCII except as a numeric character reference (&#176;). To get unicode output instead, pass the encoding parameter, e.g. encoding="unicode".
In [12]: string = '<parameter name="Temperature" Units="°C">30</parameter>'
In [13]: root = etree.fromstring(string)
In [14]: etree.tostring(root)
Out[14]: b'<parameter name="Temperature" Units="&#176;C">30</parameter>'
In [15]: etree.tostring(root, encoding="unicode")
Out[15]: '<parameter name="Temperature" Units="°C">30</parameter>'
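Since the question is about writing to a new file rather than producing a string, here is a minimal sketch of the same idea applied to a file. It assumes the standard-library xml.etree.ElementTree module (the question does not say which etree is in use), and the output file name edited.xml is hypothetical.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Same element as in the question, with the value changed from 30 to 200.
source = '<parameter name="Temperature" Units="°C">30</parameter>'
root = ET.fromstring(source)
root.text = "200"

# Default serialization is ASCII, so "°" becomes the reference &#176;.
ascii_bytes = ET.tostring(root)

# Requesting UTF-8 keeps the degree sign as a literal character.
utf8_bytes = ET.tostring(root, encoding="utf-8")

# Hypothetical output path; writing with an explicit encoding behaves the same way.
out_path = os.path.join(tempfile.mkdtemp(), "edited.xml")
ET.ElementTree(root).write(out_path, encoding="utf-8", xml_declaration=True)

# Read the file back as UTF-8 to check what was written.
with open(out_path, encoding="utf-8") as fh:
    written = fh.read()
```

Both forms are equivalent to an XML parser: &#176; and a literal ° decode to the same character, so the explicit encoding only matters if the file should stay readable as plain text.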