I have raw data that looks like this:
25023,Zwerg Mütze,0,1,986,3780
25871,red earth,0,1,38,8349
25931,K4m!k4z3,90,1,1539,2530
It is saved as a .txt file: https://de205.die-staemme.de/map/player.txt
The "characters" starting with % are Unicode escapes of some kind, as far as I can tell.
I found the following table about it: https://www.i18nqa.com/debug/utf8-debug.html
Here is my code so far:
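import urllib.request
# download the file, then read it back as UTF-8 text
urllib.request.urlretrieve(url, pfad + "player.txt")
f = open(pfad + "player.txt", "r", encoding="utf-8")
raw = f.read()
raw = raw.split("\n")
f.close()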
Python does not convert the %-characters. They are read as if they were separate characters.
Is there a way to convert these characters without calling .replace like 200 times?
Thank you very much in advance for help and/or useful hints!
CodePudding user response:
The %s are URL encoding (percent-encoding), not a text encoding, so open(..., encoding="utf-8") leaves them alone; use urllib.parse.unquote to decode the string.
>>> raw = """25023,Zwerg Mütze,0,1,986,3780
... 25871,red earth,0,1,38,8349
... 25931,K4m!k4z3,90,1,1539,2530"""
>>> import urllib.parse
>>> print(urllib.parse.unquote(raw))
25023,Zwerg Mütze,0,1,986,3780
25871,red earth,0,1,38,8349
25931,K4m!k4z3,90,1,1539,2530
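If the decoded names can themselves contain commas, it may be safer to split each line into fields first and only then decode the name field. The following is a minimal sketch under that assumption; `parse_players` and the percent-encoded `sample` string are hypothetical and only stand in for the real file contents.

```python
from urllib.parse import unquote


def parse_players(text):
    """Split each line on commas first, then decode only the name field."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines, e.g. a trailing newline
        fields = line.split(",")
        # Field 1 is assumed to be the player name (as in the sample data).
        fields[1] = unquote(fields[1])
        rows.append(fields)
    return rows


# Hypothetical percent-encoded input, assumed to resemble the raw file.
sample = (
    "25023,Zwerg%20M%C3%BCtze,0,1,986,3780\n"
    "25931,K4m%21k4z3,90,1,1539,2530\n"
)
rows = parse_players(sample)
print(rows[0][1])
print(rows[1][1])
```

If the file encodes spaces as `+` rather than `%20`, `urllib.parse.unquote_plus` can be swapped in for `unquote`, since it additionally turns `+` into a space.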