I'm somewhat confused by yaml parsing results. I made a test.yaml and the results are the same.
val_1: 05334000
val_2: 2345784
val_3: 0537380
str_1: foobar
val_4: 05798
val_5: 051342123
Parsing that with:
import yaml
with open('test.yaml', 'r', encoding='utf8') as f:
a = yaml.load(f, Loader=yaml.FullLoader)
returns:
{'val_1': 1423360,
'val_2': 2345784,
'val_3': '0537380',
'str_1': 'foobar',
'val_4': '05798',
'val_5': 10863699}
Why these values for val_1 and val_5? Is there something special?
In my real data with many yaml files there are values like val_1. For some they parsed correct but for some they don't? All starts with 05, followed by more numbers. Caused by the leading 0 results should be strings. But yaml parses something completely different.
If I read the yaml as textfile f.readlines()
, all is fine:
['val_1: 05334000\n',
'val_2: 2345784\n',
'val_3: 0537380\n',
'str_1: foobar\n',
'val_4: 05798\n',
'val_5: 051342123\n']
CodePudding user response:
Integers with a leading 0
are parsed as octal; in python you'd need to write them with a leading 0o
:
0o5334000 == 1423360
as for '0537380'
: as there is an 8
present as digit it can not be parsed as an octal number. therefore it remains a string.
if you want to get strings for all your entries you can use the BaseLoader
from io import StringIO
import yaml
file = StringIO("""
val_1: 05334000
val_2: 2345784
val_3: 0537380
str_1: foobar
val_4: 05798
val_5: 051342123
""")
dct = yaml.load(file, Loader=yaml.BaseLoader)
with that i get:
{'val_1': '05334000', 'val_2': '2345784', 'val_3': '0537380',
'str_1': 'foobar', 'val_4': '05798', 'val_5': '051342123'}