Integer keys turn into string keys after using json.dump()


I have a CSV file that has two columns, one for French words and one for English words:

French,English
partie,part
histoire,history
chercher,search
seulement,only
police,police

I tried to convert the CSV data to JSON, but the output is not what I expected: the index, which I wanted to be an integer, appears as a string in the JSON file. I tried wrapping it in int() as well, but that doesn't fix it.

How can I keep the index values as integers inside the JSON file?

import json
import pandas

data = pandas.read_csv("data/french_words.csv")

# Build a dict keyed by the (integer) row index.
words = {
    int(index): {
        "French": row.French,
        "English": row.English,
        "known": None,
    }
    for index, row in data.iterrows()
}

with open("words.json", mode="w") as words_file:
    json.dump(words, words_file, indent=4)

print(words)

Output:

{
0: {'French': 'partie', 'English': 'part', 'known': None}, 
1: {'French': 'histoire', 'English': 'history', 'known': None}, 
2: {'French': 'chercher', 'English': 'search', 'known': None}, 
3: {'French': 'seulement', 'English': 'only', 'known': None}, 
4: {'French': 'police', 'English': 'police', 'known': None}, 
5: {'French': 'pensais', 'English': 'thought', 'known': None},
...
}

The full result contains 100 key/value pairs; only the first six are shown here.

JSON file output:

{
    "0": {
        "French": "partie",
        "English": "part",
        "known": null
    },
    "1": {
        "French": "histoire",
        "English": "history",
        "known": null
    },
    "2": {
        "French": "chercher",
        "English": "search",
        "known": null
    },
    "3": {
        "French": "seulement",
        "English": "only",
        "known": null
    },
    "4": {
        "French": "police",
        "English": "police",
        "known": null
    },
    "5": {
        "French": "pensais",
        "English": "thought",
        "known": null
    },
    ...
}

CodePudding user response:

The short answer is that you can't. The JSON standard requires object keys to be strings, so when you export, Python does its best to produce valid JSON by coercing the integer keys to strings.

The natural solution is to pull the CSV rows into an array of objects, which then exports cleanly; see the sketch below.
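A minimal sketch of that approach, assuming the same data/french_words.csv file as in the question:

import json
import pandas

data = pandas.read_csv("data/french_words.csv")

# A JSON array has no keys at all, so nothing gets coerced to a string;
# each row becomes one object in the list.
words = [
    {"French": row.French, "English": row.English, "known": None}
    for _, row in data.iterrows()
]

with open("words.json", mode="w") as words_file:
    json.dump(words, words_file, indent=4)

After loading, entries are addressed by integer position (words[0], words[1], ...), which is effectively what the integer keys were doing anyway.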

CodePudding user response:

This is one of those subtle differences among mapping implementations that can bite you: JSON treats keys as strings, while Python allows distinct keys that differ only in type. For example:
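A quick demonstration (the values here are made up, but the behaviour is that of the standard json module):

import json

# Python happily holds two distinct keys that differ only in type...
d = {1: "int key", "1": "str key"}

# ...but serializing coerces both to the string "1", producing
# duplicate keys in the JSON text.
print(json.dumps(d))                      # {"1": "int key", "1": "str key"}

# And a round trip silently turns an int key into a str key.
print(json.loads(json.dumps({1: "a"})))   # {'1': 'a'}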

In Python (and apparently in Lua) the keys of a mapping (dictionary or table, respectively) are object references. In Python they must be hashable: either immutable built-in types, or objects that implement a __hash__ method. (The Lua docs suggest that it automatically uses an object's identity as the hash/key even for mutable objects, relying on string interning to ensure that equivalent strings map to the same object.)

JSON started as a JavaScript serialization format. (JSON stands for JavaScript Object Notation.) Naturally, its mapping notation follows JavaScript's object semantics, in which keys are strings.

If both ends of your serialization are going to be Python, you'd be better off using the pickle module. If you really need to convert these back from JSON into native Python objects, you have a couple of choices. First, you could wrap dictionary look-ups in try: ... except: ... and retry with the key converted to a number whenever a look-up fails. Alternatively, if you control the other end (the serializer or generator of this JSON data), you could have it serialize each key value separately and provide them as a list of keys; your Python code would then iterate over that list, deserializing each key into a native Python object, and use those objects to access the values in the mapping. A related conversion at load time is sketched below.
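As a concrete sketch of the first idea, the try/except conversion can also be done once at load time rather than on every failed look-up. json.load's object_hook runs on every decoded JSON object, so a hook like the following (assuming a file like the words.json from the question) restores integer keys:

import json

def int_keys(obj):
    # Called for every JSON object decoded; try to turn each key back
    # into an int, leaving genuinely string-valued keys untouched.
    converted = {}
    for key, value in obj.items():
        try:
            key = int(key)
        except ValueError:
            pass
        converted[key] = value
    return converted

with open("words.json") as words_file:
    words = json.load(words_file, object_hook=int_keys)

print(type(next(iter(words))))  # <class 'int'>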
