I have a huge dataset and I want to remove the null entries from it. Please help me out.
[
    {
        "Disease": "Impetigo",
        "Symptoms": [
            " skin_rash",
            " high_fever",
            " blister",
            " red_sore_around_nose",
            " yellow_crust_ooze",
            null,
            null,
            null,
            null,
            null,
            null,
            null,
            null,
            null,
            null,
            null,
            null
        ]
    }
]
The output should be:
[
    {
        "Disease": "Impetigo",
        "Symptoms": [
            " skin_rash",
            " high_fever",
            " blister",
            " red_sore_around_nose",
            " yellow_crust_ooze"
        ]
    }
]
CodePudding user response:
You can use filter() to remove the null entries:
import json
with open('input.json') as input_file, open('output.json', 'w') as output_file:
    data = json.load(input_file)
    for disease in data:
        # Keep every symptom that is not None (JSON null)
        disease['Symptoms'] = list(filter(lambda x: x is not None, disease['Symptoms']))
    json.dump(data, output_file, indent=4)
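One caveat that may be worth a quick sketch: the shorter form filter(None, ...) drops every falsy value, not only null, so empty strings or 0 would disappear as well. The sample list below is made up for illustration and is not taken from the dataset in the question.

```python
# Hypothetical sample list, invented for this illustration.
symptoms = [" skin_rash", "", None, " blister", None]

# Explicit identity check: removes only None.
only_none_removed = list(filter(lambda x: x is not None, symptoms))

# filter(None, ...) removes every falsy value, including the empty string.
all_falsy_removed = list(filter(None, symptoms))

print(only_none_removed)  # [' skin_rash', '', ' blister']
print(all_falsy_removed)  # [' skin_rash', ' blister']
```

If the data can contain empty strings that should be kept, the explicit "is not None" check is probably the safer choice.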
CodePudding user response:
You can use a list comprehension for this. Starting from the JSON you have and parsing it into Python objects (a list of dictionaries), it would look something like this:
import json
data_json = '[ { "Disease": "Impetigo", "Symptoms": [ " skin_rash", " high_fever", " blister", " red_sore_around_nose", " yellow_crust_ooze", null, null, null, null, null, null, null, null, null, null, null, null ] } ]'
data_dict = json.loads(data_json)
# Now clean up the 'Symptoms' key in each record
for record in data_dict:
    record['Symptoms'] = [x for x in record['Symptoms'] if x is not None]
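If nulls can also show up in lists under other keys, or in nested lists, the same idea could be wrapped in a small recursive helper. This is only a sketch: the name strip_nulls and the shortened sample string are assumptions made for illustration, not part of the original data or of any library.

```python
import json

def strip_nulls(value):
    """Recursively drop None items from lists (hypothetical helper)."""
    if isinstance(value, list):
        # Drop None entries and clean whatever remains.
        return [strip_nulls(v) for v in value if v is not None]
    if isinstance(value, dict):
        # Keys are kept as-is; only their values are cleaned.
        return {k: strip_nulls(v) for k, v in value.items()}
    return value

# Shortened sample in the same shape as the question's data (assumed).
raw = '[{"Disease": "Impetigo", "Symptoms": [" skin_rash", null, " blister", null]}]'
cleaned = strip_nulls(json.loads(raw))
print(json.dumps(cleaned))
```

Note that this only removes None items inside lists; a key whose value is itself null is left untouched.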