How do I delete most keys in a JSON file using Python?

I have a JSON file with all the data I collected using the Twitter API. I only want to keep certain keys in that file and delete the rest, but I don't know how to do it correctly. This is my code so far, and it gives me the error: list indices must be integers or slices, not str

import json

def read_json(json_file:str)->list:
    tweets_data=[]
    for tweet in open(json_file, 'r'):
        tweets_data.append(json.loads(tweet))
    return tweets_data
tags = ["created_at", "full_text"]
tweet_list=read_json('test.json')

for i in tweet_list["tweet"].keys():
    if i not in tags:
        del tweet_list["tweet"][i]
        
print (tweet_list[0])

CodePudding user response：

import json

def read_json(json_file: str) -> list:
    # Parse one JSON object per line of the file.
    tweets = []
    with open(json_file, 'r') as f:
        for line in f:
            tweets.append(json.loads(line))
    return tweets

tags = ['created_at', 'full_text']
tweets = read_json('tweets.json')
for tweet in tweets:
    # Iterate over a copy of the keys so the dict can be modified safely.
    for key in list(tweet.keys()):
        if key not in tags:
            del tweet[key]
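
The list(tweet.keys()) call in the loop above matters: Python does not allow a dict to change size while you iterate over it directly. A minimal sketch of the difference, using a made-up tweet dict (the sample values and the names broken and safe are only for illustration):

```python
tags = ["created_at", "full_text"]
tweet = {"created_at": "Mon", "full_text": "hello", "id": 1, "lang": "en"}  # made-up sample

# Deleting while iterating over the live key view fails partway through.
broken = dict(tweet)
try:
    for key in broken.keys():
        if key not in tags:
            del broken[key]
except RuntimeError as err:
    runtime_error = str(err)
    print("RuntimeError:", runtime_error)

# Iterating over a list copy of the keys leaves the dict free to change.
safe = dict(tweet)
for key in list(safe.keys()):
    if key not in tags:
        del safe[key]
print(safe)
```

Only the second loop finishes and leaves just the two wanted keys.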

CodePudding user response：

You've gotten really close to something working, but there is a bug in the filtering section.

Your read_json() function returns a Python list, which is assigned to tweet_list.

The error message refers to tweet_list['tweet'] and says that a list can only be indexed with integers or slices (ranges such as 0:2), not with a string like 'tweet'.
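
To see the error in isolation, here is a small sketch with a made-up one-element list standing in for the data returned by read_json() (the sample values are assumptions, not real tweet data):

```python
# Made-up stand-in for the list that read_json() returns.
tweet_list = [{"created_at": "Mon", "full_text": "hello", "id": 1}]

try:
    tweet_list["tweet"]            # a list cannot be indexed with a string
except TypeError as err:
    error_message = str(err)
    print(error_message)

first_tweet = tweet_list[0]        # an integer index selects one dict
print(sorted(first_tweet.keys()))  # the dict, not the list, is what has keys
```

The list itself has no keys; each element inside it is a dict, and that is where the keys live.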

If you change your code to use the index 0:

# list() copies the keys so the dict can change during the loop
for i in list(tweet_list[0].keys()):
    if i not in tags:
        del tweet_list[0][i]

This filters the unwanted keys out of the first element in the list.

To filter every element in the list, you need to iterate over all of them with another for loop:

for i in range(len(tweet_list)):
    # list() takes a snapshot of the keys, so deleting inside the loop is safe
    for key in list(tweet_list[i].keys()):
        if key not in tags:
            del tweet_list[i][key]
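
Since the goal is to change the file itself, you may also want to write the trimmed tweets back out. One possible sketch, assuming the file holds one JSON object per line (as the read_json() function in the question implies); the function name filter_tweets and the output file name are hypothetical:

```python
import json

def filter_tweets(in_path, out_path, keep):
    """Copy a one-object-per-line JSON file, keeping only the keys in `keep`."""
    with open(in_path, "r", encoding="utf-8") as src, \
         open(out_path, "w", encoding="utf-8") as dst:
        for line in src:
            if not line.strip():
                continue  # skip blank lines
            tweet = json.loads(line)
            # Build a new dict instead of deleting keys from the old one.
            trimmed = {key: value for key, value in tweet.items() if key in keep}
            dst.write(json.dumps(trimmed) + "\n")
```

Calling filter_tweets('test.json', 'test_filtered.json', tags) would then leave the original file untouched and put the reduced records in the new one. Building a new dict avoids the delete-while-iterating problem entirely.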