Function to find top list of items for a given list in a JSON input

Time:07-21

I have a DataFrame like this:

| jsonCol                                            |
| -------------------------------------------------- |
| {"category":"a","items":["a","b","c","d","e","f"]} |
| {"category":"b","items":["u","v","w","x","y"]}     |
| {"category":"c","items":["p","q"]}                 |
| {"category":"d","items":["m"]}                     |

I joined the column into a single string of dicts:

x = pd.Series(', '.join(df_list['json_col'].to_list()), name='text')

The result looks like this:

'{"category":"a","items":["a","b","c","d","e","f"]},
{"category":"b","items":["u","v","w","x","y"]},
{"category":"c","items":["p","q"]},
{"category":"d","items":["m"]}'

I need to write a Python function that takes an item as input and returns the top 3 items from the list it belongs to (excluding the item itself). Items are listed in priority order, so the top 3 are simply the first 3.

def item_list(above_json_input, item = "a"):
    return list

For example, the result should follow these rules:

  1. If the item is "a", look in category a, where item "a" is present, and return the top 3 items in sequence: ["b","c","d"]
  2. If the item is "w", look in category b, where item "w" is present, and return ["u","v","x"]
  3. If the item is "q", look in category c, where item "q" is present, and return ["p"], because there are fewer than 3 other items besides "q"
  4. If the item is "m", look in category d, where item "m" is present, and return an empty list [], because that list has no other items.

The same goes for an item that doesn't exist, such as item = "r", which is not in any category. We can either throw an error or return an empty list.

I am not sure how to read the json and get the list of top items. Is this even possible?

CodePudding user response:

I fixed your JSON, as it was badly formatted. Note that the check below matches category names as well as items, so for input "c" both ['a', 'b', 'd'] and ['p', 'q'] are printed:

import json

# The rows are wrapped in a "data" array so the whole string is valid JSON.
data_string = """{
    "data" : [
        {"category":"a","items":["a","b","c","d","e","f"]},
        {"category":"b","items":["u","v","w","x","y"]},
        {"category":"c","items":["p","q"]},
        {"category":"d","items":["m"]}
    ]
}"""

data = json.loads(data_string)["data"]

user_input = input("Pick a letter: ")

found = False
for values in data:
    # Match the input against the category name and every item in the row.
    if user_input in (values["category"], *values["items"]):
        found = True
        # Drop the input itself, then keep the first three remaining items.
        temp = [item for item in values["items"] if item != user_input]
        print(temp[:3])

# Nothing matched in any row.
if not found:
    print([])
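
If only item membership should count, as the rules in the question describe, the same idea can be written as a function. The following is a minimal sketch, not a definitive implementation. It assumes the input is the comma-joined string from the question, which is not valid JSON on its own, so it is wrapped in square brackets before parsing. The name `item_list` follows the question's stub; the `top_n` parameter is a hypothetical addition.

```python
import json


def item_list(joined_json, item="a", top_n=3):
    """Return the first `top_n` items that share a list with `item`, excluding `item`."""
    # The joined string looks like '{...}, {...}'; wrapping it in brackets
    # turns it into a valid JSON array.
    records = json.loads("[" + joined_json + "]")
    for record in records:
        items = record["items"]
        if item in items:
            # Items are already in priority order, so slicing keeps the top ones.
            return [other for other in items if other != item][:top_n]
    # The item is not in any category.
    return []


joined = (
    '{"category":"a","items":["a","b","c","d","e","f"]}, '
    '{"category":"b","items":["u","v","w","x","y"]}, '
    '{"category":"c","items":["p","q"]}, '
    '{"category":"d","items":["m"]}'
)

print(item_list(joined, "a"))  # ['b', 'c', 'd']
print(item_list(joined, "w"))  # ['u', 'v', 'x']
print(item_list(joined, "q"))  # ['p']
print(item_list(joined, "m"))  # []
print(item_list(joined, "r"))  # []
```

Because this version checks only `record["items"]`, an input of "c" returns ['a', 'b', 'd'] and never matches the category named "c".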

CodePudding user response:

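You could try this on your DataFrame:

import pandas as pd

# Sample data from the question; each cell is assumed to hold a parsed dict
# (apply json.loads to the column first if it holds JSON strings).
df = pd.DataFrame({'jsonCol': [
    {"category": "a", "items": ["a", "b", "c", "d", "e", "f"]},
    {"category": "b", "items": ["u", "v", "w", "x", "y"]},
    {"category": "c", "items": ["p", "q"]},
    {"category": "d", "items": ["m"]},
]})
h = df['jsonCol']


def search(inm):
    for item in h:
        if inm in item['items']:
            # Filter into a new list so the DataFrame's data is not mutated,
            # then keep at most the first three remaining items.
            return [i for i in item['items'] if i != inm][:3]
    return []

print(search('r'))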
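
If the lookup runs many times, scanning every row on each call may become wasteful. One possible alternative, sketched here with the standard library only, is to build a dict once that maps each item to the list it belongs to. The names `build_index`, `top_items`, `top_n`, and `strict` are hypothetical, and the sketch assumes each row is already a parsed dict. The `strict` flag covers the question's option of throwing an error for an unknown item.

```python
def build_index(records):
    """Map every item to the list it belongs to (the first occurrence wins)."""
    index = {}
    for record in records:
        for item in record["items"]:
            index.setdefault(item, record["items"])
    return index


def top_items(index, item, top_n=3, strict=False):
    """Return up to `top_n` items that share a list with `item`, excluding `item`."""
    if item not in index:
        if strict:
            raise KeyError(f"{item!r} is not in any category")
        return []
    return [other for other in index[item] if other != item][:top_n]


records = [
    {"category": "a", "items": ["a", "b", "c", "d", "e", "f"]},
    {"category": "b", "items": ["u", "v", "w", "x", "y"]},
    {"category": "c", "items": ["p", "q"]},
    {"category": "d", "items": ["m"]},
]

index = build_index(records)
print(top_items(index, "w"))  # ['u', 'v', 'x']
print(top_items(index, "r"))  # []
```

The index costs one pass over the data up front, after which each lookup is a single dict access plus a short list filter.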

hope that helps
