I have a DataFrame like this:
| jsonCol                                            |
| -------------------------------------------------- |
| {"category":"a","items":["a","b","c","d","e","f"]} |
| {"category":"b","items":["u","v","w","x","y"]}     |
| {"category":"c","items":["p","q"]}                 |
| {"category":"d","items":["m"]}                     |
I joined the column into a single string of dicts:
x = pd.Series(', '.join(df_list['json_col'].to_list()), name='text')
The result looks like this:
'{"category":"a","items":["a","b","c","d","e","f"]},
{"category":"b","items":["u","v","w","x","y"]},
{"category":"c","items":["p","q"]},
{"category":"d","items":["m"]}'
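The joined string above is a comma-separated run of objects rather than a single JSON document. A minimal sketch of one way to read it back, assuming the string looks exactly as shown: wrapping it in square brackets turns it into a JSON array that `json.loads` can parse. The names `joined` and `records` are only illustrative.

```python
import json

# The comma-joined string from above (whitespace between objects is fine for JSON)
joined = '''{"category":"a","items":["a","b","c","d","e","f"]},
{"category":"b","items":["u","v","w","x","y"]},
{"category":"c","items":["p","q"]},
{"category":"d","items":["m"]}'''

# Wrapping in [ ] makes the run of objects one valid JSON array
records = json.loads("[" + joined + "]")

# Each record is now a plain dict with "category" and "items" keys
print(records[0]["items"])
```

After this, `records` is a list of four dicts, so the item lists can be searched with ordinary Python.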
I need to write a Python function that takes an item as input and returns the top 3 items from the list it belongs to (excluding itself). Items are listed in priority order, so the top 3 are simply the first three items.
def item_list(above_json_input, item="a"):
    # placeholder: should return the list of top items
    return list
For example, the result should follow these rules:
- If the item is "a", look in category a, where item "a" is present, and return the top 3 items in sequence: ["b","c","d"]
- If the item is "w", look in category b, where item "w" is present, and return ["u","v","x"]
- If the item is "q", look in category c, where item "q" is present, and return ["p"], because there are fewer than 3 other items
- If the item is "m", look in category d, where item "m" is present, and return an empty list [], because there are no other items in that list
The same goes for an item that doesn't exist in any category, such as item = "r". In that case we can raise an error or return an empty list.
I am not sure how to read the JSON and get the list of top items. Is this even possible?
CodePudding user response:
I fixed your JSON, as it was badly formatted. For input "c", ['a', 'b', 'd'] and ['p', 'q'] are printed, because the check matches category names as well as items:
import json

data_string = """{
    "data" : [
        {"category":"a","items":["a","b","c","d","e","f"]},
        {"category":"b","items":["u","v","w","x","y"]},
        {"category":"c","items":["p","q"]},
        {"category":"d","items":["m"]}
    ]
}"""
data = json.loads(data_string)["data"]

user_input = input("Pick a letter: ")
found = False
for values in data:
    # Match the input against the category name and every item
    if user_input in (values["category"], *values["items"]):
        found = True
        # Drop the input itself, then keep the first three remaining items
        temp = [item for item in values["items"] if item != user_input]
        print(temp[:3])
if not found:
    print([])
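The loop above prints its results and also matches category names. As a sketch of the function shape the question asks for, the same idea can be wrapped so that it matches items only and returns a list. The signature is an assumption: `json_text` is the fixed JSON document with a top-level "data" key, and the parameter `n` is a hypothetical addition for the cut-off.

```python
import json

def item_list(json_text, item="a", n=3):
    """Return up to n items that share a list with `item`, excluding `item`."""
    records = json.loads(json_text)["data"]
    for record in records:
        items = record["items"]
        if item in items:  # match items only, not category names
            return [other for other in items if other != item][:n]
    return []  # item is not present in any category

data_string = """{
    "data" : [
        {"category":"a","items":["a","b","c","d","e","f"]},
        {"category":"b","items":["u","v","w","x","y"]},
        {"category":"c","items":["p","q"]},
        {"category":"d","items":["m"]}
    ]
}"""

print(item_list(data_string, "a"))  # ['b', 'c', 'd']
print(item_list(data_string, "w"))  # ['u', 'v', 'x']
print(item_list(data_string, "q"))  # ['p']
print(item_list(data_string, "m"))  # []
print(item_list(data_string, "r"))  # []
```

Returning an empty list for a missing item is one of the two options the question allows; raising a `ValueError` after the loop would be the other.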
CodePudding user response:
You could try this on your DataFrame:
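import pandas as pd

# Sample data from the question, with each cell already parsed into a dict
df = pd.DataFrame({'jsonCol': [
    {"category": "a", "items": ["a", "b", "c", "d", "e", "f"]},
    {"category": "b", "items": ["u", "v", "w", "x", "y"]},
    {"category": "c", "items": ["p", "q"]},
    {"category": "d", "items": ["m"]},
]})
h = df['jsonCol']

def search(inm):
    for item in h:
        if inm in item['items']:
            # Build a new list instead of pop(), so the DataFrame is not modified
            others = [i for i in item['items'] if i != inm]
            # Slicing covers lists with fewer than, exactly, or more than 3 items
            return others[:3]
    return []

print(search('r'))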
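The snippet above expects each cell to hold a dict already. If the column holds JSON strings, as the table in the question suggests, a hedged variant can parse each row while searching. In this sketch `search_strings` and `rows` are hypothetical names, and `rows` is a plain list standing in for something like `df['jsonCol'].to_list()`.

```python
import json

def search_strings(json_strings, inm, n=3):
    """Search an iterable of JSON strings (a list or a pandas Series both work)."""
    for text in json_strings:
        items = json.loads(text)["items"]  # parse one row at a time
        if inm in items:
            # New list without the searched item, cut to the first n entries
            return [i for i in items if i != inm][:n]
    return []  # not found in any row

# Stand-in for the DataFrame column: one JSON string per row
rows = [
    '{"category":"a","items":["a","b","c","d","e","f"]}',
    '{"category":"b","items":["u","v","w","x","y"]}',
    '{"category":"c","items":["p","q"]}',
    '{"category":"d","items":["m"]}',
]

print(search_strings(rows, "w"))  # ['u', 'v', 'x']
print(search_strings(rows, "r"))  # []
```

Parsing inside the loop keeps the sketch short; for many lookups it would likely be cheaper to parse the column once and reuse the result.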
Hope that helps.