I have a dictionary that has multiple values assigned to each key. For each key's list of values, I am trying to find the percentage that fit the 'flexibility' criteria. Since the values are strings, it is throwing me for a loop (pun not intended). I am trying to get one value per key: the percentage of values that are either 'None' or 'Flexible' out of the total number of values in the list.
Basically if the dictionary looks like this:
dict1 = {'German' : ["None", "None" ,"Flexible", "Hard"],
"French" : ["Hard", "Hard", "Hard", "Hard"]
}
I want the code to give me this (rounding to 2 decimals is fine):
dict1 = {"German" : "0.75",
         "French" : "0.00"
}
import pandas as pd

def course_prereq_flexibility(fn):
    # Read the CSV and keep only the two columns of interest
    df = pd.read_csv(fn)
    df2 = df[["area", "prereq_type"]].copy()
    return percentages(df2)

def percentages(df2):
    # Collect every prereq_type value under its area
    dict1 = {}
    for items in range(len(df2)):
        key = df2.iloc[items, 0]
        values = df2.iloc[items, 1]
        dict1.setdefault(key, [])
        dict1[key].append(values)
    return dict1
I am a bit confused about where to go after creating the dictionary and would really appreciate a walkthrough of the steps I could take.
CodePudding user response:
Without using pandas, it's reasonably straightforward to do this with just collections.Counter.
>>> from collections import Counter
>>> dict1 = {'German': ["None", "None", "Flexible", "Hard"],
...          "French": ["Hard", "Hard", "Hard", "Hard"]}
>>>
>>> # First, count how often each value occurs under each key.
>>> {k: c
...  for k, v in dict1.items()
...  for c in (Counter(v),)}
{'German': Counter({'None': 2, 'Flexible': 1, 'Hard': 1}), 'French': Counter({'Hard': 4})}
>>> # Then add the 'None' and 'Flexible' counts and divide by the list length.
>>> {k: (c['None'] + c['Flexible']) / len(v)
...  for k, v in dict1.items()
...  for c in (Counter(v),)}
{'German': 0.75, 'French': 0.0}
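If the two-decimal strings from the question are what you need, the same Counter idea can be wrapped in a small helper and formatted with an f-string. This is only a sketch: the helper name flexible_share and the variable formatted are made up for illustration, and it assumes every list is non-empty (an empty list would raise ZeroDivisionError).

```python
from collections import Counter

dict1 = {
    "German": ["None", "None", "Flexible", "Hard"],
    "French": ["Hard", "Hard", "Hard", "Hard"],
}


def flexible_share(values):
    """Return the fraction of values that are 'None' or 'Flexible'.

    Hypothetical helper; assumes `values` is a non-empty list of strings.
    """
    counts = Counter(values)
    # Counter returns 0 for missing keys, so no KeyError for 'French'.
    return (counts["None"] + counts["Flexible"]) / len(values)


# Format each fraction as a string with exactly two decimals.
formatted = {k: f"{flexible_share(v):.2f}" for k, v in dict1.items()}
print(formatted)  # {'German': '0.75', 'French': '0.00'}
```

Keeping the numbers as floats and formatting only when displaying them is usually more convenient if you want to compare or sort the results later.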
CodePudding user response:
There are a number of ways to achieve this. The following is one example:
dict1 = {
    "German": ["None", "None", "Flexible", "Hard"],
    "French": ["Hard", "Hard", "Hard", "Hard"]
}

def percentage_in_list(input_list, elements_to_find=None):
    if elements_to_find is None:
        elements_to_find = ["None", "Flexible"]
    nr_found = len([x for x in input_list if x in elements_to_find])
    return (nr_found / len(input_list)) * 100

percentages = {k: percentage_in_list(v) for k, v in dict1.items()}
print(percentages)
The function percentage_in_list returns the percentage of values in input_list that match one of the values in elements_to_find, which defaults to "None" and "Flexible". Inside the function, a list comprehension keeps only the elements of input_list that are in elements_to_find, so the len of the resulting list is the number of matches. Dividing this number by the length of the input list and multiplying by 100 gives the percentage.
In the main code, a dictionary comprehension is used to iterate over dict1 and call the function percentage_in_list for every value in the dictionary.
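To tie this back to the question's starting point, here is a rough end-to-end sketch that goes from (area, prereq_type) pairs to the final percentages. The rows list is invented sample data standing in for the two CSV columns, group_by_area is a hypothetical helper, and percentage_in_list is a slightly adapted variant of the function above (tuple default, guard for empty lists), not the original.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for the (area, prereq_type) columns.
rows = [
    ("German", "None"),
    ("German", "None"),
    ("German", "Flexible"),
    ("German", "Hard"),
    ("French", "Hard"),
    ("French", "Hard"),
    ("French", "Hard"),
    ("French", "Hard"),
]


def group_by_area(pairs):
    """Step 1: collect every prereq_type under its area."""
    grouped = defaultdict(list)
    for area, prereq_type in pairs:
        grouped[area].append(prereq_type)
    return dict(grouped)


def percentage_in_list(input_list, elements_to_find=("None", "Flexible")):
    """Step 2: percentage (0 to 100) of items found in elements_to_find."""
    if not input_list:
        # Assumption: treat an empty list as 0% instead of raising an error.
        return 0.0
    nr_found = sum(1 for x in input_list if x in elements_to_find)
    return nr_found / len(input_list) * 100


# Step 3: apply the calculation to every key and round to 2 decimals.
grouped = group_by_area(rows)
percentages = {
    area: round(percentage_in_list(values), 2)
    for area, values in grouped.items()
}
print(percentages)  # {'German': 75.0, 'French': 0.0}
```

Drop the multiplication by 100 if you prefer fractions such as 0.75, as in the question's expected output.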