Suppose my dictionary contains more than 100 elements, and one or two of them have values that differ from the rest; most values are the same (12 in the example below). How can I remove these few elements?
Diction = {1:12,2:12,3:23,4:12,5:12,6:12,7:12,8:2}
I want a dictionary object:
Diction = {1:12,2:12,4:12,5:12,6:12,7:12}
CodePudding user response:
This may be slow for very large dictionaries because of the explicit loops, and it requires numpy, but it works:
import numpy as np

Diction = {1:12,2:12,3:23,4:12,5:12,6:12,7:12,8:2}

# Collect all values into a numpy array
dict_list = []
for x in Diction:
    dict_list.append(Diction[x])
dict_array = np.array(dict_list)

# Find the value that occurs most often
unique, counts = np.unique(dict_array, return_counts=True)
most_common = unique[np.argmax(counts)]

# Keep only the keys whose value equals the most common one
new_Diction = {}
for x in Diction:
    if Diction[x] == most_common:
        new_Diction[x] = most_common
print(new_Diction)
Output
{1: 12, 2: 12, 4: 12, 5: 12, 6: 12, 7: 12}
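The same filtering can also be done without numpy. Below is a minimal sketch, assuming the dictionary values are hashable and using `collections.Counter` from the standard library. The helper name `keep_most_common` is hypothetical and not part of the answer above.

```python
from collections import Counter

def keep_most_common(d):
    """Return a new dict holding only the entries whose value is the most frequent one."""
    if not d:
        return {}
    # most_common(1) returns a list with a single (value, count) pair
    most_common_value, _ = Counter(d.values()).most_common(1)[0]
    return {k: v for k, v in d.items() if v == most_common_value}

Diction = {1: 12, 2: 12, 3: 23, 4: 12, 5: 12, 6: 12, 7: 12, 8: 2}
new_Diction = keep_most_common(Diction)
print(new_Diction)
```

If two values are tied for the highest count, only one of them is kept, so this sketch suits the case described in the question, where a single value clearly dominates.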
CodePudding user response:
d = {1:12,2:12,3:23,4:12,5:12,6:12,7:12,8:2}
new_d = {}
unique_values = []
unique_count = []
most_occurrence = None
highest_count = 0

# Find unique values
for k, v in d.items():
    if v not in unique_values:
        unique_values.append(v)

# Count how many times a value occurs in a dictionary
def count(dictionary, unique_value):
    total = 0
    for k, v in dictionary.items():
        if v == unique_value:
            total += 1
    return total

for value in unique_values:
    occurrences = count(d, value)
    unique_count.append((value, occurrences))

# Find which value has the most occurrences
for value, occurrences in unique_count:
    if occurrences > highest_count:
        highest_count = occurrences
        most_occurrence = value

# Create new dict with the keys that hold the most frequent value
for k, v in d.items():
    if v == most_occurrence:
        new_d[k] = v
print(new_d)
Nothing fancy, but direct to the point. There should be many ways to optimize this.
Output: {1: 12, 2: 12, 4: 12, 5: 12, 6: 12, 7: 12}
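One possible optimization is to count every value in a single pass and then delete the outliers from the original dictionary in place, which is closer to the "remove" wording of the question. This is only a sketch: it assumes hashable values, and the function name `drop_minority_values` is made up for illustration.

```python
from collections import Counter

def drop_minority_values(d):
    """Delete, in place, every entry whose value is not the most frequent one."""
    if not d:
        return d
    counts = Counter(d.values())            # one pass to count every value
    majority = max(counts, key=counts.get)  # value with the highest count
    # Collect the keys first: a dict cannot change size while it is being iterated
    for key in [k for k, v in d.items() if v != majority]:
        del d[key]
    return d

d = {1: 12, 2: 12, 3: 23, 4: 12, 5: 12, 6: 12, 7: 12, 8: 2}
drop_minority_values(d)
print(d)
```

Counting once avoids rescanning the whole dictionary for each distinct value, which matters more as the number of distinct values grows.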