I want to operate on a list of lists like the following:
# key
example = [[2, 2, -10, 'Yes'],
           [2, 8, 21, 'Yes'],
           [2, 19, 14, 'Non'],
           [2, 30, -22, 'Non'],
           [4, -15, 31, 'Yes'],
           [4, 2, 17, 'Yes'],
           [4, 3, -90, 'Non']]
# What I have tried through dictionaries
dictsum = dict()
for i in example:
    if i[0] in dictsum.keys():
        # Key already seen: only "Yes" rows are added to the running sums
        if i[3] == "Yes":
            dictsum[i[0]][0] = dictsum[i[0]][0] + i[1]
            dictsum[i[0]][1] = dictsum[i[0]][1] + i[2]
    else:
        # First row for this key: start the sums with its values
        dictsum[i[0]] = [i[1], i[2]]
onlysum = [[k, v[0], v[1]] for k, v in dictsum.items()]
As shown, the values in 'i[1]' and 'i[2]' have been summed for each key 'i[0]' whenever the condition 'i[3]' is "Yes".
What I am trying to do next is, among the rows where i[3] == "Non", find the value furthest from zero in each column and add it to the sum for the corresponding key 'i[0]'.
The result should look something like this:
result = [[2, 10 + 30, 11 + (-22)],
          [4, -13 + 3, 48 + (-90)]]
# Finally
result = [[2, 40, -11],
          [4, -10, -42]]
To be clear, this is an example I made up myself to understand how lists are handled in cases like this; I am not an expert. If anyone knows a solution and can offer feedback, I would appreciate it. Kind regards.
CodePudding user response:
You can do it like this:
import collections
# Running sums of the "Yes" rows, per key
dictsum = collections.defaultdict(lambda: [0, 0])
# Value furthest from zero among the "Non" rows, per key and column
max_non_values = collections.defaultdict(lambda: [0, 0])
def special_max(x, y):
    # Return whichever value has the larger magnitude; on a tie, the larger value
    if abs(x) < abs(y):
        return y
    elif abs(x) > abs(y):
        return x
    else:
        return max(x, y)
for key, val_1, val_2, include in example:
    if include == "Yes":
        dictsum[key][0] = dictsum[key][0] + val_1
        dictsum[key][1] = dictsum[key][1] + val_2
    else:
        max_non_values[key][0] = special_max(max_non_values[key][0], val_1)
        max_non_values[key][1] = special_max(max_non_values[key][1], val_2)
onlysum = [[k, dictsum[k][0] + max_non_values[k][0], dictsum[k][1] + max_non_values[k][1]] for k in dictsum]
Since you're taking a sum, using a defaultdict is faster and simpler than checking for the key in the dict; you also need to either make two passes or use two dicts. Also, I would recommend making each row in "example" either an instance of a class or, at the very least, a namedtuple. Collecting data as a heterogeneous list of unstructured integers and strings becomes unmanageable beyond a few lines of code.
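As a rough illustration of the namedtuple suggestion above, here is a minimal sketch. The names Row, furthest_from_zero and summarize, and the field names, are made up for this example and are not part of the original code; the tie-breaking rule is assumed to match special_max (equal magnitudes resolve to the larger value).

```python
from collections import defaultdict, namedtuple

# Hypothetical record type; the field names are assumptions for illustration.
Row = namedtuple("Row", ["key", "val_1", "val_2", "include"])

example = [[2, 2, -10, 'Yes'],
           [2, 8, 21, 'Yes'],
           [2, 19, 14, 'Non'],
           [2, 30, -22, 'Non'],
           [4, -15, 31, 'Yes'],
           [4, 2, 17, 'Yes'],
           [4, 3, -90, 'Non']]

def furthest_from_zero(x, y):
    # Larger magnitude wins; on equal magnitude the larger value wins.
    return max(x, y, key=lambda v: (abs(v), v))

def summarize(raw_rows):
    rows = [Row(*r) for r in raw_rows]
    sums = defaultdict(lambda: [0, 0])      # totals of the "Yes" rows
    extremes = defaultdict(lambda: [0, 0])  # furthest-from-zero "Non" values
    for row in rows:
        if row.include == "Yes":
            sums[row.key][0] += row.val_1
            sums[row.key][1] += row.val_2
        else:
            extremes[row.key][0] = furthest_from_zero(extremes[row.key][0], row.val_1)
            extremes[row.key][1] = furthest_from_zero(extremes[row.key][1], row.val_2)
    # Like the code above, only keys with at least one "Yes" row are reported.
    return [[k, sums[k][0] + extremes[k][0], sums[k][1] + extremes[k][1]] for k in sums]

result = summarize(example)
print(result)
```

Named fields such as row.include make the loop body readable without remembering which index holds which value.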
CodePudding user response:
- First collect the sums for all the "Yes" values:
dictsum = {i[0]: [sum(x[1] for x in example if x[0]==i[0] and x[-1]=="Yes"),
                  sum(x[2] for x in example if x[0]==i[0] and x[-1]=="Yes")] for i in example}
>>> dictsum
{2: [10, 11], 4: [-13, 48]}
- Update the 0th and 1st list elements for each key using max with the required key function:
output = {k: [v[0] + max([l[1] for l in example if l[0]==k and l[-1]=="Non"], key=abs),
              v[1] + max([l[2] for l in example if l[0]==k and l[-1]=="Non"], key=abs)]
          for k,v in dictsum.items()}
>>> output
{2: [40, -11], 4: [-10, -42]}
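Two details of max with key=abs are worth keeping in mind: it raises ValueError on an empty sequence (a key with no "Non" rows), and when two values tie in magnitude it returns the first one encountered. The sketch below is one possible way to guard against the first case using the default argument of max; the function name combine is hypothetical, and treating a missing "Non" value as 0 is an assumption.

```python
example = [[2, 2, -10, 'Yes'],
           [2, 8, 21, 'Yes'],
           [2, 19, 14, 'Non'],
           [2, 30, -22, 'Non'],
           [4, -15, 31, 'Yes'],
           [4, 2, 17, 'Yes'],
           [4, 3, -90, 'Non']]

def combine(rows):
    # Unique keys in first-seen order
    keys = dict.fromkeys(r[0] for r in rows)
    output = {}
    for k in keys:
        yes = [r for r in rows if r[0] == k and r[-1] == "Yes"]
        non = [r for r in rows if r[0] == k and r[-1] == "Non"]
        output[k] = [
            # default=0 is an assumption: a key without "Non" rows adds nothing
            sum(r[col] for r in yes) + max((r[col] for r in non), key=abs, default=0)
            for col in (1, 2)
        ]
    return output

output = combine(example)
print(output)
```

With the sample data this gives the same dictionary as above; a key that has only "Yes" rows simply keeps its plain sums.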