python - sum values between unique pairs in ledger using hash tables or dictionaries

I'm looking at a big list of transactions, and want to simply summarize the total value sent from one account to another.

Input:

sources = ['A','A','A','A','A','B','B','B','B']
targets = ['C','C','C','D','D','C','C','D','D']
values =  [ 2 , 1 , 2 , 2 , 3 , 2 , 3 , 2 , 3 ]

Output:

sources = ['A','A','B','B']
targets = ['C','D','C','D']
totals =  [ 5 , 5 , 5 , 5 ]

This is how I did it using indexed for loops, but I'm looking to understand how it would work with hash tables or dictionaries:

#Create a list of unique pairs
pairs = [[sources[0],targets[0]]]
for idx, x in enumerate(sources):
    temp_pair = [sources[idx],targets[idx]]
    new_pair = True
    for pair in pairs:
        if temp_pair == pair:
            new_pair = False
    if new_pair == True:
        pairs.append(temp_pair)
print(pairs)

#Define an empty totals list based on pairs list
totals = []
for pair in pairs:
    totals.append([pair[0],pair[1],0])
print(totals)

# Fill the totals list with values
for idx, x in enumerate(sources):
    for idy, pair in enumerate(pairs):
        if [pair[0],pair[1]] == [sources[idx],targets[idx]]:
            totals[idy][2] += values[idx]
print(totals)

With these results:

Thanks!

CodePudding user response：

d = dict.fromkeys(zip(sources, targets), 0)
for s, t, v in zip(sources, targets, values):
    d[(s, t)] += v
d
# {('A', 'C'): 5, ('A', 'D'): 5, ('B', 'C'): 5, ('B', 'D'): 5}
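
As a rough sketch of the same idea in a single pass, `dict.get` with a default of 0 can stand in for the `dict.fromkeys` initialization. The function name `sum_by_pair` is made up for illustration and is not part of the original answer.

```python
def sum_by_pair(sources, targets, values):
    """Sum values per (source, target) pair using a plain dict as the hash table."""
    totals = {}
    for s, t, v in zip(sources, targets, values):
        # The tuple (s, t) is hashable, so it can serve as the dictionary key.
        # get() returns 0 the first time a pair is seen.
        totals[(s, t)] = totals.get((s, t), 0) + v
    return totals


sources = ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B']
targets = ['C', 'C', 'C', 'D', 'D', 'C', 'C', 'D', 'D']
values = [2, 1, 2, 2, 3, 2, 3, 2, 3]

pair_totals = sum_by_pair(sources, targets, values)
print(pair_totals)
```

Each lookup is a constant-time hash lookup on average, which is what replaces the inner loop over `pairs` in the question's version.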

CodePudding user response：

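The answer by d.b is good, but it can be simplified with collections.defaultdict: defaultdict(int) supplies 0 for any missing key, so the dictionary does not need to be pre-populated.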

from collections import defaultdict

sources = ['A','A','A','A','A','B','B','B','B']
targets = ['C','C','C','D','D','C','C','D','D']
values =  [ 2 , 1 , 2 , 2 , 3 , 2 , 3 , 2 , 3 ]

d = defaultdict(int)

for s, t, v in zip(sources, targets, values):
    d[(s, t)] += v

Now d is:

defaultdict(<class 'int'>, {('A', 'C'): 5, ('A', 'D'): 5, ('B', 'C'): 5, ('B', 'D'): 5})
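
If the three parallel lists from the question's Output are needed, the dictionary can be unpacked back into them. This is only a sketch: the names `totals_by_pair`, `out_sources`, `out_targets` and `out_totals` are chosen here for illustration, and it relies on dictionaries preserving insertion order (guaranteed from Python 3.7).

```python
from collections import defaultdict

sources = ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B']
targets = ['C', 'C', 'C', 'D', 'D', 'C', 'C', 'D', 'D']
values = [2, 1, 2, 2, 3, 2, 3, 2, 3]

# Accumulate totals keyed by (source, target) tuple.
totals_by_pair = defaultdict(int)
for s, t, v in zip(sources, targets, values):
    totals_by_pair[(s, t)] += v

# Iterating a dict yields its keys in first-seen order,
# so the three lists stay aligned with each other.
out_sources = [s for s, _ in totals_by_pair]
out_targets = [t for _, t in totals_by_pair]
out_totals = list(totals_by_pair.values())

print(out_sources)
print(out_targets)
print(out_totals)
```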