How to count duplicates in a list of tuples and append the count as a new value

output = [('studentA','ISDF'), ('studentB','CSE'),('studentC','BIO'),('studentA','ISDF'), ('studentB','CSE'),('studentC','BIO'),('studentA','ISDF'), ('studentB','CSE'),('studentC','BIO'),('studentA','ISDF'), ('studentB','CSE'),('studentC','BIO'),('studentA','ISDF'), ('studentB','CSE'),('studentC','BIO'),('studentA','ISDF'), ('studentB','CSE'),('studentC','BIO')]

Each of the tuples ('studentA','ISDF'), ('studentB','CSE') and ('studentC','BIO') appears 6 times in the list above.

So I'm expecting an output like this:

expected_output = [('studentA','ISDF',6), ('studentB','CSE',6),('studentC','BIO',6)]

The format should be [('student', 'department', total count)]

CodePudding user response：

You could use Counter:

from collections import Counter

output = [('studentA', 'ISDF'), ('studentB', 'CSE'), ('studentC', 'BIO'),
          ('studentA', 'ISDF'), ('studentB', 'CSE'), ('studentC', 'BIO'),
          ('studentA', 'ISDF'), ('studentB', 'CSE'), ('studentC', 'BIO'),
          ('studentA', 'ISDF'), ('studentB', 'CSE'), ('studentC', 'BIO'),
          ('studentA', 'ISDF'), ('studentB', 'CSE'), ('studentC', 'BIO'),
          ('studentA', 'ISDF'), ('studentB', 'CSE'), ('studentC', 'BIO')]

counts = Counter(output)
print(counts)
print([k + (v,) for k, v in counts.items()])

Out:

Counter({('studentA', 'ISDF'): 6, ('studentB', 'CSE'): 6, ('studentC', 'BIO'): 6})
[('studentA', 'ISDF', 6), ('studentB', 'CSE', 6), ('studentC', 'BIO', 6)]
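
If the counts differ and the result should be ordered by frequency, Counter.most_common() can replace counts.items(). A minimal sketch, assuming a made-up records list with uneven counts (not the data from the question):

```python
from collections import Counter

# Hypothetical sample with uneven counts, so the ordering is visible.
records = [('studentA', 'ISDF'), ('studentB', 'CSE'), ('studentB', 'CSE'),
           ('studentC', 'BIO'), ('studentB', 'CSE'), ('studentA', 'ISDF')]

# most_common() yields (item, count) pairs, highest count first.
ranked = [k + (v,) for k, v in Counter(records).most_common()]
print(ranked)
# [('studentB', 'CSE', 3), ('studentA', 'ISDF', 2), ('studentC', 'BIO', 1)]
```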

CodePudding user response：

Using Counter() from collections and a list comprehension (output is the list from the question):

from collections import Counter

count = Counter(output)
exp_output = [(key[0], key[1], value) for key, value in count.items()]
print(exp_output)

[('studentA', 'ISDF', 6), ('studentB', 'CSE', 6), ('studentC', 'BIO', 6)]

CodePudding user response：

Try this:

sorted([tuple(list(item) + [output.count(item)]) for item in set(output)])

or, more simply:

sorted([item + (output.count(item),) for item in set(output)])

CodePudding user response：

output = [('studentA','ISDF'), ('studentB','CSE'),('studentC','BIO'), 
    ('studentA','ISDF'), ('studentB','CSE'),('studentC','BIO'), 
    ('studentA','ISDF'), ('studentB','CSE'),('studentC','BIO'), 
    ('studentA','ISDF'), ('studentB','CSE'),('studentC','BIO'), 
    ('studentA','ISDF'), ('studentB','CSE'),('studentC','BIO'), 
    ('studentA','ISDF'), ('studentB','CSE'),('studentC','BIO')]
keys = list(set(output))
[k + (output.count(k),) for k in keys]
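
Note that a set has no defined order, so the result of this approach may not come back in the order the students first appear. One possible variation, sketched here on the assumption that first-seen order matters, is to de-duplicate with dict.fromkeys() instead of set():

```python
# Same data as in the question, written compactly.
output = [('studentA', 'ISDF'), ('studentB', 'CSE'), ('studentC', 'BIO')] * 6

# dict.fromkeys() keeps one key per distinct tuple, in first-seen order (Python 3.7+).
unique_in_order = list(dict.fromkeys(output))

# list.count() rescans the whole list for every key, which is fine for small inputs.
result = [k + (output.count(k),) for k in unique_in_order]
print(result)
# [('studentA', 'ISDF', 6), ('studentB', 'CSE', 6), ('studentC', 'BIO', 6)]
```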

CodePudding user response：

Using collections.Counter is ideal for this. However, if for some reason you want to avoid an import, you could do it like this:

d = dict()

for e in output:
    d[e] = d.get(e, 0) + 1

print([(a, b, v) for (a, b), v in d.items()])

Output:

[('studentA', 'ISDF', 6), ('studentB', 'CSE', 6), ('studentC', 'BIO', 6)]
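
The comprehension above unpacks each key as (a, b), so it assumes two-element tuples. If the tuples might carry more fields, the same idea could be wrapped in a helper that unpacks with *. This is only a sketch: append_counts and the three-field rows are hypothetical names and data, not part of the question.

```python
def append_counts(items):
    """Return each distinct tuple once, with its occurrence count appended."""
    tally = {}
    for item in items:
        tally[item] = tally.get(item, 0) + 1
    # (*item, n) unpacks a tuple of any length and adds the count at the end.
    return [(*item, n) for item, n in tally.items()]

# Hypothetical rows with a third field (year) to show that the length does not matter.
rows = [('studentA', 'ISDF', 2021), ('studentB', 'CSE', 2022), ('studentA', 'ISDF', 2021)]
print(append_counts(rows))
# [('studentA', 'ISDF', 2021, 2), ('studentB', 'CSE', 2022, 1)]
```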