I have a list of 90 strings and a separate string x. In my code I compare x to each of the 90 strings and count how many positions match in each string in the list. This worked at first, but then I realised it was printing the cumulative frequency. I changed my code so that the counter is reset inside my loop, but now, although the count is resetting, it doesn't seem to go through the whole list. When I print the length of the results at the end, I get 11 rather than 90, and I'm not sure why.
I've recreated it below:
results = {}
list = ['ABCD','ABBB','ACCC','AACC','ACDA','AACC','ABBB','ACCC','AGGG','ABBC','BBBA','BCCD','ABBE','ABBE', 'ACDE','ACCC']
m = 'ACEC'
for x in list:
    # reset both counters for every string
    matches = 0
    mismatches = 0
    # compare position by position, up to the shorter length
    for pos in range(0, min(len(m), len(x))):
        if m[pos] != x[pos]:
            mismatches += 1
        else:
            matches += 1
    results[matches] = x
Expected output: the dictionary printed with match counts and strings:
1 ABCD 1 ABBB 3 ACCC etc ...
CodePudding user response:
Just use a Counter object from collections:
import collections
l = ['a', 'a', 'b', 'c', 'd', 'c']
c = collections.Counter()
c.update(l)
print(c)
Results in this:
Counter({'a': 2, 'c': 2, 'b': 1, 'd': 1})
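Applied to the question's data, a Counter could tally how many strings share each match count. This is only a sketch, assuming that tally is what is wanted; `match_count` and `words` are hypothetical names that do not appear in the question.

```python
from collections import Counter

words = ['ABCD', 'ABBB', 'ACCC', 'AACC', 'ACDA', 'AACC', 'ABBB', 'ACCC',
         'AGGG', 'ABBC', 'BBBA', 'BCCD', 'ABBE', 'ABBE', 'ACDE', 'ACCC']
base = 'ACEC'

def match_count(a, b):
    # hypothetical helper: number of positions where a and b agree
    return sum(x == y for x, y in zip(a, b))

# key = number of matching positions, value = how many strings had that count
tally = Counter(match_count(base, word) for word in words)
print(tally)  # Counter({1: 7, 2: 5, 3: 3, 0: 1})
```

Note that this keeps only the counts, not the strings themselves.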
CodePudding user response:
The problem here is that you are overwriting entries in your dictionary. Whenever you come across a string with the same number of matches as a previous string, you set results[matches] to that string, erasing the previous string stored there. The length of results is therefore the number of distinct match counts, not the number of strings. I can't propose a fix, as it is not entirely clear what you are trying to implement here.
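To see the overwriting in action, here is a small sketch using the sample data from the question (the list is renamed to `words` here, an assumption made only to avoid shadowing the built-in `list`):

```python
words = ['ABCD', 'ABBB', 'ACCC', 'AACC', 'ACDA', 'AACC', 'ABBB', 'ACCC',
         'AGGG', 'ABBC', 'BBBA', 'BCCD', 'ABBE', 'ABBE', 'ACDE', 'ACCC']
m = 'ACEC'

results = {}
for x in words:
    matches = 0
    for pos in range(min(len(m), len(x))):
        if m[pos] == x[pos]:
            matches += 1
    # same key as an earlier string -> the earlier string is replaced
    results[matches] = x

print(len(words))    # 16 strings went in
print(len(results))  # 4, one entry per distinct match count
print(results)       # {1: 'ABBE', 3: 'ACCC', 2: 'ACDE', 0: 'BBBA'}
```

Only the last string seen for each match count survives, which is presumably why the 90-string list ends up with 11 entries.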
CodePudding user response:
It's not completely clear what you want to achieve. Here are a few options. With
words = [
'ABCD', 'ABBB', 'ACCC', 'AACC', 'ACDA', 'AACC', 'ABBB', 'ACCC',
'AGGG', 'ABBC', 'BBBA', 'BCCD', 'ABBE', 'ABBE', 'ACDE', 'ACCC'
]
base = 'ACEC'
this
results = [(sum(a == b for a, b in zip(base, word)), word) for word in words]
results in
[(1, 'ABCD'), (1, 'ABBB'), (3, 'ACCC'), (2, 'AACC'), (2, 'ACDA'), (2, 'AACC'),
(1, 'ABBB'), (3, 'ACCC'), (1, 'AGGG'), (2, 'ABBC'), (0, 'BBBA'), (1, 'BCCD'),
(1, 'ABBE'), (1, 'ABBE'), (2, 'ACDE'), (3, 'ACCC')]
and this
results = {word: sum(a == b for a, b in zip(base, word)) for word in words}
results in
{'ABCD': 1, 'ABBB': 1, 'ACCC': 3, 'AACC': 2, 'ACDA': 2, 'AGGG': 1,
 'ABBC': 2, 'BBBA': 0, 'BCCD': 1, 'ABBE': 1, 'ACDE': 2}
and this
results = {}
for word in words:
    results.setdefault(sum(a == b for a, b in zip(base, word)), []).append(word)
results in
{0: ['BBBA'],
1: ['ABCD', 'ABBB', 'ABBB', 'AGGG', 'BCCD', 'ABBE', 'ABBE'],
2: ['AACC', 'ACDA', 'AACC', 'ABBC', 'ACDE'],
3: ['ACCC', 'ACCC', 'ACCC']}
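In case the sum(a == b for a, b in zip(base, word)) expression used in all three options looks opaque, here is a step-by-step sketch of what it does for a single word. The intermediate names `pairs`, `flags`, `count` and `short` are only for illustration.

```python
base = 'ACEC'
word = 'ACCC'

# zip pairs up the characters at the same position
pairs = list(zip(base, word))
print(pairs)  # [('A', 'A'), ('C', 'C'), ('E', 'C'), ('C', 'C')]

# each comparison gives True or False
flags = [a == b for a, b in pairs]
print(flags)  # [True, True, False, True]

# True counts as 1 and False as 0, so sum() counts the matches
count = sum(flags)
print(count)  # 3

# zip stops at the shorter string, like min(len(m), len(x)) in the question
short = sum(a == b for a, b in zip(base, 'AC'))
print(short)  # 2
```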