I am trying to count the occurrences of every pair of consecutive characters in a string.
The result should be a dictionary with key = the 2 characters and value = the number of occurrences.
I tried the following:
seq = "AXXTAGXXXTA"
d = {seq[i:i+2]: seq.count(seq[i:i+2]) for i in range(0, len(seq)-1)}
The problem is that the result for XX should be 3, not 2.
CodePudding user response:
You can use collections.Counter.
from collections import Counter
seq = "AXXTAGXXXTA"
Counter(seq[i:i+2] for i in range(len(seq)-1))
Output:
Counter({'AX': 1, 'XX': 3, 'XT': 2, 'TA': 2, 'AG': 1, 'GX': 1})
Or, without any imports, you can use dict.setdefault.
seq = "AXXTAGXXXTA"
d = {}
for i in range(len(seq)-1):
    key = seq[i:i+2]
    d[key] = d.setdefault(key, 0) + 1
print(d)
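As for why the original attempt gives 2 for XX: str.count counts non-overlapping occurrences, so the run XXX only contributes one XX. Here is a small sketch that shows the difference. The zip-based pairing is just another way to walk over consecutive characters, and the names non_overlapping and pair_counts are made up for illustration.

```python
from collections import Counter

seq = "AXXTAGXXXTA"

# str.count resumes searching after the end of each match,
# so the run "XXX" yields a single "XX" here
non_overlapping = seq.count("XX")

# Pairing every character with its successor also counts overlapping pairs
pair_counts = Counter(a + b for a, b in zip(seq, seq[1:]))

print(non_overlapping)    # 2
print(pair_counts["XX"])  # 3
```

Stepping through the string one index at a time (as in both approaches above) avoids the problem, because each position is counted exactly once.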