I'm working on this programming problem. The code is supposed to return clumps like this example,
Input: CGGACTCGACAGATGTGAAGAACGACAATGTGAAGACTCGACACGACAGAGTGAAGAGAAGAGGAAACATTGTAA 5 50 4
Output: CGACA GAAGA
Here is the code that I've used:
def frequency_table(text, kmer_len):
freq_map = {}
nt = len(text)
nk = kmer_len
for i in range(0, nt-nk):
pattern = text[i : i nk]
if not freq_map.get(pattern):
freq_map[pattern] = 1
else:
freq_map[pattern] = freq_map[pattern] 1
return freq_map
def FindClumps(Text, k, L, t):
Patterns = []
n = len(Text)
for i in range(n - L):
Window = str(Text[i:L])
freqMap = list(frequency_table(Window, k))
for s in range(len(freqMap)):
if len(freqMap[s]) >= t:
Patterns.append(freqMap[s])
return Patterns
Every time I submit the answer, It's said that I'm wrong.
Is there a problem with my code? Or is there an underlying concept that I don't understand?
CodePudding user response:
I found a couple mistakes. Below is the corrected code with inline comments explaining the changes.
def frequency_table(text, kmer_len):
freq_map = {}
nt = len(text)
nk = kmer_len
for i in range(0, nt-nk 1): # Range short by one
pattern = text[i : i nk]
if not freq_map.get(pattern):
freq_map[pattern] = 1
else:
freq_map[pattern] = freq_map[pattern] 1
return freq_map
def FindClumps(Text, k, L, t):
Patterns = []
n = len(Text)
for i in range(n - L):
Window = str(Text[i : i L]) # End range needs to be in relation to "i"
freqMap = frequency_table(Window, k)
# You can interate dictionary keys. Previously, you converted the
# dictionary to a list (which removed your values) in order to iterate
# over it. Your conditional then became nonsensical.
for s in freqMap:
if freqMap[s] >= t:
Patterns.append(s)
return list(set(Patterns)) # limit output to unique values only
print(FindClumps("CGGACTCGACAGATGTGAAGAACGACAATGTGAAGACTCGACACGACAGAGTGAAGAGAAGAGGAAACATTGTAA", 5, 50, 4))
Output
['GAAGA', 'CGACA']