Trying to create a sliding window that checks for repeats in a DNA sequence

Time:12-03

I'm trying to write a bioinformatics program that checks for certain repeats in a given string of nucleotides. The user inputs a pattern, and the program outputs how many times it is repeated, or even highlights where the repeats are. I've gotten a good start on it, but could use some help.

Below is my code so far.

while True:
    text = 'AGACGCCTGGGAACTGCGGCCGCGGGCTCGCGCTCCTCGCCAGGCCCTGCCGCCGGGCTGCCATCCTTGCCCTGCCATGTCTCGCCGGAAGCCTGCGTCGGGCGGCCTCGCTGCCTCCAGCTCAGCCCCTGCGAGGCAAGCGGTTTTGAGCCGATTCTTCCAGTCTACGGGAAGCCTGAAATCCACCTCCTCCTCCACAGGTGCAGCCGACCAGGTGGACCCTGGCGCTgcagcggctgcagcggccgcagcggccgcagcgCCCCCAGCGCCCCCAGCTCCCGCCTTCCCGCCCCAGCTGCCGCCGCACATA'
    print ("Input Pattern:")
    pattern = input("")


    def pattern_count(text, pattern):
        count = 0
        for i in range(len(text) - len(pattern) + 1):
            if text[i: i + len(pattern)] == pattern:
                count = count + 1
            return count


    print(pattern_count(text, pattern))

The issue is that I only get an output when the pattern comes from the beginning of the sequence (e.g. AGA or AGAC). Any help or recommendations would be greatly appreciated. Thank you so much!

CodePudding user response:

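Here is a modified version of your code. The main fix is that return count now sits outside the for loop. In your version it was indented inside the loop, so the function returned after checking only the first window, which is why only patterns at the very start of the sequence were found. This version also lets the user enter the string of nucleotides as well as the pattern, and then prints the number of times the pattern appears in the string. Note that the comparison is case-sensitive, so "AGC" and "agc" are treated as different patterns.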

def pattern_count(text, pattern):
    count = 0
    for i in range(len(text) - len(pattern) + 1):
        if text[i: i + len(pattern)] == pattern:
            count = count + 1
    return count

while True:
    print("Input the string of nucleotides:")
    text = input()

    print("Input the pattern to search for:")
    pattern = input()

    count = pattern_count(text, pattern)
    print("The pattern appears {} times in the string.".format(count))
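The question also asks about highlighting where the repeats are, and the sample sequence mixes upper- and lower-case bases. As a rough sketch only, the same sliding window can collect start positions instead of a count. The names find_pattern_positions and highlight, the ignore_case flag, and the square-bracket marking are assumptions made for illustration, not part of the original code.

```python
def find_pattern_positions(text, pattern, ignore_case=True):
    """Return the 0-based start index of every match, overlaps included."""
    if not pattern:
        return []  # an empty pattern has no meaningful positions
    if ignore_case:
        # Normalize both strings so 'gcagcg' and 'GCAGCG' compare equal.
        text, pattern = text.upper(), pattern.upper()
    width = len(pattern)
    # Same sliding window as pattern_count, but keeping the indices.
    return [i for i in range(len(text) - width + 1)
            if text[i:i + width] == pattern]


def highlight(text, pattern, ignore_case=True):
    """Return text with every matched stretch wrapped in [ ] (hypothetical format)."""
    covered = [False] * len(text)
    for start in find_pattern_positions(text, pattern, ignore_case):
        for j in range(start, start + len(pattern)):
            covered[j] = True

    pieces = []
    inside = False
    for ch, hit in zip(text, covered):
        if hit and not inside:
            pieces.append("[")   # a matched stretch begins
        elif not hit and inside:
            pieces.append("]")   # a matched stretch ends
        pieces.append(ch)
        inside = hit
    if inside:
        pieces.append("]")       # close a stretch that runs to the end
    return "".join(pieces)
```

With this sketch, len(find_pattern_positions(text, pattern)) gives the count, and overlapping matches are merged into a single bracketed stretch when highlighted.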

A simpler alternative is the built-in str.count() method, which avoids looping over the string and checking each substring manually. Be aware that count() only counts non-overlapping occurrences, so it can report fewer matches than the sliding-window loop: "AAAA".count("AA") returns 2, while the loop finds 3. If overlapping repeats matter for your analysis, keep the loop. Here is how you could modify your code to use this method:

def pattern_count(text, pattern):
    return text.count(pattern)

while True:
    print("Input the string of nucleotides:")
    text = input()

    print("Input the pattern to search for:")
    pattern = input()

    count = pattern_count(text, pattern)
    print("The pattern appears {} times in the string.".format(count))
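To see how the two counting approaches can disagree, here is a small sketch that compares str.count() with an overlap-aware count. It uses a zero-width lookahead from the standard re module as one possible way to count overlapping matches; the function names count_overlapping and count_non_overlapping are made up for this example.

```python
import re


def count_non_overlapping(text, pattern):
    # str.count() resumes searching after the end of each match.
    return text.count(pattern)


def count_overlapping(text, pattern):
    if not pattern:
        return 0  # avoid matching the empty pattern at every position
    # (?=...) is a lookahead: it matches at each start position without
    # consuming characters, so overlapping occurrences are all found.
    lookahead = "(?=" + re.escape(pattern) + ")"
    return sum(1 for _ in re.finditer(lookahead, text))


if __name__ == "__main__":
    sample = "GCGCG"
    print(count_non_overlapping(sample, "GCG"))  # 1
    print(count_overlapping(sample, "GCG"))      # 2
```

Which count is the right one depends on whether overlapping repeats should be treated as separate occurrences in your analysis.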