List comprehension in relation to biology

Time:11-10

I'm trying to write a function, using a list comprehension, that finds the length of an open reading frame, given a dictionary containing only the stop codons. The program reads three letters at a time; if those three letters form a stop codon, it stops and returns the number of letters read before it (the stop codon is NOT counted, nor is anything after it).

For example, nextStop2('AAAAAAAGTGGGTGCTAGGTTGGC') should return 15. Here is what I have so far, but Python keeps reporting a syntax error. Can anyone advise me on how to fix it?

def nextStop2(Seq):
    GeneticCodeStop = {'TAA':'X', 'TAG':'X', 'TGA':'X'}
    seq2 = ''.join(i if GeneticCodeStop[Seq[i:i+3]]!='X' else end_of_loop()
                   for i in range(0,len(Seq),3))
    return len(seq2)

A correct version using a plain for loop is below (provided by diggusbickus). I tried to convert it into a list comprehension but wasn't sure about the syntax.

def nextStop2(Seq):
    GeneticCodeStop = ['TAA', 'TAG', 'TGA']
    seq2 = ''
    for i in range(0, len(Seq), 3):
        codon = Seq[i:i+3]
        if codon in GeneticCodeStop:
            break
        seq2 += codon
    return len(seq2)

CodePudding user response:

itertools.takewhile (read comments bottom up):

''.join(  # joining them into a single string
    itertools.takewhile(
        lambda x: x not in GeneticCodeStop, # until a stop codon found
        (Seq[i:i+3] for i in range(0, len(Seq), 3))  # iterate codons
    )
)
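
The expression above assumes Seq and GeneticCodeStop already exist. As a rough sketch, it could be wrapped into a complete function as follows. The names next_stop_takewhile and STOP_CODONS are mine, not from the question, and using a set instead of a list is just one reasonable choice for membership tests:

```python
import itertools

# Hypothetical module-level constant; a set gives fast membership tests.
STOP_CODONS = {'TAA', 'TAG', 'TGA'}

def next_stop_takewhile(seq):
    """Return the number of letters before the first in-frame stop codon."""
    # Lazily split the sequence into consecutive three-letter codons.
    codons = (seq[i:i+3] for i in range(0, len(seq), 3))
    # Keep codons until the first stop codon; the stop codon itself is dropped.
    coding = itertools.takewhile(lambda codon: codon not in STOP_CODONS, codons)
    return len(''.join(coding))

print(next_stop_takewhile('AAAAAAAGTGGGTGCTAGGTTGGC'))  # 15
```

A comprehension on its own has no way to break out early, which is why takewhile (or an explicit loop) is needed here.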

CodePudding user response:

Building on Marat's answer. Returns the length of the string without allocating space for a copy:

import itertools

def nextStop2(Seq, GeneticCodeStop=['TAA', 'TAG', 'TGA']):
    return sum(3 for _ in
        itertools.takewhile(
            lambda x: x not in GeneticCodeStop,
            (Seq[i:i+3] for i in range(0, len(Seq), 3))
        ))
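
One caveat: sum(3 for ...) counts a trailing incomplete codon as three letters, so for 'AAAA' it returns 6, while the for loop version returns 4. If that matters, a possible alternative (not from the answers above, just a sketch) is to look for the index of the first stop codon with next() and a generator expression, falling back to the full length when no stop codon is found. The names next_stop_index and STOP_CODONS are hypothetical:

```python
# Hypothetical constant; a set gives fast membership tests.
STOP_CODONS = {'TAA', 'TAG', 'TGA'}

def next_stop_index(seq):
    """Return the offset of the first in-frame stop codon, or len(seq) if none."""
    return next(
        # Offsets 0, 3, 6, ... whose codon is a stop codon; next() takes the first.
        (i for i in range(0, len(seq), 3) if seq[i:i+3] in STOP_CODONS),
        len(seq),  # default when the sequence contains no in-frame stop codon
    )

print(next_stop_index('AAAAAAAGTGGGTGCTAGGTTGGC'))  # 15
```

Because the offset of the stop codon equals the number of letters before it, no intermediate string or count is built.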