DNA to Protein Python Function

Time:10-17

I'm a huge Python newbie trying to finish my code for translating DNA to RNA to protein. It should start collecting amino acids once 'Met' is found and stop once a 'STOP' codon is found, and I want it to return a list of strings. But somehow it ONLY returns ['Met'] for the DNA string below. I wonder where I'm going wrong...

e.g. translate('AAATACGTATTA') should return ['Met', 'His', 'Asn'].

def transcribe(str):

    dna = 'ATGC'
    rna = 'UACG'
    transcription = str.maketrans(dna, rna)
    return (str.translate(transcribe))

def translate(dna):

    codon_list = {'UUU':'Phe','UUC':'Phe','UUA':'Leu','UUG':'Leu','CUU':'Leu','CUC':'Leu',
       'CUA':'Leu','CUG':'Leu','AUU':'Ile','AUC':'Ile','AUA':'Ile','AUG':'Met',
       'GUU':'Val','GUC':'Val','GUA':'Val','GUG':'Val','UCU':'Ser','UCC':'Ser',
       'UCA':'Ser','UCG':'Ser','CCU':'Pro','CCC':'Pro','CCA':'Pro','CCG':'Pro',
       'ACU':'Thr','ACC':'Thr','ACA':'Thr','ACG':'Thr','GCU':'Ala','GCC':'Ala',
       'GCA':'Ala','GCG':'Ala','UAU':'Tyr','UAC':'Tyr','UAA':'STOP','UAG':'STOP',
       'CAU':'His','CAC':'His','CAA':'Gln','CAG':'Gln','AAU':'Asn','AAC':'Asn',
       'AAA':'Lys','AAG':'Lys','GAU':'Asp','GAC':'Asp','GAA':'Glu','GAG':'Glu',
       'UGU':'Cys','UGC':'Cys','UGA':'STOP','UGG':'Trp','CGU':'Arg','CGC':'Arg',
       'CGA':'Arg','CGG':'Arg','AGU':'Ser','AGC':'Ser','AGA':'Arg','AGG':'Arg',
       'GGU':'Gly','GGC':'Gly','GGA':'Gly','GGG':'Gly'}

    rna = translate(dna) 
    sequence_num = len(rna) 
    protein_start = 0

    for i in range(sequence_num):
        if(rna[i:i+3] == "AUG"):
            protein_start = i
            break

    protein = []
    for i in range(protein_start, sequence_num, 3):
        codon = codon_list[rna[i:i+3]]
        if (codon == "STOP"):
            break
        elif (codon == "Met"):
            protein.append(codon)
            return protein

Thank you :)

CodePudding user response:

    for i in range(protein_start, sequence_num, 3):
        codon = codon_list[rna[i:i+3]]
        if (codon == "STOP"):
            return protein  # or a break statement if more code follows
        else:
            protein.append(codon)

CodePudding user response:

You only append when codon == 'Met'. You can do a small cheat and just add a check statement:

passed_Met = False
for i in range(protein_start, sequence_num, 3):
    codon = codon_list[rna[i:i+3]]
    if (codon == "STOP"):
        return protein  # Returns the protein when 'STOP' is found

    elif (passed_Met):
        protein.append(codon)

    elif (codon == "Met"):
        protein.append(codon)
        passed_Met = True
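
The flag idea above can be tried in isolation. The following is only a minimal sketch: collect_after_met is a hypothetical helper (not from the question) that takes a list of already-translated amino acid names instead of an RNA string, and, unlike the loop above, it ignores a 'STOP' that appears before the first 'Met'.

```python
def collect_after_met(names):
    """Collect names from the first 'Met' up to, but not including, 'STOP'."""
    protein = []
    passed_met = False  # flips to True once the start codon has been seen
    for name in names:
        if not passed_met:
            # Skip everything until 'Met' shows up, then start collecting.
            if name == "Met":
                passed_met = True
                protein.append(name)
            continue
        if name == "STOP":
            break
        protein.append(name)
    return protein


print(collect_after_met(["Phe", "Met", "His", "Asn", "STOP", "Lys"]))
# ['Met', 'His', 'Asn']
```

Because the question's code already locates protein_start before the loop, the flag is not strictly needed there; a plain else branch does the same job.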

CodePudding user response:

I think the main missing part is at the end: you need to convert your elif statement to an else statement, which covers all the other amino acids. In your current code you only append when the codon is Met, and the return inside that branch exits the function right after the first append.

In your first function, the return line should pass transcription (the translation table) rather than transcribe (the function itself):

def transcribe(str):

    dna = 'ATGC'
    rna = 'UACG'
    transcription = str.maketrans(dna, rna)
    return (str.translate(transcription))

As a second point, inside translate you should change

rna = translate(dna) to rna = transcribe(dna)

because, as posted, translate calls itself instead of transcribing the DNA.
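
Putting the fixes from the answers together, a complete version might look like the sketch below. It makes a few assumptions that are not in the question: the input contains only uppercase A, T, G and C; an empty list is returned when no start codon is found; and a trailing fragment shorter than three bases is ignored. The name CODON_TABLE is just a renamed copy of the question's codon_list.

```python
CODON_TABLE = {
    'UUU': 'Phe', 'UUC': 'Phe', 'UUA': 'Leu', 'UUG': 'Leu', 'CUU': 'Leu', 'CUC': 'Leu',
    'CUA': 'Leu', 'CUG': 'Leu', 'AUU': 'Ile', 'AUC': 'Ile', 'AUA': 'Ile', 'AUG': 'Met',
    'GUU': 'Val', 'GUC': 'Val', 'GUA': 'Val', 'GUG': 'Val', 'UCU': 'Ser', 'UCC': 'Ser',
    'UCA': 'Ser', 'UCG': 'Ser', 'CCU': 'Pro', 'CCC': 'Pro', 'CCA': 'Pro', 'CCG': 'Pro',
    'ACU': 'Thr', 'ACC': 'Thr', 'ACA': 'Thr', 'ACG': 'Thr', 'GCU': 'Ala', 'GCC': 'Ala',
    'GCA': 'Ala', 'GCG': 'Ala', 'UAU': 'Tyr', 'UAC': 'Tyr', 'UAA': 'STOP', 'UAG': 'STOP',
    'CAU': 'His', 'CAC': 'His', 'CAA': 'Gln', 'CAG': 'Gln', 'AAU': 'Asn', 'AAC': 'Asn',
    'AAA': 'Lys', 'AAG': 'Lys', 'GAU': 'Asp', 'GAC': 'Asp', 'GAA': 'Glu', 'GAG': 'Glu',
    'UGU': 'Cys', 'UGC': 'Cys', 'UGA': 'STOP', 'UGG': 'Trp', 'CGU': 'Arg', 'CGC': 'Arg',
    'CGA': 'Arg', 'CGG': 'Arg', 'AGU': 'Ser', 'AGC': 'Ser', 'AGA': 'Arg', 'AGG': 'Arg',
    'GGU': 'Gly', 'GGC': 'Gly', 'GGA': 'Gly', 'GGG': 'Gly',
}


def transcribe(dna):
    """Map each DNA base to its complementary RNA base (A->U, T->A, G->C, C->G)."""
    transcription = str.maketrans('ATGC', 'UACG')
    return dna.translate(transcription)


def translate(dna):
    """Return amino acid names from the first AUG up to the first STOP codon."""
    rna = transcribe(dna)
    start = rna.find('AUG')  # first start codon, at any offset
    if start == -1:
        return []  # assumption: no start codon means no protein

    protein = []
    # Stop at len(rna) - 2 so an incomplete trailing codon is never looked up.
    for i in range(start, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == 'STOP':
            break
        protein.append(amino_acid)
    return protein


print(translate('AAATACGTATTA'))  # ['Met', 'His', 'Asn']
```

The parameter is named dna rather than str so that the built-in str type is not shadowed inside transcribe.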