Trying to get spaces between codons and stop the generation when reaching a certain codon for RNA to

Time:02-11

Here are some things I need help with.
But first, let me show the code.

from base_printer import *

def dna_complement(dna):
  coup = ""
  for letter in dna:
    if letter == "C":
      coup += "G"
    if letter == "G":
      coup += "C"
    if letter == "A":
      coup += "T"
    if letter == "T":
      coup += "A"
  return coup
  

def convert_to_rna(dna):
  coup2 = ""
  for letter in dna:
    if letter == "C":
      coup2 += "G"
    if letter == "G":
      coup2 += "C"
    if letter == "A":
      coup2 += "U"
    if letter == "T":
      coup2 += "A"
  return coup2

def translate(rna):
  amino_acid = ""
  for i in range(len(rna)-2):
    three_letter = rna[i:i+3]
    if three_letter in CODON_TABLE:
      amino_acid += CODON_TABLE[three_letter]
      i += 2
  return amino_acid

CODON_TABLE = {'UUU':'Phe','UUC':'Phe','UUA':'Leu','UUG':'Leu','CUU':'Leu','CUC':'Leu','CUA':'Leu','CUG':'Leu','AUU':'Ile','AUC':'Ile','AUA':'Ile','AUG':'Met','GUU':'Val','GUC':'Val','GUA':'Val','GUG':'Val','UCU':'Ser','UCC':'Ser','UCA':'Ser','UCG':'Ser','CCU':'Pro','CCC':'Pro','CCA':'Pro','CCG':'Pro','ACU':'Thr','ACC':'Thr','ACA':'Thr','ACG':'Thr','GCU':'Ala','GCC':'Ala','GCA':'Ala','GCG':'Ala','UAU':'Tyr','UAC':'Tyr','UAA':'STOP','UAG':'STOP','CAU':'His','CAC':'His','CAA':'Gln','CAG':'Gln','AAU':'Asn','AAC':'Asn','AAA':'Lys','AAG':'Lys','GAU':'Asp','GAC':'Asp','GAA':'Glu','GAG':'Glu','UGU':'Cys','UGC':'Cys','UGA':'STOP','UGG':'Trp','CGU':'Arg','CGC':'Arg','CGA':'Arg','CGG':'Arg','AGU':'Ser','AGC':'Ser','AGA':'Arg','AGG':'Arg','GGU':'Gly','GGC':'Gly','GGA':'Gly','GGG':'Gly'}

dna="AAGAGATGCCATTGTCCCCCGGCCTCCTGCTGCTGCTCTTAGCGGGGCCACATCGGCCACCGCTGCCCTGCCCCTGGAGGGTGGCCCCACCGGCCGTTACAGCGAGCATAC" 


def main():
  print("\nWelcome to the DNA program: The Code of Life.")
  print("\nSample DNA strand:\n")
  print("Regular DNA:")
  print_bases(dna)
  print("DNA after complement: ")
  dna2 = dna_complement(dna)
  print_bases(dna2)
  print("DNA after RNA conversion: ")
  rna = convert_to_rna(dna)
  print_bases(rna)
  print("The result of translation: ")
  amino_acid = translate(rna)
  print_bases(amino_acid)


main()

I was trying to print the codons. As everyone can see, the output of the translation prints successfully as:

PHESERLEUSERLEUTYRTHRARGGLYVALSTOPASNTHRGLNARGGLYGLYGLYGLYALAPROARGGLYGLUARGGLYASPTHRARGASPTHRARGASPTHRARGGLUARGGLUASNILESERARGALAPROPROPROARGGLYVALCYSVALSTOPSERALAPROARGGLYVALTRPGLYALAARGASPTHRARGGLYGLYASPTHRARGGLYGLYGLYASPTHRPROLEUSERPROPROHISTHRPROARGGLYGLYGLYVALTRPGLYALAPROARGGLYALAGLNASNMETCYSVALSERARGALALEUSERARGVALTYRMET

What I want now is to have spaces separating each of the codons in the program's output, and to stop the generation once it reaches UAA/UGA/UAG. The problem is that I don't know where to even start, which is embarrassing.

An example of output I want: Phe Ser Leu Ser Leu Tyr (Stops once reaching the UAA/UGA/UAG)

Would someone be willing to offer some tips?

Edit: Here are the contents of base_printer; it basically prints the output in different colors:

import colorama as cr
cr.init(autoreset=True)

def print_bases(string):
    """prints dna/rna bases with color coding for each base"""
    for char in string.upper():
        if char == "A":
            print(cr.Back.BLACK + cr.Fore.GREEN + cr.Style.BRIGHT + char, end="")
        elif char == "C":
            print(cr.Back.BLACK + cr.Fore.YELLOW + cr.Style.BRIGHT + char, end="")
        elif char == "T":
            print(cr.Back.BLACK + cr.Fore.BLUE + cr.Style.BRIGHT + char, end="")
        elif char == "G":
            print(cr.Back.BLACK + cr.Fore.MAGENTA + cr.Style.BRIGHT + char, end="")
        elif char == "U":
            print(cr.Back.BLACK + cr.Fore.WHITE + char, end="")
        else:
            print(char, end="")
    print()

CodePudding user response:

Here is how I would do it:

def translate(rna):
    amino_acid = []
    for i in range(len(rna) - 2):
        three_letter = rna[i:i + 3]
        if three_letter in ['UAA', 'UGA', 'UAG']:
            break
        if three_letter in CODON_TABLE:
            amino_acid.append(CODON_TABLE[three_letter])
            i += 2
    return ' '.join(amino_acid)

CodePudding user response:

Assuming you're trying to print everything prior to 'STOP' sliced into 3 characters each, here's an extension of your main function:

    res = amino_acid[:amino_acid.index('STOP')] if 'STOP' in amino_acid else amino_acid
    res = ' '.join(res[i:i + 3] for i in range(0, len(res), 3))
    print(res)

Alternatively, if you only want to translate until you reach a stop codon and want the amino acids already separated by spaces, here's a modification of your translate function:

def translate(rna):
    amino_acid = ""
    for i in range(len(rna) - 2):
        three_letter = rna[i:i + 3]
        if three_letter in CODON_TABLE:
            next_aa = CODON_TABLE[three_letter]
            if next_aa == 'STOP':
                return amino_acid.strip()
            amino_acid += ' ' + next_aa
            i += 2
    return amino_acid.strip()

Output:

Phe Ser Leu Ser Leu Tyr Thr Arg Gly Val
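
One caveat about the loops in this thread: assigning to i inside a for loop over range() does not change which index comes next, so the three-letter window still slides one base at a time. That is why the output above is built from overlapping triplets. If non-overlapping codons are what you want, a minimal sketch follows, assuming the reading frame starts at index 0. The names translate_in_frame, STOP_CODONS, and demo_table are made up for this illustration and are not part of the original program.

```python
# Hypothetical helper names; only the codon-to-amino-acid mapping idea
# comes from the question's CODON_TABLE.
STOP_CODONS = {"UAA", "UAG", "UGA"}


def translate_in_frame(rna, codon_table):
    """Translate non-overlapping codons, starting at index 0, until a stop codon."""
    amino_acids = []
    # The third argument to range() is the step, so i takes 0, 3, 6, ...
    for i in range(0, len(rna) - 2, 3):
        codon = rna[i:i + 3]
        if codon in STOP_CODONS:
            break  # stop translating at the first stop codon in this frame
        if codon in codon_table:
            amino_acids.append(codon_table[codon])
    return " ".join(amino_acids)


# Small illustrative subset of the question's CODON_TABLE.
demo_table = {"UUC": "Phe", "UCU": "Ser", "ACG": "Thr", "UAA": "STOP"}
print(translate_in_frame("UUCUCUACGUAAUUC", demo_table))  # Phe Ser Thr
```

With the full CODON_TABLE from the question you would call translate_in_frame(rna, CODON_TABLE). Note that the result will differ from the output shown above, because only every third window is read.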