dictionary = {1: ['A', 'U'],
              2: ['C', 'G'],
              3: ['G', 'C'],
              4: ['T', 'A']}

def transcribe(S):
    """Converts a single-character c from DNA
    nucleotide to its complementary RNA nucleotide
    """
    if S == '':
        return ''
    for i in dictionary:
        S = S.replace(dictionary[i][0], dictionary[i][1])
    return S
Above is my code so far. Below are the tests I am running.
print("Function 6 Tests")
print( "transcribe('ACGTTGCA') should be 'UGCAACGU' :", transcribe('ACGTTGCA') )
print( "transcribe('ACG TGCA') should be 'UGCACGU' :", transcribe('ACG TGCA') ) # Note that the space disappears
print( "transcribe('GATTACA') should be 'CUAAUGU' :", transcribe('GATTACA') )
print( "transcribe('cs5') should be '' :", transcribe('cs5') ) # Note that other characters disappear
print( "transcribe('') should be '' :", transcribe('') ) # Empty strings!
Function 6 Tests
transcribe('ACGTTGCA') should be 'UGCAACGU' : UCCAACCU
transcribe('ACG TGCA') should be 'UGCACGU' : UCC ACCU
transcribe('GATTACA') should be 'CUAAUGU' : CUAAUCU
transcribe('cs5') should be '' : cs5
transcribe('') should be '' :
Above are the results I am getting.
1) I don't understand why C will not convert into G even though I listed it in the dictionary.
2) Is there a way to modify the first if statement so that entering anything other than A, T, C, or G results in '' being printed?
3) Also, how do I get rid of the space between ACG and TGCA?
Thank you very much.
CodePudding user response:
Consider:
>>> a = "hello"
>>> a = a.replace('l', 'x')
>>> a
'hexxo'
>>> a = a.replace('x', 'l')
>>> a
'hello'
>>>
Your loop does the same thing: one entry converts every C to G, and then the next entry converts every G, including the ones it just created, back to C.
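To make the effect visible on the actual input, here is a small sketch that records the string after each replace call. The helper name trace is made up for illustration; the table is the one from the question, and the loop relies on dictionaries keeping insertion order (Python 3.7+).

```python
# The table from the question: each value is [old, new].
dictionary = {1: ['A', 'U'],
              2: ['C', 'G'],
              3: ['G', 'C'],
              4: ['T', 'A']}

def trace(s):
    """Return the string as it looks after each successive replace call."""
    steps = [s]
    for old, new in dictionary.values():
        s = s.replace(old, new)
        steps.append(s)
    return steps

for step in trace('ACGTTGCA'):
    print(step)
# ACGTTGCA   original
# UCGTTGCU   after A -> U
# UGGTTGGU   after C -> G
# UCCTTCCU   after G -> C (this also undoes the previous step)
# UCCAACCU   after T -> A
```

The last line is exactly the wrong result shown in the question.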
Try having a dictionary that maps a character to the character to replace with:
d = {'A': 'U', 'C': 'G', 'G': 'C', 'T': 'A'}
Now you can do something like the following, where you only convert each character once.
>>> d = {'A': 'U', 'C': 'G', 'G': 'C', 'T': 'A'}
>>> d
{'A': 'U', 'C': 'G', 'T': 'A', 'G': 'C'}
>>> ''.join(d[ch] for ch in "ACTG")
'UGAC'
>>>
This assumes that the string you're working on only contains A, C, G, or T.
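If the input can contain other characters, d[ch] raises a KeyError. One possible way around that, sketched here with two made-up function names, is to skip any character that has no entry in d. That would also cover the space and the 'cs5' case from the question.

```python
d = {'A': 'U', 'C': 'G', 'G': 'C', 'T': 'A'}

def transcribe_strict(s):
    # Raises KeyError on anything other than A, C, G, or T.
    return ''.join(d[ch] for ch in s)

def transcribe_lenient(s):
    # Silently skips characters that have no entry in d.
    return ''.join(d[ch] for ch in s if ch in d)

print(transcribe_lenient('ACG TGCA'))  # UGCACGU
print(repr(transcribe_lenient('cs5')))  # ''
```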
CodePudding user response:
replace replaces all instances. The problem with ACGTTGCA is that once you replace every C by G, the next iteration replaces every G, including the ones that were just converted, by C again.
Make dictionary a mapping from letters in S to the replacement letters. Then simply use it in a loop to replace each letter exactly once:
# make the dictionary that maps the first list element to the second
d = {k: v for k, v in dictionary.values()}

def transcribe(S):
    """Converts a single-character c from DNA
    nucleotide to its complementary RNA nucleotide
    """
    if S == '':
        return ''
    # get dict values from S; characters without an entry become ''
    return ''.join(d.get(k, '') for k in S)
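As a quick check, the five cases from the question can be run against this version. This is only a sketch: it repeats the definitions above so it runs on its own, and the cases list is just the questioner's expected values collected in one place.

```python
dictionary = {1: ['A', 'U'],
              2: ['C', 'G'],
              3: ['G', 'C'],
              4: ['T', 'A']}
d = {k: v for k, v in dictionary.values()}

def transcribe(S):
    # Each character is looked up once; unknown characters map to ''.
    return ''.join(d.get(k, '') for k in S)

# (input, expected output) pairs taken from the question
cases = [('ACGTTGCA', 'UGCAACGU'),
         ('ACG TGCA', 'UGCACGU'),
         ('GATTACA', 'CUAAUGU'),
         ('cs5', ''),
         ('', '')]

for dna, expected in cases:
    result = transcribe(dna)
    print(repr(dna), '->', repr(result), 'OK' if result == expected else 'MISMATCH')
```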
CodePudding user response:
Maybe it's worth considering moving to an intermediate alphabet:
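from typing import Dict, Final

# Maps each DNA nucleotide to a placeholder digit, so that none of the
# replace calls below can touch a letter produced by an earlier one.
DNA_2_RNA: Final[Dict[str, str]] = {
    "A": "1",
    "C": "2",
    "G": "3",
    "T": "4",
}

def transcribe(dna: str) -> str:  # rna
    temp = ""
    for nucleotide in dna:
        # Append the placeholder; anything other than A, C, G, T is dropped.
        temp += DNA_2_RNA.get(nucleotide, "")
    return (
        temp
        .replace("1", "U")
        .replace("2", "G")
        .replace("3", "C")
        .replace("4", "A")
    )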
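For comparison, Python's built-in str.maketrans and str.translate do the whole substitution in a single pass, so no intermediate alphabet is needed. This is a different technique from the one above, offered only as a sketch; the names TABLE and transcribe_translate are made up, and the filter step assumes that anything other than A, C, G, T should simply be dropped.

```python
# Translation table: A->U, C->G, G->C, T->A, applied character by character.
TABLE = str.maketrans('ACGT', 'UGCA')

def transcribe_translate(dna: str) -> str:
    # Keep only valid nucleotides, then map each one exactly once.
    kept = ''.join(ch for ch in dna if ch in 'ACGT')
    return kept.translate(TABLE)

print(transcribe_translate('ACGTTGCA'))  # UGCAACGU
print(transcribe_translate('ACG TGCA'))  # UGCACGU
```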