Using a for loop to replace bad nucleotides in a DNA sequence

Time:12-29

I have a list of sequences (simplified here to the following):

seqList=["ACCTGCCSSSTTTCCT","ACCTGCCFFFTTTCCT"]

and I want to use a for loop to replace every character that is not one of the nucleotides ["A","C","G","T"] with "N".

My code so far:

seqList=["ACCTGCCSSSTTTCCT","ACCTGCCFFFTTTCCT"]
for x in range(len(seqList)):
    for i in range(len(seqList[x])):
        if seqList[x][i] not in ["A","C","G","T"]:
            seqList[x][i].replace(seqList[x][i],"N")
            print(seqList)

The problem is that the nucleotides are not replaced: the original sequences stay unchanged, and I can't figure out why.

CodePudding user response:

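Strings in Python are immutable: str.replace() returns a new string instead of modifying the original, and your code discards that result. You can make it work like this: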

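seqList = ["ACCTGCCSSSTTTCCT", "ACCTGCCFFFTTTCCT"]
for x in range(len(seqList)):
    # Convert the string to a list, whose items can be reassigned
    stringl = list(seqList[x])
    for i in range(len(stringl)):
        if stringl[i] not in ["A", "C", "G", "T"]:
            stringl[i] = "N"
    # Join the characters back into a string and store it in the list
    seqList[x] = "".join(stringl)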
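
For what it's worth, here is a minimal sketch of the same idea written with a generator expression. It also shows why the original attempt had no visible effect. The helper name clean_sequence and the VALID set are made up for illustration and are not from the question.

```python
# Hypothetical helper (not from the question): rebuilds the string
# character by character instead of trying to modify it in place.
VALID = set("ACGT")

def clean_sequence(seq):
    """Return a copy of seq with every non-ACGT character replaced by 'N'."""
    return "".join(base if base in VALID else "N" for base in seq)

# str.replace returns a new string; if the result is not assigned,
# the original string is left exactly as it was.
original = "ACCTGCCSSSTTTCCT"
original.replace("S", "N")
assert original == "ACCTGCCSSSTTTCCT"

seqList = ["ACCTGCCSSSTTTCCT", "ACCTGCCFFFTTTCCT"]
seqList = [clean_sequence(s) for s in seqList]
print(seqList)  # ['ACCTGCCNNNTTTCCT', 'ACCTGCCNNNTTTCCT']
```

Note that this sketch treats lowercase letters as invalid too. If your data may contain lowercase bases, you would probably want to call seq.upper() first.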

CodePudding user response:

An approach that avoids checking every character one at a time is to collect the distinct characters that are not A, C, G, or T and replace each of them throughout the sequence:

def replace_bad(seq):
    unique = [
        letter
        for letter in set(seq)
        if letter not in "ACGT"
    ]

    for each in unique:
        seq = seq.replace(each, "N")

    return seq


if __name__ == '__main__':
    # seqList as defined in the question
    seqList = ["ACCTGCCSSSTTTCCT", "ACCTGCCFFFTTTCCT"]
    for seq in seqList:
        print(replace_bad(seq))
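
If a regular expression is acceptable, the same replacement can probably be done in a single call with re.sub and a negated character class. This is only a sketch of an alternative, not what the answer above does. The names NON_ACGT and replace_bad_regex are invented for this example.

```python
import re

# Negated character class: matches any single character that is NOT A, C, G, or T.
NON_ACGT = re.compile(r"[^ACGT]")

def replace_bad_regex(seq):
    """Hypothetical helper: return seq with every non-ACGT character replaced by 'N'."""
    return NON_ACGT.sub("N", seq)

if __name__ == "__main__":
    seqList = ["ACCTGCCSSSTTTCCT", "ACCTGCCFFFTTTCCT"]
    for seq in seqList:
        print(replace_bad_regex(seq))
```

Like the other versions, this returns a new string, so the result has to be stored if the list itself should be updated, e.g. seqList = [replace_bad_regex(s) for s in seqList].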