I'm getting a bit more used to Python now, and my professor has started teaching us about lists, string manipulation, list slicing and so on. He proposed an exercise which I'm pretty sure can be solved using just list slices, without any very specific built-in Python functions, but I have no idea how to start thinking about the problem. It goes as follows:
"Create a function, lets call it "generate_n_gaps", that has as parameters a string 'dna' of any length that includes any combination of the characters in the string DNA = 'ATCG' (just like the dna sequencing we see in biology) and a gap that we will denote as GAP = '_' , and an integer parameter 'n'. The function must return a list with all variations of 'dna' containing up to n extra gaps without repetition."
Here are a couple of examples:
In [1]: generate_n_gaps( 'T', 2 )
Out[1]: ['T', '_T', 'T_', '__T', '_T_', 'T__']
In [2]: generate_n_gaps( 'CA', 2 )
Out[2]: ['CA', '_CA', 'C_A', 'CA_', '__CA', '_C_A', '_CA_', 'C__A', 'C_A_', 'CA__']
In [3]: generate_n_gaps( 'C_A', 2 )
Out[3]: ['C_A', '_C_A', 'C__A', 'C_A_', '__C_A', '_C__A', '_C_A_', 'C___A', 'C__A_', 'C_A__']
The function should be defined as follows:
def generate_n_gaps( dna, n = 1 ):
As requested, I have worked on the problem a bit and written some code that manages to generate the variations needed. I wrote another function that inserts only one gap, and used it in the main one.
def generate_n_gaps( dna, n = 1 ):
    last=generate_gaps(dna)
    a=len(last)
    for i in range(a):
        b=generate_gaps(last[i])
        last.append(b)
    return last
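GAP = '_'  # gap character, as defined in the exercise

def generate_gaps( dna ):
    # insert one gap at every possible position of dna
    comb=[]
    for i in range(0 , len(dna) + 1):
        partial=dna[:i] + GAP + dna[i:]
        comb.append(partial)
    # drop duplicates while keeping the original order
    last=[]
    for i in comb:
        if i not in last:
            last.append(i)
    return last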
This gets me the variations I need, but the list my function returns is a bit messy. How would I go about cleaning it up? By that I mean: is there any way to 'combine' the lists, removing the nested lists inside the main list? I reckon that once I manage that, removing the duplicates won't really be an issue.
This is what my function returns for the first example:
In [1]: generate_n_gaps( 'T', 2 )
Out[1]: ['_T', 'T_', ['__T', '_T_'], ['_T_', 'T__']]
CodePudding user response:
This will flatten your list of lists. Hope it helps.
l = ['_T', 'T_', ['__T', '_T_'], ['_T_', 'T__']]
flat_l = []
for item in l:
    if isinstance(item, str):
        # plain strings are kept as they are
        flat_l.append(item)
    else:
        # nested lists are unpacked one element at a time
        for j in item:
            flat_l.append(j)
print(flat_l)
#['_T', 'T_', '__T', '_T_', '_T_', 'T__']
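As a possible follow-up, here is a minimal sketch of how the nesting could be avoided in the first place, rather than flattened afterwards. It assumes GAP = '_' as in the exercise and reuses the idea of the question's one-gap helper generate_gaps; the names level and next_level are only illustrative and do not come from the original code. Each round inserts one more gap into the strings produced by the previous round, and list.extend adds the strings themselves instead of a nested list:

```python
GAP = '_'  # gap character, as defined in the exercise


def generate_gaps(dna):
    """Return every distinct string made by inserting one GAP into dna."""
    result = []
    for i in range(len(dna) + 1):
        candidate = dna[:i] + GAP + dna[i:]
        if candidate not in result:  # skip duplicates, keep order
            result.append(candidate)
    return result


def generate_n_gaps(dna, n=1):
    """Return dna plus all its variations with up to n extra gaps."""
    result = [dna]
    level = [dna]  # strings with exactly k extra gaps (illustrative name)
    for _ in range(n):
        next_level = []
        for s in level:
            for candidate in generate_gaps(s):
                if candidate not in next_level:
                    next_level.append(candidate)
        result.extend(next_level)  # extend adds the items, not a sublist
        level = next_level
    return result


print(generate_n_gaps('T', 2))
```

With this version, generate_n_gaps('T', 2) should give ['T', '_T', 'T_', '__T', '_T_', 'T__'], which matches the first example in the question. Duplicates only need to be checked within one round, since strings from different rounds have different lengths.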