hi was solving this question where i had to return how long the longest common subsequence(lcs) is and i was able to form a recursive function that does the same but now i wonder how to make this function return what the lcs is.
like for example
seq1 = 'serendipitous'
seq2 = 'precipitation'
lcs of seq1 and seq2 is - reipito
i want to make a function that return 'reipito'
the function that i wrote to return the length of lcs is
def lcs_recursive(seq1, seq2, idx1 = 0, idx2 = 0):
if idx1 == len(seq1) or idx2 == len(seq2):
return 0
if seq1[idx1] == seq2[idx2]:
return 1 lcs_recursive(seq1, seq2, idx1 1, idx2 1)
else:
option1 = lcs_recursive(seq1, seq2, idx1 1, idx2)
option2 = lcs_recursive(seq1, seq2, idx1, idx2 1)
return max(option1, option2)
this function returns the length how lcs
i tired to do this but failed. please help
CodePudding user response:
To convert posted code to return the LCS rather than the length.
def lcs_recursive(seq1, seq2, idx1 = 0, idx2 = 0):
if idx1 == len(seq1) or idx2 == len(seq2):
return ""
if seq1[idx1] == seq2[idx2]:
return seq1[idx1] lcs_recursive(seq1, seq2, idx1 1, idx2 1)
else:
option1 = lcs_recursive(seq1, seq2, idx1 1, idx2)
option2 = lcs_recursive(seq1, seq2, idx1, idx2 1)
return max(option1, option2, key = len)
Test
seq1 = 'serendipitous'
seq2 = 'precipitation'
print(lcs_recursive(seq1, seq2))
# Output: 'reipito'
CodePudding user response:
With minimal changes to your code:
def lcs_recursive(seq1, seq2, idx1 = 0, idx2 = 0, lcs = ''):
if idx1 == len(seq1) or idx2 == len(seq2):
return lcs
if seq1[idx1] == seq2[idx2]:
return lcs_recursive(seq1, seq2, idx1 1, idx2 1, lcs seq1[idx1])
else:
option1 = lcs_recursive(seq1, seq2, idx1 1, idx2, lcs)
option2 = lcs_recursive(seq1, seq2, idx1, idx2 1, lcs)
return max(option1, option2, key=len)
seq1 = 'serendipitous'
seq2 = 'precipitation'
print(lcs_recursive(seq1, seq2)) # reipito
Since you're likely calling lcs_recursive
with the same index pairs multiple times, functools.cache
if using Python >= 3.9 or functools.lru_cache
if using Python >= 3.2 may improve performance.
For example, if seq1 = 'ab'
and seq2 = 'bd'
, lcs_recursive('ab', 'bd')
will eventually call lcs_recursive('ab', 'bd', 2, 1)
twice (you can see this by doing print(idx1, idx2)
first thing in your function). With the @lru_cache
optimization bellow, on the first call to (idx1=2, idx2=1)
(i.e. lcs_recursive('ab', 'bd', 2, 1)
), the result corresponding to (idx1=2, idx2=1)
is computed and stored (in a cache, which you can think of as a dictionary). On subsequent calls to the same index pair ((idx1=2, idx2=1)
), the stored result is looked up so that it doesn't have to be computed all over again. In general, lru_cache
will memoize results based on the input. That's why in the code below, the function helper
only takes idx1
and idx2
- to make it easier for lru_cache
to work well.
from functools import lru_cache
def lcs_recursive(seq1, seq2):
@lru_cache
def helper(idx1 = 0, idx2 = 0):
if idx1 == len(seq1) or idx2 == len(seq2):
return ''
if seq1[idx1] == seq2[idx2]:
return seq1[idx1] helper(idx1 1, idx2 1)
option1 = helper(idx1 1, idx2)
option2 = helper(idx1, idx2 1)
return max(option1, option2, key=len)
return helper()
seq1 = 'serendipitous'
seq2 = 'precipitation'
print(lcs_recursive(seq1, seq2)) # reipito
As a comparison, with seq1 = 'serendipitous'
and seq2 = 'precipitation'
lcs_recursive_uncached(seq1, seq2)
makes 1119564 recursive calls, whereas lcs_recursive_cached(seq1, seq2)
makes just 184 recursive calls!:
number_of_calls_uncached = 0
def lcs_recursive_uncached(seq1, seq2, idx1 = 0, idx2 = 0, lcs = ''):
global number_of_calls_uncached
number_of_calls_uncached = 1
if idx1 == len(seq1) or idx2 == len(seq2):
return lcs
if seq1[idx1] == seq2[idx2]:
return lcs_recursive_uncached(seq1, seq2, idx1 1, idx2 1, lcs seq1[idx1])
else:
option1 = lcs_recursive_uncached(seq1, seq2, idx1 1, idx2, lcs)
option2 = lcs_recursive_uncached(seq1, seq2, idx1, idx2 1, lcs)
return max(option1, option2, key=len)
seq1 = 'serendipitous'
seq2 = 'precipitation'
print(number_of_calls_uncached) # 0
print(lcs_recursive_uncached(seq1, seq2)) # reipito
print(number_of_calls_uncached) # 1119564
from functools import lru_cache
number_of_calls_cached = 0
def lcs_recursive_cached(seq1, seq2):
@lru_cache
def helper(idx1 = 0, idx2 = 0):
global number_of_calls_cached
number_of_calls_cached = 1
if idx1 == len(seq1) or idx2 == len(seq2):
return ''
if seq1[idx1] == seq2[idx2]:
return seq1[idx1] helper(idx1 1, idx2 1)
option1 = helper(idx1 1, idx2)
option2 = helper(idx1, idx2 1)
return max(option1, option2, key=len)
return helper()
seq1 = 'serendipitous'
seq2 = 'precipitation'
print(number_of_calls_cached) # 0
print(lcs_recursive_cached(seq1, seq2)) # reipito
print(number_of_calls_cached) # 184