Home > Software design >  Find Longest Alphabetically Ordered Substring - Efficiently
Find Longest Alphabetically Ordered Substring - Efficiently

Time:01-30

The goal of some a piece of code I wrote is to find the longest alphabetically ordered substring within a string.

"""
Find longest alphabetically ordered substring in string s.
"""

s = 'zabcabcd' # Test string.

alphabetical_str, temp_str = s[0], s[0]
for i in range(len(s) - 1):  # Loop through string.
    if s[i] <= s[i   1]:  # Check if next character is alphabetically next.
        temp_str  = s[i   1]  # Add character to temporary string.
        if len(temp_str) > len(alphabetical_str):  # Check is temporary string is the longest string.
            alphabetical_str = temp_str  # Assign longest string.
    else:
        temp_str = s[i   1]  # Assign last checked character to temporary string.
print(alphabetical_str)

I get an output of abcd.

But the instructor says there is PEP 8 compliant way of writing this code that is 7-8 lines of code and there is a more computational efficient way of writing this code that is ~16 lines. Also that there is a way of writing this code in only 1 line 75 character!

Can anyone provide some insight on what the code would look like if it was 7-8 lines or what the most work appropriate way of writing this code would be? Also any PEP 8 compliance critique would be appreciated.

CodePudding user response:

Depending on how you choose to count, this is only 6-7 lines and PEP 8 compliant:

def longest_alphabetical_substring(s):
    sub = '', 0
    for i in range(len(s)):
        j = i   len(sub)   1
        while list(s[i:j]) == sorted(s[i:j]) and j <= len(s):
            sub, j = s[i:j], j 1
    return sub


print(longest_alphabetical_substring('zabcabcd'))

Your own code was PEP 8 compliant as far as I can tell, although it would make sense to capture code like this in a function, for easy reuse and logical grouping for improved readability.

The solution I provided here is not very efficient, as it keeps extracting copies of the best result so far. A slightly longer solution that avoids this:

def longest_alphabetical_substring(s):
    n = m = 0
    for i in range(len(s)):
        for j in range(i 1, len(s) 1):
            if j == len(s) or s[j] < s[j-1]:
                if j-i > m-n:
                    n, m = i, j
                break
    return s[n:m]


print(longest_alphabetical_substring('zabcabcd'))

There may be more efficient ways of doing this; for example you could detect that there's no need to keep looking because there is not enough room left in the string to find longer strings, and exit the outer loop sooner.

CodePudding user response:

Linear time:

s = 'zabcabcd'

longest = current = []
for c in s:
    if [c] < current[-1:]:
        current = []
    current  = c
    longest = max(longest, current, key=len)

print(''.join(longest))

Yours isn't PEP 8 compliant since it has two lines longer than 79 characters and relies on repeated string concatenation being fast. Also, it crashes if the input string is empty.

1 line 75 characters:

s='zabcabcd';t='';print(max([t:=t*(c>=t[-1:]) c for c in s]or[''],key=len))

CodePudding user response:

You can keep appending characters from the input string to a candidate list, but clear the list when the current character is lexicographically smaller than the last character in the list, and set the candidate list as the output list if it's longer than the current output list. Join the list into a string for the final output:

s = 'zabcabcdabc'
candidate = longest = []
for c in s:
    if candidate and c < candidate[-1]:
        candidate = []
    candidate.append(c)
    if len(candidate) > len(longest):
        longest = candidate
print(''.join(longest))

This outputs:

abcd
  • Related