two commands in comprehension-CodePudding

Sorry if the title is inaccurate but I'm unsure how to title this as I am kind of a rookie.

I am trying to solve a challenge:

"Complete the function scramble(str1, str2) that returns True if a portion of s1 characters can be rearranged to match s2, otherwise returns False."

So first i wanted to complete it like this:

return True if "".join(str(e) for e in s2 if e in s1 and s1.count(e) >= s2.count(e)) == s2 else False

However, this times out and apparently is not sufficiently optimized.

Oblivious as I am I assume it has something to do with the count() method so instead I am trying to remove e from s1 below if e is in s1 - because when a character from s2 is found in s1 it can not be used again (unless more of same character exist in s1. So for example s2 = 'stress' would need 3 x 's' occurences in s1 for the function to return True.

however, in the code below it seem like this part: s1.replace(e, '', 1) does not do anything when I print before/after.

def scramble(s1, s2):
    print(s1)
    ("".join(str(e), s1.replace(e, '', 1)) for e in s2 if e in s1)
    print(s1)
scramble('abcdefghijklmnopqrstuvzxy', 'stress')

What can I do?

CodePudding user response：

Your algorithm iterates over each character in s2, and then finds the character count of that character for s1 and s2. For ease of notation, let m = len(s1) and n = len(s2). Then, each .count() operation takes linear time in the number of characters in the string invoking .count(). So, we have O(n) iterations, and O(m n) work per iteration -- our final runtime is O(n*m n^2). If s2 is really long, we're going to see a lot of slowdown.

We can do better, by generating the counts before beginning to analyze them.

# You can use collections.Counter() here, but I've chosen to forgo this to make the runtime analysis clearer.
def count_characters(s):
    chars = {}
    for char in s:
        if char in chars:
            chars[char]  = 1
        else:
            chars[char] = 1
    return chars

def scramble2(s1, s2):
    s1_characters = count_characters(s1)
    s2_characters = count_characters(s2)
   
    for char in s2_characters:
        if char not in s1_characters or s1_characters[char] < s2_characters[char]:
            return False
    
    return True

Let's analyze the runtime of the revised algorithm:

count_characters() is linear in the length of the input string. So the character counters take O(m n) time to generate. Then, we need to examine each counter in s2_characters, and compare it with a counter in s1_characters. Regardless of whether you assume the alphabet is constant size, this takes O(n) time in the worst case. So, the final runtime is O(m n) -- i.e. linear in the lengths of the input strings.