# find all possible sorted substrings of s
substr = ["".join(sorted(s[i: j]))
for i in range(len(s))
for j in range(i 1, len(s) 1)]
I know the sorted()
method is O(nlog(n)), and the finding of all possible substrings is O(n^2). However, the .join()
is what is throwing me off. sorted()
returns a list and .join()
iterates over each element in the list to append it to a string. So it should be a linear operation.
Would my substring sorter therefore be O(n^2) for the nested loop * O(nlogn) for sorting each result * O(n) for joining? Ergo O(n^4logn)??
If so, would breaking up the operations make this more efficient? I had another implementation where I move the sorting of the substrings to a second list comprehension
substr = [s[i: j] for i in range(len(s))
for j in range(i 1, len(s) 1)]
sortedSubstr = ["".join(sorted(s)) for s in substr]
This would make it the O(n^2) list comprehension first O(n)*O(nlogn) for the second list comprehension
Making the overall program now O(n^2logn)
Please correct me if I am wrong. Thanks
CodePudding user response:
For the sake of analysis, let's replace the comprehensions with their equivalent loops as described in the docs.
Your first approach becomes
substr = []
for i in range(len(s)): # O(n)
for j in range(i 1, len(s) 1): # O(n)
substr.append("".join(sorted(s[i: j]))) # O(n*logn)
# Overall O(n^3 * logn)
Note that substr.append("".join(sorted(s[i: j])))
is not multiplicative, rather it is a sequential combination of the following operations (no actual assignments happen)
temp = s[i:j] # O(j-i), which worst case we can take O(n)
sorted_temp = sorted(temp) # O(n*logn)
joined_temp = "".join(sorted_temp) # O(n)
substr.append(joined_temp) # amortized O(1)
# Overall O(n*logn)
Now in the second approach the code becomes
substr = []
for i in range(len(s)): # O(n)
for j in range(i 1, len(s) 1): # O(n)
substr.append(s[i:j]) # O(n) for the slice
# first loop: O(n^3)
sortedSubstr = []
for s in substr: # O(n^2), we have appended n^2 times in the first loop
sortedSubstr.append("".join(sorted(s))) # O(n*logn)
# second loop: O(n^3 * logn)
# Overall O(n^3 * logn)
So as you can see, they should be the same time complexity. I almost missed substr
length being n^2
rather than n
, so that might be a pitfall.
References for time complexity of various operations:
CodePudding user response:
For the first algorithm time complexity is O(n^3*log(n))
, because after two loop you are not making a join
for each atomic action in sorted
. You separately sort and join. So O(n) O(n*log(n)) = O(n*log(n))
, which multiplied by O(n^2)
(nested loops) gives us O(n^3*log(n))
.
About the second algorithm.
- Calculation of
substr
gives usO(n^3)
: sameO(n^2)
for nested loops multiplied byO(n)
for slicings
. - Note that
len(substr)
isO(n^2)
— for each(i, j)
from nested loops. - Calculation of
sortedSubstr
gives usO(n^3*log(n))
: for each element ofsubstr
(whose count isO(n^2)
) we callsorted
. Each element'slen
isO(n)
, sosorted(s)
gives usO(n*log(n))
. So, samely,O(n^2) * O(n*log(n)) = O(n^3*log(n))
. - Calculation of
substr
(O(n^3)
) plus calculation ofsortedSubstr
(O(n^3*log(n))
) yieldsO(n^3*log(n))
.
So the same O(n^3*log(n))
for the second.
CodePudding user response:
In this expression:
"".join(sorted(s[i: j]))
the O(n) join
is negligible, because you're doing it once, after you've done the O(nlogn) sort
.
O(nlogn n)
= O((n 1)logn)
= O(nlogn)
Doing that sort n^2 times gets you to O(n^3 logn)
, regardless of whether/when you do the join
.