Home > database >  What is the Time Complexity (Big-O) of a .replace() within a while loop?
What is the Time Complexity (Big-O) of a .replace() within a while loop?

Time:08-31

I was debating an optimal approach with my peer to an algorithms question about how to count the unpaired parenthesis in a string. Here's an excerpt for the problem description:

A string of brackets is considered correctly matched if every opening bracket in the string can be paired up with a later closing bracket, and vice versa. For instance, “(())()” is correctly matched, whereas “)(“ and “((” aren’t. For instance, “((” could become correctly matched by adding two closing brackets at the end, so you’d return 2.

input:  text = “(()”
output: 1

input:  text = “(())”
output: 0

input:  text = “())(”
output: 2

At first I coded up a stack based O(N) Time and O(N) Space Complexity solution, but then he hinted at a constant space solution.

Taking the hint, I had proposed an indisputably O(1) Space solution, however, he claimed it was also N^2 Time Complexity due to the .replace() method being linear and the while loop being linear as well, adding up to exponential time complexity. Does that make sense? Here is the simple algorithm I came up with:

def bracket_match(s):
    while s and '()' in s:
        s = s.replace('()', '')
    return len(s)

CodePudding user response:

I agree with him that this algorithm is not O(n). But this looks quadratic time, not exponential. Although the input is getting shorter, adding up a decreasing sequence can be quadratic. For example, n (n-1) (n-2) ... 1 = n*(n 1)/2. Notice that in addition to the replace method, in operation also is linear time.

You can simply try that out. The performance depends on the input critically, but at least for the ((((())))) type of input, the time complexity looks quadratic.

from datetime import datetime
import pandas as pd
import matplotlib.pyplot as plt
from tqdm import tqdm

def bracket_match(s):
    while s and '()' in s:
        s = s.replace('()', '')
    return len(s)

o = []
for n in tqdm(range(10**3, 3*10**4, 10**3)):
  s = "("*n   ")"*n
  t1 = datetime.now()
  bracket_match(s)
  t2 = datetime.now()
  o.append({"n":n, "time":(t2-t1).total_seconds()})

o = pd.DataFrame(o)
display(o)
plt.scatter(o.n, o.time)
plt.plot(o.n, o.time)

enter image description here

  • Related