Home > Software design >  Can we reduce the time complexity here?
Can we reduce the time complexity here?

Time:12-05

I have an AoC problem where I have been given the data below:

data = """2-4,6-8
    2-3,4-5
    5-7,7-9
    2-8,3-7
    6-6,4-6
    2-6,4-8"""

I need to find the number of pairs which fully contain another pair. For example, 2-8 fully contains 3-7, and 6-6 is fully contained by 4-6.

I have solved it using the below code:

 def aoc_part1(self, data):
        counter = 0
        for lines_data in data.splitlines():
            lines_data = lines_data.strip()
            first_range, second_range = self.__get_first_second_list_of_elements(lines_data)
            check_first_side_if_returns_true = all(item in first_range for item in second_range)
            check_second_side_if_returns_true = all(item in second_range for item in first_range)
            if check_first_side_if_returns_true or check_second_side_if_returns_true:
                counter  = 1
        return counter
def __get_first_second_list_of_elements(self, data):
        first_elf, second_elf = data.split(",")[0], data.split(",")[1]
        first_range_start, first_range_end = map(int, first_elf.split("-"))
        second_range_start, second_range_end = map(int, second_elf.split("-"))
        first_range = list(range(first_range_start, first_range_end   1))
        second_range = list(range(second_range_start, second_range_end   1))
        return first_range, second_range

I was just wondering about the time complexity here. I think it should be a brute force here because for every iteration all will run another loop. How can I optimize this solution in order to get linear time complexity?

first_range and second_range are of int types. check_first_side_if_returns_true and check_second_side_if_returns_true are the boolean variables that check if the list is entirely contained or not. Based on that, it returns True or False.

CodePudding user response:

Your solution looks pretty complicated. Why not do something like:

data = """2-4,6-8
2-3,4-5
5-7,7-9
2-8,3-7
6-6,4-6
2-6,4-8
"""

def included(line):
    (a1, b1), (a2, b2) = (map(int, pair.split("-")) for pair in line.strip().split(","))
    return (a1 <= a2 and b2 <= b1) or (a2 <= a1 and b1 <= b2)

print(sum(included(line) for line in data.splitlines()))

I did some timing with my AoC-input for day 4 (1,000 lines):

from timeit import timeit

# Extract the interval boundaries for the pairs
boundaries = [
    [tuple(map(int, pair.split("-"))) for pair in line.strip().split(",")]
    for line in data.splitlines()
]

# Version 1 with simple comparison of boundaries
def test1(boundaries):
    def included(pairs):
        (a1, b1), (a2, b2) = pairs
        return (a1 <= a2 and b2 <= b1) or (a2 <= a1 and b1 <= b2)
    
    return sum(included(pairs) for pairs in boundaries)

# Version 2 with range-subset test
def test2(boundaries):
    def included(pairs):
        (a1, b1), (a2, b2) = pairs
        numbers1, numbers2 = set(range(a1, b1   1)), set(range(a2, b2   1))
        return numbers1 <= numbers2 or numbers2 <= numbers1
        
    return sum(included(pairs) for pairs in boundaries)

# Test for identical result
print(test1(boundaries) == test2(boundaries))

# Timing
for i in 1, 2:
    t = timeit(f"test{i}(boundaries)", globals=globals(), number=1_000)
    print(f"Duration version {i}: {t:.1f} seconds")

Result here, on a mediocre machine (repl.it):

Duration version 1: 0.4 seconds
Duration version 2: 5.4 seconds

CodePudding user response:

You're prob. making it overcomplicated in the current approach. If you split the pairs to two sets - eg. a, and b then you could easily do a set ops. to check if there is overlapping. That should be faster than yours.

Something like this one-line:

   # some input reading, and split to a, b sets.
   # count = 0

   if set(range(a, b   1)) & set(range(x, y   1)):
      count  = 1                # that's part1 answer.

There are questions about the memory efficiency about this approach earlier, I've run some profiling and this is the result to share - it's confirmed there should be NO problem given this puzzle's input size.

Filename: day04.py

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
    27   43.758 MiB   43.758 MiB           1   @profile
    28                                         def part2(file):
    29   43.762 MiB    0.004 MiB           1       ans = 0
    30
    31   43.770 MiB    0.000 MiB        1001       for line in open(file):
    32   43.770 MiB    0.004 MiB        1000           a, b, x, y = map(int, line.replace(",", "-").split("-"))
    33   43.770 MiB    0.000 MiB        1000           if set(range(a, b   1)) & set(range(x, y   1)):
    34   43.770 MiB    0.004 MiB         847               ans  = 1
    35
    36   43.770 MiB    0.000 MiB           1       return ans
  • Related