I have an AoC
problem where I have been given the data below:
data = """2-4,6-8
2-3,4-5
5-7,7-9
2-8,3-7
6-6,4-6
2-6,4-8"""
I need to find the number of pairs which fully contain another pair. For example, 2-8
fully contains 3-7
, and 6-6
is fully contained by 4-6
.
I have solved it using the below code:
def aoc_part1(self, data):
counter = 0
for lines_data in data.splitlines():
lines_data = lines_data.strip()
first_range, second_range = self.__get_first_second_list_of_elements(lines_data)
check_first_side_if_returns_true = all(item in first_range for item in second_range)
check_second_side_if_returns_true = all(item in second_range for item in first_range)
if check_first_side_if_returns_true or check_second_side_if_returns_true:
counter = 1
return counter
def __get_first_second_list_of_elements(self, data):
first_elf, second_elf = data.split(",")[0], data.split(",")[1]
first_range_start, first_range_end = map(int, first_elf.split("-"))
second_range_start, second_range_end = map(int, second_elf.split("-"))
first_range = list(range(first_range_start, first_range_end 1))
second_range = list(range(second_range_start, second_range_end 1))
return first_range, second_range
I was just wondering about the time complexity here. I think it should be a brute force
here because for every iteration all
will run another loop. How can I optimize this solution in order to get linear time complexity?
first_range
and second_range
are of int
types. check_first_side_if_returns_true
and check_second_side_if_returns_true
are the boolean
variables that check if the list is entirely contained or not. Based on that, it returns True
or False
.
CodePudding user response:
Your solution looks pretty complicated. Why not do something like:
data = """2-4,6-8
2-3,4-5
5-7,7-9
2-8,3-7
6-6,4-6
2-6,4-8
"""
def included(line):
(a1, b1), (a2, b2) = (map(int, pair.split("-")) for pair in line.strip().split(","))
return (a1 <= a2 and b2 <= b1) or (a2 <= a1 and b1 <= b2)
print(sum(included(line) for line in data.splitlines()))
I did some timing with my AoC-input for day 4 (1,000 lines):
from timeit import timeit
# Extract the interval boundaries for the pairs
boundaries = [
[tuple(map(int, pair.split("-"))) for pair in line.strip().split(",")]
for line in data.splitlines()
]
# Version 1 with simple comparison of boundaries
def test1(boundaries):
def included(pairs):
(a1, b1), (a2, b2) = pairs
return (a1 <= a2 and b2 <= b1) or (a2 <= a1 and b1 <= b2)
return sum(included(pairs) for pairs in boundaries)
# Version 2 with range-subset test
def test2(boundaries):
def included(pairs):
(a1, b1), (a2, b2) = pairs
numbers1, numbers2 = set(range(a1, b1 1)), set(range(a2, b2 1))
return numbers1 <= numbers2 or numbers2 <= numbers1
return sum(included(pairs) for pairs in boundaries)
# Test for identical result
print(test1(boundaries) == test2(boundaries))
# Timing
for i in 1, 2:
t = timeit(f"test{i}(boundaries)", globals=globals(), number=1_000)
print(f"Duration version {i}: {t:.1f} seconds")
Result here, on a mediocre machine (repl.it):
Duration version 1: 0.4 seconds
Duration version 2: 5.4 seconds
CodePudding user response:
You're prob. making it overcomplicated in the current approach. If you split the pairs to two sets - eg. a, and b then you could easily do a set ops. to check if there is overlapping. That should be faster than yours.
Something like this one-line:
# some input reading, and split to a, b sets.
# count = 0
if set(range(a, b 1)) & set(range(x, y 1)):
count = 1 # that's part1 answer.
There are questions about the memory efficiency about this approach earlier, I've run some profiling and this is the result to share - it's confirmed there should be NO problem given this puzzle's input size.
Filename: day04.py
Line # Mem usage Increment Occurrences Line Contents
=============================================================
27 43.758 MiB 43.758 MiB 1 @profile
28 def part2(file):
29 43.762 MiB 0.004 MiB 1 ans = 0
30
31 43.770 MiB 0.000 MiB 1001 for line in open(file):
32 43.770 MiB 0.004 MiB 1000 a, b, x, y = map(int, line.replace(",", "-").split("-"))
33 43.770 MiB 0.000 MiB 1000 if set(range(a, b 1)) & set(range(x, y 1)):
34 43.770 MiB 0.004 MiB 847 ans = 1
35
36 43.770 MiB 0.000 MiB 1 return ans