Home > Software design >  Duplicates When Printing Powersets
Duplicates When Printing Powersets

Time:10-01

from typing import *
def print_powerset(nums: List[int], n: int):
    def print_subsets(i: int, path: List[int]) -> None:
        for child_index in range(i, n):
            curr_path = path   [nums[child_index]]
            print(curr_path)

            for l in range(child_index   1, n):
                print_subsets(l, curr_path)

    for k in range(n): 
        print([nums[k]])
    for j in range(1, n): 
        print_subsets(j, [nums[0]])

if __name__ == "__main__":
    nums = [0,1,2,3]
    n = len(nums)
    print_powerset(nums, n)

Running the code above will generate the following:

"""
[0]
[1]
[2]
[3]
[0, 1]
[0, 1, 2]
[0, 1, 2, 3]
[0, 1, 3]
[0, 1, 3] <- extra
[0, 2]
[0, 2, 3]
[0, 3]
[0, 2] <- extra
[0, 2, 3] <- extra
[0, 3] <- extra
[0, 3] <- extra
"""

I'm can't figure out what's causing the duplicates to be printed out. I know there are other ways of printing out the powerset but for my current purpose I'm only interested in the cause of my issue above.

CodePudding user response:

It's hard to pin down a single issue that's causing duplicates: print_subsets(j, [nums[0]]) seems like it should be print_subsets(j, [nums[j-1]]). Printing curr_path inside the loop also seems off; is there an existing algorithm for powerset that you were trying to implement? Your output is also missing several subsets in addition to having duplicates.

Deleting several lines and moving the print outside the loop gives a working algorithm, perhaps this is what you meant?

def print_powerset(nums: List[int], n: int):
    def print_subsets(i: int, path: List[int]) -> None:
        print(path)
        for child_index in range(i, n):
            print_subsets(child_index 1, path   [nums[child_index]])

    print([])
    for j in range(1, n 1):
        print_subsets(j, [nums[j-1]])

CodePudding user response:

This addresses your issue by re-approaching the problem using recursion. The reason you might want to do this is because your solution isn't just wrong because it has duplicates, but it also lacks the empty set and the following elements of a power set:

[1, 2]
[1, 3]
[1, 2, 3]
[2, 3]

For a powerset, you should expect 2^n elements, where n is the length of the input. So here we should expect an output of length 16 - notably the same as what you got.

Anyway, the recursive way to do this is to recognize that each n 1 powerset contains all the elements of n plus one element for each n with the 1 element added. Thus, the power set for [0] will be [[], [0]] and [0, 1] will be [[], [0], [1], [0, 1]. Note the last two elements are the first two with 1 added to each.

Turning this into code is relatively simple:

def powerset(ls: List[int]) -> List[List[int]]:  # Typing helps us thing about how this works!
    if len(ls) == 0:  # This is our base case.
        return [[]]
    else:
        head = ls[0]  # The first element
        tail = ls[1:]  # Remainder of the list
        tail_component = powerset(tail)  # Get the recursive half
        head_component = [l   [head] for l in tail_component]
        return tail_component   head_component  # Concatenate these lists and return the new powerset

Then we just need to invoke it and iterate over the result to print it:

if __name__ == "__main__":
    nums = [0,1,2,3]
    ps = powerset(nums)
    [print(i) for i in ps]

# This will print out:
[]
[3]
[2]
[3, 2]
[1]
[3, 1]
[2, 1]
[3, 2, 1]
[0]
[3, 0]
[2, 0]
[3, 2, 0]
[1, 0]
[3, 1, 0]
[2, 1, 0]
[3, 2, 1, 0]

Note that this is not efficient - in particular I'm recreating the lists a lot. This is memory intensive, and can be done faster using generators and other tricks. However, I wanted to give you a direct representation of the algorithm to demonstrate how much more clear this approach is than trying to do it via for loops.

To determine what is going wrong with your original program, lets unroll (part of) it:

# for k in range(4)
print([0,1,2,3][0]) -> [0]
print([0,1,2,3][1]) -> [1]
print([0,1,2,3][2]) -> [2]
print([0,1,2,3][3]) -> [3]
# for j in range(1, 4)
print_subsets(1, list([0,1,2,3][0])) -> print_subsets(1, [0])
  # for child_index in range(1, 4)
  curr_path = [0]   [0,1,2,3][1] -> [0, 1]
  print(curr_path) -> [0, 1]
  # for l in range(2, 4)
  print_subsets(2, [0, 1]) -> recurses
  ...

Now, the thing to note is that your curr_path is being constructed from [nums[0]]. This means that the path for each iteration will always begin with a 0. What you probably meant to do is iterate that in the for j in range... clause. But this is very confused, in part because it's not clear when your variables are indexes versus values, and because you're relying on scope to get access to the higher level list. These practices cause quick confusion and tangling of code.

CodePudding user response:

So the second nested loop shouldn't have been there. This is how I should've written it instead:

def print_powerset(nums: List[int], n: int):
    def print_subsets(i: int, path: List[int]) -> None:
        curr_path = path   [nums[i]]
        print(curr_path)

        for child_index in range(i   1, n):
            print_subsets(child_index, curr_path)

    print([])
    for k in range(n): 
        print([nums[k]])
    for start_index in range(n):
        for child_index in range(start_index   1, n): 
            print_subsets(child_index, [nums[start_index]])

if __name__ == "__main__":
    nums = [0,1,2,3]
    n = len(nums)
    print_powerset(nums, n)
  • Related