I am converting a list to set and back to list. I know that set(list)
takes O(n) time, but I am converting it back to list in same line list(set(list))
. Since both these operations take O(n) time, would the time complexity be O(n^2) now?
Logic 1:
final = list(set(list1)-set(list2))
Logic 2:
s = set(list1)-set(list2)
final = list(s)
Do both these implementations have different time complexities, and if they do which of them is more efficient?
CodePudding user response:
One set conversion is unnecessary, as set.difference
works with any iterable as it should:
final = list(set(list1).difference(list2))
But the asymptotic time and space complexity of the whole thing is still O(m n)
(the sizes of the two lists).
CodePudding user response:
In both versions you are doing:
- convert a list to a set
- convert another list to a set
- subtract the two sets
- convert the resultant set to a list
Each of those is O(n), where n is a bound on the size of both your starting lists.
That's O(n) four times, which makes it overall O(n).
Your two versions are essentially identical.
CodePudding user response:
Your two versions are identical, as from dis
:
>>> dis.dis("list(set(list1) - set(list2))")
1 0 LOAD_NAME 0 (list)
2 LOAD_NAME 1 (set)
4 LOAD_NAME 2 (list1)
6 CALL_FUNCTION 1
8 LOAD_NAME 1 (set)
10 LOAD_NAME 3 (list2)
12 CALL_FUNCTION 1
14 BINARY_SUBTRACT
16 CALL_FUNCTION 1
18 RETURN_VALUE
>>> dis.dis("s = set(list1) - set(list2);list(s)")
1 0 LOAD_NAME 0 (set)
2 LOAD_NAME 1 (list1)
4 CALL_FUNCTION 1
6 LOAD_NAME 0 (set)
8 LOAD_NAME 2 (list2)
10 CALL_FUNCTION 1
12 BINARY_SUBTRACT
14 STORE_NAME 3 (s)
16 LOAD_NAME 4 (list)
18 LOAD_NAME 3 (s)
20 CALL_FUNCTION 1
22 POP_TOP
24 LOAD_CONST 0 (None)
26 RETURN_VALUE
>>>
The only difference between these two is that the second one stores it to a variable s
, so it has to LOAD_NAME
the name s
. And in the first code the list
is called first, but in the second one get's called later.
But in @user2390182's answer, instead of a second loaded set
LOAD_NAME
, it loads the function name difference
instead, which IMO is the most efficient one here:
>>> dis.dis("list(set(list1).difference(list2))")
1 0 LOAD_NAME 0 (list)
2 LOAD_NAME 1 (set)
4 LOAD_NAME 2 (list1)
6 CALL_FUNCTION 1
8 LOAD_METHOD 3 (difference)
10 LOAD_NAME 4 (list2)
12 CALL_METHOD 1
14 CALL_FUNCTION 1
16 RETURN_VALUE
>>>