I solved the following leet code problem https://leetcode.com/problems/find-kth-bit-in-nth-binary-string/ by using strings (class Solution
) and integers (class Solution2
). I thought the integer solution should be a lot faster and using much less memory. But actually it takes a lot longer. With n=18 and k=200 it took 10.1 seconds compared to 0.13 seconds of the string solution.
import time
class Solution:
def findKthBit(self, n: int, k: int) -> str:
i = 0
Sn = "0"
Sn = self.findR(Sn, i, n)
return Sn[k-1]
def findR(self, Sn, i, n):
if i == n:
return Sn
newSn = self.calcNewSn(Sn, i)
return self.findR(newSn, i 1, n)
def calcNewSn(self, Sn, i):
inverted = ""
for c in Sn:
inverted = "1" if c == "0" else "0"
newSn = Sn "1" (inverted)[::-1]
return newSn
class Solution2:
def findKthBitBin(self, n: int, k: int) -> str:
i = 0
# MSB (S1) has to be 1 for bit operations but is actually always 0
Sn = 1
Sn = self.findRBin(Sn, i, n)
lenSn = (2**(n 1)) - 1
return "0" if k == 1 else str((Sn >> (lenSn - k)) & 1)
def findRBin(self, Sn, i, n):
if i == n:
return Sn
newSn = self.calcNewSnBin(Sn, i)
return self.findRBin(newSn, i 1, n)
def calcNewSnBin(self, Sn, i):
lenSn = 2**(i 1) - 1
newSn = (Sn << 1) | 1
inverted = (~Sn | (1 << (lenSn - 1))) & self.getMask(lenSn)
newSn = self.reverseBits(newSn, inverted, lenSn)
return newSn
def getMask(self, lenSn):
mask = 0
for i in range(lenSn):
mask |= (1 << i)
return mask
def reverseBits(self, newSn, inverted, lenSn):
for i in range(lenSn):
newSn <<= 1
newSn |= inverted & 1
inverted >>= 1
return newSn
sol = Solution()
sol2 = Solution2()
start = time.time()
print(sol.findKthBit(18, 200))
end = time.time()
print(f"time using strings: {end - start}")
start = time.time()
print(sol2.findKthBitBin(18, 200))
end = time.time()
print(f"time using large integers: {end - start}")
Why is the solution using integers so slow? Are the implemetations of big ints faster in other languages compared to strings?
CodePudding user response:
The problem isn't how fast or slow the integer implementation is. The problem is that you've written an algorithm that wants a random-access mutable sequence data type, like a list, and integers are not a random-access mutable sequence.
For example, look at your reverseBits
implementation:
def reverseBits(self, newSn, inverted, lenSn):
for i in range(lenSn):
newSn <<= 1
newSn |= inverted & 1
inverted >>= 1
return newSn
With a list, you could just call list.reverse()
. Instead, your code shifts newSn
and inverted
over and over, copying almost all the data involved repeatedly on every iteration.
Strings aren't a random-access mutable sequence data type either, but they're closer, and the standard implementation of Python has a weird optimization that actually does try to mutate strings in loops like the one you've written:
def calcNewSn(self, Sn, i):
inverted = ""
for c in Sn:
inverted = "1" if c == "0" else "0"
newSn = Sn "1" (inverted)[::-1]
return newSn
Instead of building a new string and copying all the data for the =
, Python tries to resize the string in place. When that works, it avoids most of the unnecessary copying you're doing with the integer-based implementation.
The right way to write your algorithm would be to use lists, instead of strings or integers. However, there's an even better way, with a different algorithm that doesn't need to build a bit sequence at all. You don't need the whole bit sequence. You just need to compute a single bit. Figure out how to compute just that bit.
CodePudding user response:
The bulk of the time is spent in reverseBits()
. Overall, you're creating hundreds of thousands of new integer objects in that, up to 262143 bits long. It's essentially quadratic time. The corresponding operation for the string form is just (inverted)[::-1]
, which runs entirely at "C speed", and is worst-case linear time.