I'm a PHP programmer who will be moving over to Python soon.
In the PHP world there are two functions for dealing with string slices: mb_substr and substr. The multi-byte variant, mb_substr, is notoriously slow because PHP doesn't know how many bytes each character occupies, so it has to iterate over every character and check its length to find the byte position of a given offset.
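To make the concern concrete, here is a rough Python sketch of the kind of linear scan I mean. It is not PHP's actual implementation; the function name utf8_byte_offset is made up for illustration, and it assumes the input is valid UTF-8.

```python
def utf8_byte_offset(data: bytes, char_index: int) -> int:
    """Return the byte position at which character number char_index starts.

    Illustrative only: walks the buffer one character at a time, so the
    cost grows with char_index. Assumes data is valid UTF-8.
    """
    pos = 0
    for _ in range(char_index):
        if pos >= len(data):
            raise IndexError("character index out of range")
        lead = data[pos]
        # The lead byte of a UTF-8 sequence encodes the sequence length.
        if lead < 0x80:              # 0xxxxxxx: 1 byte (ASCII)
            pos += 1
        elif lead >> 5 == 0b110:     # 110xxxxx: 2 bytes
            pos += 2
        elif lead >> 4 == 0b1110:    # 1110xxxx: 3 bytes
            pos += 3
        elif lead >> 3 == 0b11110:   # 11110xxx: 4 bytes
            pos += 4
        else:
            raise ValueError("invalid UTF-8 lead byte")
    return pos


data = "aü字".encode("utf-8")
start = utf8_byte_offset(data, 1)   # where "ü" begins
end = utf8_byte_offset(data, 2)     # where "字" begins
print(data[start:end].decode("utf-8"))
```

Finding the offset of the n-th character this way takes time proportional to n, which is the cost I'm worried about for large offsets.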
I wrote the following benchmark to see if Python (3.8) has the same issue:
from random import randrange
from time import time
alphabet = ["a", "ü", "字", "