Performance variance among string-compare methods-CodePudding

As a student, I have a course about using new features in Python3.8 and it also includes Python3.10. So I'm now trying to use match-case in my homework to kill a long series of if-else which should be replace by switch-case in cpp.

In my homework I need a function to convert string "male/female/..." into integer. It's quite easy but I was suddenly interested in the performace of different string-compare methods so I designed the testing code below to test the performance among three different methods.

import timeit
from typing import Callable


def test_match_case(sex: str):
    match sex:
        case 'male':
            gender = 1
        case 'female':
            gender = 2
        case '________female':
            gender = 3
        case '#________________female':
            gender = 4
        case _:
            gender = 0

    return gender


def test_if_full_compare(sex: str):
    if sex == 'male':
        gender = 1
    elif sex == 'female':
        gender = 2
    elif sex == '________female':
        gender = 3
    elif sex == '#________________female':
        gender = 4
    else:
        gender = 0

    return gender


def test_if_startswith(sex: str):
    if sex.startswith('m'):
        gender = 1
    elif sex.startswith('f'):
        gender = 2
    elif sex.startswith('_'):
        gender = 3
    elif sex.startswith('#'):
        gender = 4
    else:
        gender = 0

    return gender


def _test_and_print(func: Callable, arg: str, times: int = 10000000):
    func_name = func.__name__
    time_cost = timeit.timeit(f"""{func_name}("{arg}")""", f"""from __main__ import {func_name}""", number=times)
    print(f"func: {func_name} arg: {arg} time_cost: {time_cost:.6f}")


_test_and_print(test_match_case, "male")
_test_and_print(test_match_case, "female")
_test_and_print(test_match_case, "________female")
_test_and_print(test_match_case, "#________________female")
_test_and_print(test_if_full_compare, "male")
_test_and_print(test_if_full_compare, "female")
_test_and_print(test_if_full_compare, "________female")
_test_and_print(test_if_full_compare, "#________________female")
_test_and_print(test_if_startswith, "male")
_test_and_print(test_if_startswith, "female")
_test_and_print(test_if_startswith, "________female")
_test_and_print(test_if_startswith, "#________________female")

Here is the testing result on my PC:

func: test_match_case arg: male time_cost: 0.517885
func: test_match_case arg: female time_cost: 0.610869
func: test_match_case arg: ________female time_cost: 0.726382
func: test_match_case arg: #________________female time_cost: 0.846604
func: test_if_full_compare arg: male time_cost: 0.487761
func: test_if_full_compare arg: female time_cost: 0.578670
func: test_if_full_compare arg: ________female time_cost: 0.666413
func: test_if_full_compare arg: #________________female time_cost: 0.827917
func: test_if_startswith arg: male time_cost: 0.917373
func: test_if_startswith arg: female time_cost: 1.362152
func: test_if_startswith arg: ________female time_cost: 1.817514
func: test_if_startswith arg: #________________female time_cost: 2.249916

I have learned a little about "Resident Memory" in Python, but I still have tons of questions about the result above.

match-case is slower than if-else in every case, why?
The time-cost gap between L4 and L8 is much smaller than the other three cases (L1-L5, L2-L6, L3-L7), why? Is this effected by "Resident Memory"?
startswith has such terrible speed with long string. In my gut this build-in function only needs to compare the first character so the time-cost of the last 4 cases should be nearly the same and L12 should be much faster than L8 since it don't need to traverse the entire string. But the truth is the exact opposite of my estimation. How to explain this?
If you find out any other question, please tell me.

I've already searched with words "[python] string compare" but didn't get any expected result.

CodePudding user response：

There's three reasons for your performance issues, the first two essentially the language/interpreter's "fault", the other one a consequence of how you constructed your test cases:

Python is not a fully compiled, statically typed language, and this means match can't typically optimize much (if I recall correctly, it might even be required to consider cases in order, making it even harder to optimize), and in practice, never performs meaningful optimizations for "jumping" to the right case. The match construct's flexibility actually adds some overhead over if/elif/else chains here, but it's otherwise equivalent to the if/elif/else chain, making it slower.
Generalized code paths without dedicated bytecode support (e.g. arbitrary method calls) typically have higher fixed overhead than bytecode supported operations (e.g. direct == checks), so if the work done is the same, the bytecode operation wins. They've been improving this over time with LOAD_METHOD/CALL_METHOD opcodes that avoid creating bound method objects, and the Vectorcall protocol to reduce argument handling overhead, but a dedicated bytecode still wins in most cases, so if == and startswith do the same real work, == will win.
Even theoretically, startswith doesn't save anything in this particular use case; every one of your fixed strings is of a different length, so the actual equality test starts off with "do the lengths match?", finds they don't, and immediately says "not equal"; it never even reads a single byte of the string data when they're not equal. And when they are equal, odds are since three of the four strings involved are literals containing legal variable names (which CPython interns as an implementation detail), isn't not even comparing the string's data in the case where they're equal; it checks the "is interned" bit, finds its set in both, and knowing that, can compare the pointer addresses (it might even do this before the size check); two interned strings are equal if and only if they are the same string, so again, no data comparisons required.