Home > Mobile >  C program behaviour different between optimizations
C program behaviour different between optimizations

Time:10-29

My question is in relation to this little code snippet:

typedef std::map<std::string, std::string> my_map_t;

std::string_view get_value_worse(const my_map_t& input_map, std::string_view value)
{
    auto retrieved = input_map.find(value.data());
    return retrieved != input_map.cend() ? retrieved->second : "";
}

std::string_view get_value_better(const my_map_t& input_map, std::string_view value)
{
    auto retrieved = input_map.find(value.data());
    if (retrieved != input_map.cend())
    {
        return retrieved->second;
    }
    return "";
}

int main()
{
    my_map_t my_map = {
        {"key_0", "value_0"},
        {"key_1", "value_1"},
    };

    std::cout << (get_value_worse(my_map, "key_0") == get_value_better(my_map, "key_0")) << std::endl;
}

Under the latest gcc with no optimisations this prints 0 for false, while under -O3 this prints 1 for true.

I believe the un-optimised behaviour is because the second and third comparison operator arguments are expressions, not statements - and so the retrieved->second in retrieved != arguments.cend() ? retrieved->second : "" gets evaluated as a string construction on the stack, and returning a string_view to that is bad.

I can also see that with -O3 the compiler would be able to inline all of this, remove branching, and be done with it... but I would have expected -O3 to act exactly "as if" I had compiled with -O0.

Can anyone explain why the compiler gets to elide the copy construction I believe is happening in the -O0 version?

CodePudding user response:

In the conditional expression, a temporary std::string object is constructed. Temporary object are usually constructed on the stack, although this is an implementation detail that is not important. The important thing is that the temporary object is destroyed at the end of the return statement, so the returned std::string_view is dangling. Attempting to access the data it points to (using the == operator, or otherwise) results in undefined behaviour.

When a program contains undefined behaviour, the compiler can do whatever it wants with it. In particular, the compiler is permitted to optimize by assuming that a condition that implies undefined behaviour will always be false. If it turns out that this assumption is wrong, then the compiler is off the hook (because it means undefined behaviour is occurring). What kind of assumption exactly is being made by your compiler is not clear. It also doesn't really matter, because you can't depend on the behaviour that you see now. You should just rewrite your program to remove the undefined behaviour.

  • Related