Matching a regex against a std::string_view works fine, but the matched substrings I return from the function turn into garbage. The std::string_view argument is destroyed at the end of the function's scope, but the memory it points to is still valid.
I expected std::match_results to point into the original character array without making any copies, but the behavior I observe shows that I am wrong.
Is it possible to make this function work without additional allocations for the substrings?
#include <tuple>
#include <regex>
#include <string_view>
#include <iostream>
using configuration_str = std::string_view;
using platform_str = std::string_view;
std::tuple<configuration_str, platform_str> parse_condition_str(std::string_view conditionValue)
{
// TODO: fix regex
constexpr const auto &regexStr =
R"((?:\'\$\(Configuration\)\s*\|\s*\$\(Platform\)\s*\'==\'\s*)(.+)\|(.+)')";
static std::regex regex{ regexStr };
std::match_results<typename decltype(conditionValue)::const_iterator> matchResults{};
bool matched =
std::regex_match(conditionValue.cbegin(), conditionValue.cend(), matchResults, regex);
(void)matched;
std::string_view config = matchResults[1].str();
std::string_view platform = matchResults[2].str();
return { config, platform };
}
int main()
{
const auto &stringLiteralThatIsALIVE = "'$(Configuration)|$(Platform)'=='Release|x64'";
const auto&[config, platform] = parse_condition_str(stringLiteralThatIsALIVE);
std::cout << "config: " << config << "\nplatform: " << platform << std::endl;
return 0;
}
https://godbolt.org/z/TeYMnn56z
Clang-Tidy shows a warning: Object backing the pointer will be destroyed at the end of the full expression
std::string_view platform = matchResults[2].str();
CodePudding user response:
For example, let's look at the following line:
std::string_view config = matchResults[1].str();
Here, matchResults is of type std::match_results, and [1] is its std::match_results::operator[], which returns an std::sub_match.
But then, .str() is its std::sub_match::str(), which returns an std::basic_string.
This returned temporary string object is destroyed at the end of the full-expression (thanks, @BenVoigt, for the correction), i.e., in this case, immediately after config is initialized and the line in question finishes executing. So the Clang warning you quote is correct.
By the time parse_condition_str() returns, both the config and platform string views will thus point into already destroyed strings.
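To make the distinction concrete, here is a minimal sketch (not the asker's code) showing that a sub_match itself only holds a pair of iterators into the searched range, while str() builds a separate std::string by value. The helper name submatch_points_into_input and the simplified pattern are assumptions made up for this illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <string_view>
#include <type_traits>

// Hypothetical helper, for illustration only: returns true when the
// sub-match iterators refer to the caller's buffer rather than to a copy.
bool submatch_points_into_input(std::string_view input)
{
    // Simplified pattern (an assumption): two words separated by '|'.
    static const std::regex re{ R"((\w+)\|(\w+))" };

    std::match_results<std::string_view::const_iterator> m;
    if (!std::regex_match(input.cbegin(), input.cend(), m, re))
        return false;
    assert(m.size() == 3); // whole match plus two capture groups

    // str() returns a std::string by value, so binding a string_view to
    // its result leaves the view dangling once the temporary is destroyed.
    static_assert(std::is_same_v<decltype(m[1].str()), std::string>,
                  "sub_match::str() returns an owning string");

    // first/second, in contrast, are iterators into the original range.
    return m[1].first == input.cbegin() && m[2].second == input.cend();
}
```

So the match results do refer to the original characters; the copy only appears at the point where .str() is called.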
CodePudding user response:
Manually specifying a pointer with an offset and a length yields the desired results:
std::string_view config{conditionValue.data() + matchResults.position(1), matchResults.length(1)};
std::string_view platform{conditionValue.data() + matchResults.position(2), matchResults.length(2)};
https://godbolt.org/z/cGjs39Ehq
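For completeness, the same idea can be wrapped into a whole function. This is only a sketch under a few assumptions: the names view_of and parse_condition_views are invented here, the quotes in the pattern are written without backslashes (which should match the same text), and a failed match returns two empty views instead of being ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string_view>
#include <tuple>

using sv_match = std::match_results<std::string_view::const_iterator>;

// Hypothetical helper: a view over sub-match i inside source, built from
// the offset and length reported by match_results, with no allocation.
inline std::string_view view_of(std::string_view source, const sv_match& m, std::size_t i)
{
    assert(i < m.size());
    if (!m[i].matched)
        return {};
    return source.substr(static_cast<std::size_t>(m.position(i)),
                         static_cast<std::size_t>(m.length(i)));
}

// Hypothetical variant of parse_condition_str: the returned views alias the
// buffer behind conditionValue, so that buffer must outlive them.
std::tuple<std::string_view, std::string_view>
parse_condition_views(std::string_view conditionValue)
{
    static const std::regex regex{
        R"('\$\(Configuration\)\s*\|\s*\$\(Platform\)\s*'=='\s*(.+)\|(.+)')" };

    sv_match matchResults;
    if (!std::regex_match(conditionValue.cbegin(), conditionValue.cend(), matchResults, regex))
        return { std::string_view{}, std::string_view{} };

    return { view_of(conditionValue, matchResults, 1),
             view_of(conditionValue, matchResults, 2) };
}
```

With C++20, the iterator-pair constructor of std::string_view could be used on the sub-match's first and second members instead of the offset arithmetic; the substr form above also works in C++17.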
However, the question still stands as to why the .str() method on a sub-match returns a temporary, which results in garbage.