We have a formatting library similar to fmtlib which allows us to format strings like this:
int foo = 1;
double bar = 2.3;
FMT("foo={} bar={}", foo, bar);
We recently added parsing of the format string at compile time, allowing us to specify, eg width, precision etc directly in the format string
// print foo with width=5, pad with 0; print bar with width 6, precision 2
// as decimal floating point
FMT("foo={:05} bar={:6.2f}", foo, bar);
The way we do this is as follows:
We have a struct which I can use to pass the format string as a template parameter
struct FmtString
{
template<std::size_t N>
constexpr FmtString(const char (&s)[N])
: str(s)
{}
const char* const str;
};
Together with this FmtString
and the arguments to be formatted, we use the following somewhat convoluted dance
to generate a std::tuple
of FmtSpecT
format specifications.
We use the variadic template parameter pack to generate a std::index_sequence
:
// use Ts... to create a tuple of the argument types and an std::index_sequence
template<FmtString S, typename... Ts>
struct CompiledFmtSpec
{
using type = decltype(compileFmtSpec<S, std::tuple<Ts...>>(std::index_sequence_for<Ts...> {}));
};
// pull the indices out of the std::index_sequence
template<FmtString S, typename Tuple, std::size_t... Is>
consteval auto compileFmtSpec(std::index_sequence<Is...>)
{
return typename FmtSpecs<S, Tuple, Is...>::type {};
}
We use the std::index_sequence
to generate a variadic std::size_t
sequence which we can use to
extract each argument from the tuple of arguments, and its respective sequence value
template<FmtString S, typename Tuple, std::size_t... Is>
struct FmtSpecs
{
using type = std::tuple<FmtSpecWrapper<S, std::tuple_element_t<Is, Tuple>, Is>...>;
};
This results in the following tuple:
std::tuple<
FmtSpecWrapper<S, T0, 0>,
FmtSpecWrapper<S, T1, 1>,
FmtSpecWrapper<S, T2, 2>,
...
FmtSpecWrapper<S, Tn, n>>
We can then specialise FmtSpecWrapper
for the particular type and parse its supported format spec
Here is the default implementation
template<FmtString S, typename T, std::size_t I>
struct FmtSpecWrapper
{
using type = decltype(parseFmtString<S, I>());
};
template<FmtString S, std::size_t I>
consteval auto parseFmtString()
{
constexpr const char* pos = findFmtSpec(S, I);
constexpr FmtSpec fmt = parseFmtSpec(pos);
return FmtSpecT<fmt.pad, fmt.align_right, fmt.sign, fmt.width, fmt.precision, fmt.type, fmt.spec_len> {};
}
This results in a FmtSpecT
specialisation, which has the following NTTPs:
template<char Pad, bool AlignRight, bool Sign, int Width, int Precision, char Type, int SpecLen>
struct FmtSpecT
{
static constexpr char pad = Pad;
static constexpr bool align_right = AlignRight;
static constexpr bool sign = Sign;
static constexpr int width = Width;
static constexpr int precision = Precision;
static constexpr char type = Type;
static constexpr int spec_len = SpecLen;
};
During formatting we can then pull out each FmtSpecT
and use the members to format the argument.
The issue I'd like to try and solve is findFmtSpec(S, I)
.
Its job is to find the I'th
format spec from the format string S
.
constexpr const char* findFmtSpec(const FmtString& S, std::size_t i)
{
std::size_t curr = 0;
const char* next = S.str;
while (*next)
{
const char c = *next ;
if (c == '{')
{
if (c == *next) // found an escaped brace; {{ or }}
{
next;
}
else // found the start of the fmt spec
{
if (curr == i)
return next;
curr;
next;
}
}
}
throw std::domain_error("too many arguments for format string");
}
My issue is that it is Big O N^2, as it starts from 0 for every argument.
Ideally I would be able to start from the end of the previous format spec and seek forwards to find the next, turning the complexity into O(N).
Any ideas on how I can modify findFmtSpec
so that for each i
it starts from the return value of the previouc call?
CodePudding user response:
FmtString S
is a template parameter, so it is usable in constant expressions.
Therefore all you need to do is write a constexpr
function that extracts the specifiers up-front into a suitable container type, for example std::array<std::string_view, N>
. You can either enforce directly in the function that the number of specifiers matches sizeof...(Ts)
, so that N
can just be that, or you can write another constexpr
function which first counts the number of arguments:
struct FmtString
{
template<std::size_t N>
constexpr FmtString(const char (&s)[N])
: str(s)
{}
const char* const str;
template<FmtString S>
static constexpr auto findFmtSpec = []{
constexpr auto N = []{
// iterate `S.str` here and return number of specifiers
}();
std::array<const char*, N> result;
// iterate `S.str` again and store beginning of format specifiers consecutively in `result`.
return result;
}();
};
Then parseFmtString
can just use FmtString::findFmtSpec<S>[I]
. I am using a static member variable template instead of a static function template to make sure that the compiler doesn't keep reevaluating an equivalent function call, although I think the compiler should memoize calls with the same arguments. I used nested lambdas for brevity, but you might want to put these into separate functions. There is also no particular reason to have the variable template be a member. It works outside the class as well.
You can also just generate the whole tuple this way:
template<FmtString S, typename... Ts>
static constexpr auto specs = []{
// number of specifiers, can be `sizeof...(Ts)`
// if no specific error handling is required
constexpr auto N = []{
// iterate `S.str` here and return number of specifiers
}();
if constexpr(sizeof...(Ts) != N) {
// handle argument number mismatch
}
// specs as values
constexpr auto specs = []{
std::array<FmtSpec, N> specs;
// iterate `S.str` again and store `FmtSpec` for each specifier consecutively in `specs`
return specs;
}();
// lift specs into type (template arguments)
return []<std::size_t... Is>(std::index_sequence<Is...>){
return std::tuple<FmtSpecWrapper<S, Ts, FmtSpecT<specs[Is].pad, /*...*/>>...>{};
}(std::make_index_sequence<N>{});
}();