subset std::map into std::vector<std::map> of predefined length-CodePudding

im trying to subset a std::map<string,string> into a vector of maps, each with predefined length rest. Im trying to follow the solution found in this Can a std::map be efficiently split into two std::maps at an iterator? The problem here: I don't want to split the map into two parts, but into n parts each of equal size rest. That's how im trying to achieve this (not compiling because of a nullptr):

#include <iostream>
#include <string>
#include <vector>
#include <string>
#include <map>

int main(){
using namespace std::string_literals;

    auto code = std::map<std::string, std::string>{
        {"Red"s, "Red"s},
        {"Blue"s, "Blue"s},
        {"Green"s, "Green"s},
        {"Fuchsia"s, "Fuchsia"s},
        {"Mauve"s, "Mauve"s},
        { "Gamboge"s, "Gamboge"s },
        {"Vermillion"s, "Vermillion"s}
    };
    
    std::vector<std::map<std::string,std::string>> subsetCode;
    auto it = code.begin();
    auto bt = code.begin();
    for (size_t i = 0; i < code.size(); i  = 2)
        {
            auto last = std::min(code.size(), i   2);
            
            std::advance(it, last);
            std::advance(bt, last-2);
            subsetCode.push_back(std::map{
                std::make_move_iterator(bt),
                std::make_move_iterator(it)});
        }
    
    for (int i = 0; i < subsetCode.size(); i  ) {
        for (auto [key, value]: subsetCode[i]){
            std::cout << key << ":" << value << " ";
        }
        std::cout << " " << std::endl;
    }

    return 1;
}

I think i am stuck with moving the iterator for the lower bound. Thank you for your help!

CodePudding user response：

if you want to subset the map to n parts, the sub map's size equal map.size()/n, hear is a o(n) solution:

#include <iostream>
#include <string>
#include <vector>
#include <map>

int main() {
    auto code = std::map<std::string, std::string>{
        {"Red", "Red"},
        {"Blue", "Blue"},
        {"Green", "Green"},
        {"Fuchsia", "Fuchsia"},
        {"Mauve", "Mauve"},
        { "Gamboge", "Gamboge" },
        {"Vermillion", "Vermillion"}
    };
    
    // assume you want each part size is 2.
    constexpr int subSize = 2; 

    std::vector<std::map<std::string, std::string>> subsetCodes;    
    std::map<std::string, std::string> subset;
    for (auto& item : code) {
        subset.insert({item.first, std::move(item.second)});
        while (subset.size() == subSize) {
            subsetCodes.push_back(std::move(subset));
            subset = std::map<std::string, std::string>();
        }
    }
    if (!subset.empty())
      subsetCodes.push_back(std::move(subset));
    code.clear();
    
    for (int i = 0; i < subsetCodes.size(); i  ) {
        for (auto [key, value]: subsetCodes[i]){
            std::cout << key << ":" << value << " ";
        }
        std::cout << " " << std::endl;
    }
}

CodePudding user response：

IMHO “the rest” is poorly defined, because (e.g.) splitting 20 elements into 11 parts would yield 10 parts with 1 element each and a “rest” with 10 elements, which is (sort of) ugly if the splitting should be uniform. I would instead use 9 parts with 2 elements in each and then 2 “rests” with 1 element in each. But that’s just a (early) side note; you can tweak this all you like.

The important part is probably the iterator. Well, because you are removing elements from and inserting elements into the same type of map, your best bet is extract() and its counterpart, the overload of insert() that takes nodes. The beauty of this is that the type(s) you use in the map do not need to support move-semantics at all in this case; the entire node remains in existence and therefore nothing needs to be moved. No move_iterator is involved here either — that would only matter if we dereferenced the iterator, which we don’t do.

First let’s define the map splitting (node transfer) for any map-like type and for any vector-like type, simply because we can:

template<template<typename ... S> class R,
         template<typename ... A> class C,
         typename I,
         typename ... S, typename ... A>
void
transfer_segments(C<A...> &source, I &it,
                  size_t n_segs, const size_t seg_size,
                  R<C<A...>, S...> &sink) {
  for (; n_segs; --n_segs) {
    auto &seg{sink.emplace_back()};
    for (size_t i{0}; i < seg_size;   i)
      seg.insert(source.extract(it  ));
  }
}

template<template<typename ... S> class R,
         template<typename ... A> class C,
         typename ... S, typename ... A>
void
split_container(C<A...> &&source, const size_t n_segs,
                R<C<A...>, S...> &sink) {
  const size_t seg_leftover{source.size() % n_segs};
  const size_t seg_size{source.size() / n_segs};
  auto it{source.begin()};
  transfer_segments(source, it, seg_leftover, seg_size   1, sink);
  transfer_segments(source, it, n_segs - seg_leftover, seg_size, sink);
}

These templates could gain (much needed) extra type-safety using C 20 concepts, but this is omitted for the sake of brevity. Next we can test the solution on your data and types:

#include <iostream>
#include <map>
#include <string>
#include <vector>

namespace { /* ... magic templates from above go here ... */ }

int main() {
  const auto get_map{
    []() -> std::map<std::string, std::string> {
      using namespace std::string_literals;
      return {{"Red"s, "Red"s},
              {"Blue"s, "Blue"s},
              {"Green"s, "Green"s},
              {"Fuchsia"s, "Fuchsia"s},
              {"Mauve"s, "Mauve"s},
              {"Gamboge"s, "Gamboge"s},
              {"Vermillion"s, "Vermillion"s}};
    }};

  for (size_t n_segs{1}; n_segs <= 7;   n_segs) {
    std::cout << n_segs << ": {\n";
    std::vector<std::map<std::string, std::string>> segs;
    split_container(get_map(), n_segs, segs);
    for (const auto &seg : segs) {
      std::cout << "     (";
      for (const auto &[k, v] : seg)
        std::cout << " [" << k << ':' << v << ']';
      std::cout << " )\n";
    }
    std::cout << "   }\n";
  }
}

And the output is:

1: {
     ( [Blue:Blue] [Fuchsia:Fuchsia] [Gamboge:Gamboge] [Green:Green] [Mauve:Mauve] [Red:Red] [Vermillion:Vermillion] )
   }
2: {
     ( [Blue:Blue] [Fuchsia:Fuchsia] [Gamboge:Gamboge] [Green:Green] )
     ( [Mauve:Mauve] [Red:Red] [Vermillion:Vermillion] )
   }
3: {
     ( [Blue:Blue] [Fuchsia:Fuchsia] [Gamboge:Gamboge] )
     ( [Green:Green] [Mauve:Mauve] )
     ( [Red:Red] [Vermillion:Vermillion] )
   }
4: {
     ( [Blue:Blue] [Fuchsia:Fuchsia] )
     ( [Gamboge:Gamboge] [Green:Green] )
     ( [Mauve:Mauve] [Red:Red] )
     ( [Vermillion:Vermillion] )
   }
5: {
     ( [Blue:Blue] [Fuchsia:Fuchsia] )
     ( [Gamboge:Gamboge] [Green:Green] )
     ( [Mauve:Mauve] )
     ( [Red:Red] )
     ( [Vermillion:Vermillion] )
   }
6: {
     ( [Blue:Blue] [Fuchsia:Fuchsia] )
     ( [Gamboge:Gamboge] )
     ( [Green:Green] )
     ( [Mauve:Mauve] )
     ( [Red:Red] )
     ( [Vermillion:Vermillion] )
   }
7: {
     ( [Blue:Blue] )
     ( [Fuchsia:Fuchsia] )
     ( [Gamboge:Gamboge] )
     ( [Green:Green] )
     ( [Mauve:Mauve] )
     ( [Red:Red] )
     ( [Vermillion:Vermillion] )
   }

When it comes to a different kind of splitting into uniform parts and “the rest”, the algorithm can be easily adjusted.