Home > Enterprise >  Can I reorder elements in map by value? C
Can I reorder elements in map by value? C

Time:09-21

I want to make a table of letter frequency, just like this

So, I have variable str that contains some text, and I have map<char, int> m, that contains number of occurrences of each letter.

So I have something like that:

vector<char> get_rank(const string& str)
{
    map<char, int> m;

    for (char i = 'a'; i <= 'z'; i  )
        m[i] = 0;

    for (char ch : str)
        m[ch]  ;

    for (auto& it : m)
        v.push_back(it.first);
    return v;
}

But map is ordered by keys, not by values. Is there a way to "overload" the comparator of map to order each element by value? If the answer is no, how can I implement the same idea but with the same elegancy. Because every idea I have seems a little dull and rough.

CodePudding user response:

It is not possible to sort a map with respect to its mapped values. The order of elements of a map is managed by the map according to its comparator which only considers keys.

However, you do not need a map to count frequencies when the keys are contiguous and of limited range.

Instead of a map, use a std::vector<letter_and_count> with

struct letter_and_count {
     char letter;
     int count = 0;
};

Initialize the vector with elements with letter initialized to a till z then increment v[ i - 'a'] when you encounter the letter i in the string. Then use std::sort to sort the vector according to the count member.

PS: Note comments below. The letters a till z are not necessarily contiguous. However (idea also from eerorika) you can use a simple lookup table:

std::string alphabet{"abcdefghijklmnopqrstuvwxyz"};

Then first find the letter in that alphabet and use its index in the alphabet for the counting.

CodePudding user response:

Is there a way to "overload" a comparator of map to order each element by value?

No.

If the answer is no, how can I implement the same idea but with the same elegancy.

Depends on use case. An elegant solution to the example is to sort the vector.

CodePudding user response:

Instead of trying to do something with a STL container that is clearly not designed for this (ordering a map) maybe use a different container.

You can always use std::pair<char, int> as the type for an std::vector.

These can be sorted.

I'd prefer std::vector<std::pair<char,int>> simply because I'm more used to working with vectors but that's up to you.

CodePudding user response:

I find the approach ok to use specialized containers.

For counting a std::map or a std::unordered_map should be used. After that it can be copied into a std::vector and sorted as you like.

This will work in a universal way. It is also possible to find the most frequent used element using a max heap.

Or, In the below example, we can get the x topmost elements. And this with just some few lines of code. The below will count nearly evrything and sortit according to the value:

#include <iostream>
#include <utility>
#include <unordered_map>
#include <algorithm>
#include <vector>
#include <iterator>
#include <type_traits>

// Helper for type trait We want to identify an iterable container ----------------------------------------------------
template <typename Container>
auto isIterableHelper(int) -> decltype (
    std::begin(std::declval<Container&>()) != std::end(std::declval<Container&>()),     // begin/end and operator !=
      std::declval<decltype(std::begin(std::declval<Container&>()))&>(),                // operator   
    void(*std::begin(std::declval<Container&>())),                                      // operator*
    void(),                                                                             // Handle potential operator ,
    std::true_type{});
template <typename T>
std::false_type isIterableHelper(...);

// The type trait -----------------------------------------------------------------------------------------------------
template <typename Container>
using is_iterable = decltype(isIterableHelper<Container>(0));

// Some Alias names for later easier reading --------------------------------------------------------------------------
template <typename Container>
using ValueType = std::decay_t<decltype(*std::begin(std::declval<Container&>()))>;
template <typename Container>
using Pair = std::pair<ValueType<Container>, size_t>;
template <typename Container>
using Counter = std::unordered_map<ValueType<Container>, size_t>;

// Function to get the k most frequent elements used  in any Container ------------------------------------------------
template <class Container>
auto topKFrequent(const Container& data, size_t k) {

    if constexpr (is_iterable<Container>::value) {

        // Count all occurences of data
        Counter<Container> counter{};
        for (const auto& d : data) counter[d]  ;

        // For storing the top k
        std::vector<Pair<Container>> top(k);

        // Get top k
        std::partial_sort_copy(counter.begin(), counter.end(), top.begin(), top.end(),
            [](const std::pair<int, size_t >& p1, const std::pair<int, size_t>& p2) { return p1.second > p2.second; });

        return top;
    }
    else
        return data;
}
int main() {
    std::vector testVector{ 1,2,2,3,3,3,4,4,4,4,5,5,5,5,6,6,6,6,6,7 };
    for (const auto& p : topKFrequent(testVector, 2)) std::cout << "Value: " << p.first << " \t Count: " << p.second << '\n';
    std::cout << '\n';

    double cStyleArray[] = { 1.1, 2.2, 2.2, 3.3, 3.3, 3.3 };
    for (const auto& p : topKFrequent(cStyleArray, 2)) std::cout << "Value: " << p.first << " \t Count: " << p.second << '\n';
    std::cout << '\n';

    std::string s{ "abbcccddddeeeeeffffffggggggg" };
    for (const auto& p : topKFrequent(s, 2)) std::cout << "Value: " << p.first << " \t Count: " << p.second << '\n';
    std::cout << '\n';

    double value = 12.34;
    std::cout << topKFrequent(value, 2) << "\n";

    return 0;
}

The above code is for C 17


And below is an example to find the most frequent element in any container.

But this is a C 20 solution:

#include <iostream>
#include <utility>
#include <unordered_map>
#include <queue>
#include <vector>
#include <iterator>
#include <type_traits>
#include <string>

// Some Alias names for later easier reading --------------------------------------------------------------------------
template <typename Container>
using ValueType = std::ranges::range_value_t<Container>;

template <typename Container>
using Pair = std::pair<ValueType<Container>, size_t>;

template <typename Container>
using Counter = std::unordered_map<ValueType<Container>, size_t>;

template <typename Container>
using UnderlyingContainer = std::vector<Pair<Container>>;

// Predicate Functor
template <class Container> struct LessForSecondOfPair {
    bool operator ()(const Pair<Container>& p1, const Pair<Container>& p2) { return p1.second < p2.second; }
};

template <typename Container>
using MaxHeap = std::priority_queue<Pair<Container>, UnderlyingContainer<Container>, LessForSecondOfPair<Container>>;

// Calculate max element ---------------------------------------------------------------------------------------------
template <class Container>
auto topFrequent(const Container& data) {

    if constexpr (std::ranges::range<Container>) {
        // Count all occurences of data
        Counter<Container> counter{};
        for (const auto& d : data) counter[d]  ;

        // Build a Max-Heap
        MaxHeap<Container> maxHeap(counter.begin(), counter.end());

        // Return most frequent number
        return maxHeap.top().first;
    }
    else
        return data;
}

// Test
int main() {
    std::vector testVector{ 1,2,2,3,3,3,4,4,4,4,5,5,5,5,6,6,6,6,6,7 };
    std::cout << "Most frequent is: " << topFrequent(testVector) << "\n";

    double cStyleArray[] = { 1.1, 2.2, 2.2, 3.3, 3.3, 3.3 };
    std::cout << "Most frequent is: " << topFrequent(cStyleArray) << "\n";

    std::string s{ "abbcccddddeeeeeffffffggggggg" };
    std::cout << "Most frequent is: " << topFrequent(s) << "\n";

    double value = 12.34;
    std::cout << "Most frequent is: " << topFrequent(value) << "\n";

    return 0;
}
  • Related