By 101010


2014-06-13 19:09:06 8 Comments

I'll start by illustrating a simple use case example:

  • Consider the problem of a social security ID database, where in C++ code is modelled as a std::unordered_map where its key is the social security ID of a person and its value is a std::string with the full-name of that person (e.g., std::unordered_map<int, std::string> DB;).

  • Consider also, that there's a request for printing this database sorted in ascending order based on the person's ID (i.e., std::unordered_map's key).

  • Naively, one would think to use std::sort in order to sort the std::unordered_map according to the requested criteria and then print it, like the example code below:


   std::sort(DB.begin(), DB.end());
   for(auto p : DB) std::cout << "ID(" << p.first
                              << ") - " 
                              << p.second 
                              << std::endl;

  • However, this is not the case, because use of std::sort with a range of either a std::unordered_map or a std::unordered_set will raise a compiler error.

Questions:

  1. Why STL's unordered containers cannot be sorted by std::sort?
  2. Is there a legitimate and efficient way to sort either a std::unordered_map or a std::unordered_set?

3 comments

@Tony Delroy 2015-06-22 03:56:13

1.Why STL's unordered containers cannot be sorted by std::sort?

Because unordered containers are already "sorted", albeit not directly by their keys, but by (typically) hash_function(key) % bucket_count() (also accessible as bucket(key)). This "sort" order isn't cosmetic - it's the whole basis on which hash tables are able to find elements quickly. If std::sort were allowed to re-order the elements by key instead, then the container would no longer be able to function as a hash table: elements couldn't be reliably found or erased, insertions might put duplicates in the container etc..

2.Is there a legitimate and efficient way to sort either a std::unordered_map or a std::unordered_set?

In the general case, only by first copying the elements to a sortable or sorted container such as std::vector or std::set (the former will usually be faster, but benchmark both if you really care):

std::unordered_set<T> my_set = ...;
std::vector<T> my_vec{my_set.begin(), my_set.end(), my_set.size()};
std::sort(my_vec.begin(), my_vec.end());

In your case with std::unordered_map<int, std::string> DB;, I'd suggest copying only the int keys to a vector for sorting, then during iteration look up each key in the unordered_map: that will avoid a lot of string copying.

(It is sometimes possible to orchestrate an unordered container with ordering by key (e.g. hash function returns key, container presized so max bucket index >= max key value) but anyone considering such abuse would be better off using a vector.)

@Michael Repucci 2016-08-31 01:49:58

This should be the answer, as it contains more complete information.

@Kerrek SB 2014-06-13 19:38:17

Sorting only makes sense for sequence containers, which are containers whose elements are determined by the order in which they were added to the container. The dynamic sequence containers in the standard library are vector, deque, list and forward_list.

Maps and sets, on the other hand, are associative containers, in which elements are identified by their value. Thus it makes no sense to ask for an "ordering", since the container elements aren't arranged in any kind of sequence. (It's true that an ordered map can be iterated in a comparison order on the key, but that order emerges from the container; it is not provided by the user.)

@Marco A. 2014-06-13 19:13:27

unordered containers store internally hashed data and thus it's not possible to order them after the hash has been generated.

In order to sort the data you can use an additional non-hashed container (e.g. map or set) and either use them along with the unordered version (so you can use the normal one to sort the data and the unordered one to have fast per-item access) or you can do something like

std::map<int, int> ordered(unordered.begin(), unordered.end());
for(auto it = ordered.begin(); it != ordered.end(); ++it)
     std::cout << it->second;

I recommend not to do the above often (unordered containers have slow sequential access)

https://stackoverflow.com/a/6212709/1938163

Related Questions

Sponsored Content

14 Answered Questions

[SOLVED] Why does the C++ STL not provide any "tree" containers?

10 Answered Questions

10 Answered Questions

[SOLVED] How to sort in-place using the merge sort algorithm?

2 Answered Questions

2 Answered Questions

[SOLVED] How does C++ STL unordered_map resolve collisions?

4 Answered Questions

[SOLVED] Generic hash for tuples in unordered_map / unordered_set

2 Answered Questions

[SOLVED] How to implement classic sorting algorithms in modern C++?

1 Answered Questions

[SOLVED] Sorting on unordered_sets

Sponsored Content