STXXL  1.4-dev
 All Classes Namespaces Files Functions Variables Typedefs Enumerations Enumerator Friends Macros Groups Pages
Priority Queue
Author
Roman Dementiev (2006)

A priority queue is a data structure that provides a restricted subset of container functionality: it provides insertion of elements, and inspection and removal of the top element. It is guaranteed that the top element is the largest element in the priority queue, where the function object Cmp is used for comparisons. Priority queues do not allow iteration through its elements.

External memory priority queues are the central data structures for many I/O efficient graph algorithms ([58] [21] [40]). The main technique in these algorithms is time-forward processing ([21] [10]), easily realizable by an I/O efficient priority queue. I/O efficient priority queues also find application in large-scale discrete event simulation and online sorting. The STXXL implementation of priority queues is based on [50]. An operation of this priority queue, called sequence heap, takes $ \mathcal{O}(\frac{1}{B}\log_{M/B}(I/B)) $ amortized I/Os, where $ I $ is the total number of insertions into the priority queue. This queue needs less than a third of I/Os used by other similar cache (I/O) efficient priority queues (e.g. [15] [30]). The amortized run time of the STXXL priority queue are

operation internal work I/O (amortized)
insertion $ \mathcal{O}(\log I) $ $ \mathcal{O}(1/B) $
deletion $ \mathcal{O}(\log I) $ $ \mathcal{O}(1/B) $

where $ I $ is the number of performed operations.

A sequence heap maintains R merge groups $ G_1,\ldots, G_R $ where $ G_i $ holds up to k sorted sequences of size up to $ m k^{i-1} $, $ m << M $, see following figure. When group $ G_i $ overflows, all its sequences are merged, and the resulting sequence is put into group $ G_{i+1} $. Each group is equipped with a group buffer of size m to allow batched deletion from the sequences. The smallest elements of these buffers are deleted in small batches and stored in the deletion buffer. The elements are first inserted into the insertion priority queue. On deletion, one checks the minimum elements stored in the insertion priority queue and the deletion buffer.

The difference of our implementation to [50] is that a number of larger merge groups are explicitly kept in external memory. The sorted sequences in those groups only hold their first blocks in the main memory. The implementation supports parallel disks and overlaps I/O and computation. As in [50], the internal merging is based on loser trees [38]. However, our implementation does not use sentinel elements.

san00b_pqueue_small.png
The structure of STXXL priority queue

stxxl::PRIORITY_QUEUE_GENERATOR

Since the stxxl::priority_queue has many setup parameters (internal memory buffer sizes, arity of mergers, number of internal and external memory merger groups, etc.) which are difficult to guess, STXXL provides a helper meta template program that searches for the optimum settings for user demands. This is necessary to limit the amount of main memory occupied by the different buffers of the data structure. The program is called stxxl::PRIORITY_QUEUE_GENERATOR.

The stxxl::PRIORITY_QUEUE_GENERATOR has the following template parameters:

Template Parameters
ValueTypetype of the contained objects (POD with no references to internal memory)
CompareTypethe comparator type used to determine whether one element is smaller than another element.
IntMemoryupper limit for internal memory consumption in bytes.
MaxItemsupper limit for number of elements contained in the priority queue (in 1024 units).
Example: if you are sure that priority queue contains no more than one million elements in a time, then the right parameter is (1000000 / 1024) = 976.
Tunetuning parameter for meta-program search.
Try to play with it if the code does not compile (larger than default values might help). Code does not compile if no suitable internal parameters were found for given IntMemory and MaxItems. It might also happen that given IntMemory is too small for given MaxItems, try larger values.

Notes

  • If CompareType(x,y) is true, then x is smaller than y. The element returned by Q.top() is the largest element in the priority queue. That is, it has the property that, for every other element x in the priority queue, CompareType(Q.top(),x) is false. CompareType must also provide min_value() method, that returns value of type ValueType that is smaller than any element of the queue x, i.e. CompareType(CompareType.min_value(),x) is always true.
    Example: comparison object for priority queue where top() returns the smallest contained integer:
    struct CmpIntGreater
    {
    bool operator () (const int & a, const int & b) const { return a > b; }
    int min_value() const { return std::numeric_limits<int>::max(); }
    };
    Example: comparison object for priority queue where top() returns the largest contained integer:
    struct CmpIntLess
    {
    bool operator () (const int & a, const int & b) const { return a < b; }
    int min_value() const { return std::numeric_limits<int>::min(); }
    };
    Note that CompareType must define strict weak ordering. (see what it is)
  • stxxl::PRIORITY_QUEUE_GENERATOR is template meta program that searches for 7 configuration parameters of stxxl::priority_queue that both minimize internal memory consumption of the priority queue to match IntMemory and maximize performance of priority queue operations. Actual memory consumption might be larger (use stxxl::priority_queue::mem_cons() method to track it), since the search assumes rather optimistic schedule of push'es and pop'es for the estimation of the maximum memory consumption. To keep actual memory requirements low increase the value of MaxItems parameter.
  • For functioning a priority queue object requires two pools of blocks (See constructor of stxxl::priority_queue). To construct STXXL block pools you might need block_type that will be used by priority queue. Note that block's size and hence it's type is generated by the stxxl::PRIORITY_QUEUE_GENERATOR in compile type from IntMemory, MaxItems and sizeof(ValueType) and not given directly by user as a template parameter. Block type can be extracted as stxxl::PRIORITY_QUEUE_GENERATOR<>::result::block_type (see STXXL Priority Queue).
  • The configured priority queue type is available as stxxl::PRIORITY_QUEUE_GENERATOR<>::result.

Internal Memory Consumption of stxxl::priority_queue

Internal memory consumption of stxxl::priority_queue is bounded by the IntMemory template parameter in most situations.