STXXL
1.4-dev
|
A priority queue is a data structure that provides a restricted subset of container functionality: it provides insertion of elements, and inspection and removal of the top element. It is guaranteed that the top element is the largest element in the priority queue, where the function object Cmp is used for comparisons. Priority queues do not allow iteration through its elements.
External memory priority queues are the central data structures for many I/O efficient graph algorithms ([58] [21] [40]). The main technique in these algorithms is time-forward processing ([21] [10]), easily realizable by an I/O efficient priority queue. I/O efficient priority queues also find application in large-scale discrete event simulation and online sorting. The STXXL implementation of priority queues is based on [50]. An operation of this priority queue, called sequence heap, takes amortized I/Os, where is the total number of insertions into the priority queue. This queue needs less than a third of I/Os used by other similar cache (I/O) efficient priority queues (e.g. [15] [30]). The amortized run time of the STXXL priority queue are
operation | internal work | I/O (amortized) |
---|---|---|
insertion | ||
deletion |
where is the number of performed operations.
A sequence heap maintains R merge groups where holds up to k sorted sequences of size up to , , see following figure. When group overflows, all its sequences are merged, and the resulting sequence is put into group . Each group is equipped with a group buffer of size m to allow batched deletion from the sequences. The smallest elements of these buffers are deleted in small batches and stored in the deletion buffer. The elements are first inserted into the insertion priority queue. On deletion, one checks the minimum elements stored in the insertion priority queue and the deletion buffer.
The difference of our implementation to [50] is that a number of larger merge groups are explicitly kept in external memory. The sorted sequences in those groups only hold their first blocks in the main memory. The implementation supports parallel disks and overlaps I/O and computation. As in [50], the internal merging is based on loser trees [38]. However, our implementation does not use sentinel elements.
Since the stxxl::priority_queue has many setup parameters (internal memory buffer sizes, arity of mergers, number of internal and external memory merger groups, etc.) which are difficult to guess, STXXL provides a helper meta template program that searches for the optimum settings for user demands. This is necessary to limit the amount of main memory occupied by the different buffers of the data structure. The program is called stxxl::PRIORITY_QUEUE_GENERATOR.
The stxxl::PRIORITY_QUEUE_GENERATOR has the following template parameters:
ValueType | type of the contained objects (POD with no references to internal memory) |
CompareType | the comparator type used to determine whether one element is smaller than another element. |
IntMemory | upper limit for internal memory consumption in bytes. |
MaxItems | upper limit for number of elements contained in the priority queue (in 1024 units). Example: if you are sure that priority queue contains no more than one million elements in a time, then the right parameter is (1000000 / 1024) = 976. |
Tune | tuning parameter for meta-program search. Try to play with it if the code does not compile (larger than default values might help). Code does not compile if no suitable internal parameters were found for given IntMemory and MaxItems. It might also happen that given IntMemory is too small for given MaxItems, try larger values. |
CompareType(x,y)
is true, then x is smaller than y. The element returned by Q.top()
is the largest element in the priority queue. That is, it has the property that, for every other element x in the priority queue, CompareType
(Q.top(),x) is false. CompareType
must also provide min_value()
method, that returns value of type ValueType that is smaller than any element of the queue x, i.e. CompareType
(CompareType.min_value(),x) is always true. top()
returns the smallest contained integer: top()
returns the largest contained integer: CompareType
must define strict weak ordering. (see what it is)IntMemory
and maximize performance of priority queue operations. Actual memory consumption might be larger (use stxxl::priority_queue::mem_cons() method to track it), since the search assumes rather optimistic schedule of push'es and pop'es for the estimation of the maximum memory consumption. To keep actual memory requirements low increase the value of MaxItems
parameter.IntMemory
, MaxItems
and sizeof(ValueType)
and not given directly by user as a template parameter. Block type can be extracted as stxxl::PRIORITY_QUEUE_GENERATOR<>::result::block_type (see STXXL Priority Queue).Internal memory consumption of stxxl::priority_queue is bounded by the IntMemory template parameter in most situations.