STXXL  1.4-dev
 All Classes Namespaces Files Functions Variables Typedefs Enumerations Enumerator Friends Macros Groups Pages
STXXL Algorithms
Author
Roman Dementiev (2006)

Iterators of stxxl::vector are STL compatible. stxxl::vector::iterator is a model of Random Access Iterator concept from STL. Therefore it is possible to use the stxxl::vector iterator ranges with STL algorithms. However, such use is not I/O efficient if an algorithm accesses the sequence in a random order. For such kind of algorithms STXXL provides I/O efficient implementations described on this page. If an algorithm does only a scan (or a constant number of scans) of a sequence (or sequences) the implementation that calls STL algorithm is nevertheless I/O efficient. However, one can save constant factors in I/O volume and internal work if the the access pattern is known (read-only or write-only scan for example). This knowledge is used in STXXL specialized implementations of STL algorithms.

Example: STL Algorithms Running on STXXL containers (do not do this, read below!)

// Replace every number in an array with its negative.
const int N = 1000000000;
vector_type A(N);
std::iota(A.begin(), A.end(), 1);
std::transform(A, A+N, A, negate<double>());
// Calculate the sum of two vectors,
// storing the result in a third vector.
const int N = 1000000000;
vector_type V1(N);
vector_type V2(N);
vector_type V3(N);
std::iota(V1.begin(), V1.end(), 1);
std::fill(V2.begin(), V2.end(), 75);
assert(V2.size() >= V1.size() &&
V3.size() >= V1.size());
std::transform(V1.begin(),
V1.end(),
V2.begin(),
V3.begin(),
plus<int>());

The algorithms of the STL can be divided into two groups by their memory access pattern: scanning algorithms and random access algorithms.

Scanning Algorithms

Scanning algorithms work with Input, Output, Forward, and Bidirectional iterators only. Since random access operations are not allowed with these kinds of iterators, the algorithms inherently exhibit a strong spatial locality of reference. STXXL containers and their iterators are STL-compatible, therefore one can directly apply STL scanning algorithms to them, and they will run I/O-efficiently (see the use of std::generate and std::unique algorithms in the listing above).

Scanning algorithms are the majority of the STL algorithms (62 out of 71). STXXL also offers specialized implementations of some scanning algorithms, which perform better in terms of constant factors in the I/O volume and internal CPU work. These implementations benefit from accessing lower level interfaces of the BM layer instead of using iterator interfaces, resulting in a smaller CPU overhead. Being aware of the sequential access pattern of the applied algorithm, the STXXL implementations can do prefetching and use queued writing, thereby leading to the overlapping of I/O with computation.

STXXL provides the following scanning algorithms:

Random Access Algorithms

Random access algorithms require random access iterators, hence may perform (many) random I/Os. For such algorithms, STXXL provides specialized I/O efficient implementations that work with STL-user layer external memory containers. Currently, the library provides two implementations of sorting:

Both sorters are implementations of parallel disk algorithms described in Parallel Disk Sorting [26].