Installation, usage, configuration (Linux/Unix – g++/icpc/clang++)

Download and Extraction

Library Compilation

Building an Application

After compiling the library, some Makefile variables are written to a makefile fragment in your STXXL_ROOT directory (a different file is written if you have built with parallel mode or with MCSTL). This file should be included from your application's Makefile.

The following variables can be used:

An example Makefile for an application using STXXL:

STXXL_ROOT      ?= .../stxxl

# use the variables provided by the STXXL makefile fragment
CXX              = $(STXXL_CXX)
LDLIBS          += $(STXXL_LDLIBS)

# add your own optimization, warning, debug, ... flags
# (these are *not* set by the STXXL makefile fragment)
CPPFLAGS        += -O3 -Wall -g -DFOO=BAR

# build your application
# (my_example.o is generated from my_example.cpp automatically)
my_example.bin: my_example.o
	$(CXX) $(CXXFLAGS) $(CPPFLAGS) $(LDFLAGS) my_example.o -o $@ $(LDLIBS)

Enabling parallel execution

To enable (shared-memory) parallel execution of internal computation (specifically sorting, merging, and random shuffling), you have several options depending on the compiler version used:

We recommend trying the first option first.

The number of threads to be used can be set by the environment variable OMP_NUM_THREADS or by calling omp_set_num_threads. Detailed tuning can be achieved as described here.
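As an illustration, the thread count could be set from the shell before launching the program. This is a minimal sketch, assuming a POSIX shell and an application binary named my_example.bin as in the example Makefile; the value 4 is an arbitrary choice.

```shell
# Use 4 threads for the parallelized internal computation
# (the value is chosen for illustration only).
export OMP_NUM_THREADS=4

# Launch the application if it has been built; it inherits the
# setting from the environment.
if [ -x ./my_example.bin ]; then
    ./my_example.bin
fi
```

Setting the variable in the shell avoids recompiling the application when experimenting with different thread counts.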

Disk space

Before you try to run one of the STXXL examples (or your own STXXL program) you must configure the disk space that will be used as external memory for the library.

To get the best performance with STXXL, you should assign separate disks to it. These disks should be used by the library only. Since STXXL is designed to exploit disk parallelism, the performance of your external memory application will increase if you use more than one disk. How many disks your application can benefit from depends on how "I/O bound" it is. With modern disk bandwidths of about 50-75 MiB/s, most applications are I/O bound on a single disk. This means that adding a second disk will roughly halve the running time. Adding more disks might also increase performance significantly.
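The effect of adding disks can be estimated with simple arithmetic. The sketch below assumes a fully I/O-bound application and ideal scaling across disks, which real workloads will only approximate; the helper name scan_seconds and the data size of 100 GiB are hypothetical.

```shell
# Estimated seconds for one sequential pass over the data,
# assuming a fully I/O-bound run and ideal scaling across disks.
# Arguments: data size in MiB, number of disks, bandwidth per disk in MiB/s.
scan_seconds() {
    data_mib=$1
    disks=$2
    mib_per_s=$3
    echo $(( data_mib / (disks * mib_per_s) ))
}

# 100 GiB (102400 MiB) at 50 MiB/s per disk.
for n in 1 2 4; do
    echo "$n disk(s): about $(scan_seconds 102400 "$n" 50) s per pass"
done
```

Under these assumptions one pass takes about 2048 s on one disk, 1024 s on two, and 512 s on four.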

Recommended file system

The library benefits from direct transfers from user memory to disk, which saves superfluous copies. We recommend using the XFS file system, which gives good read and write performance for large files. Note that file creation on XFS is somewhat slower, so disk files should be precreated for optimal performance.

If the file system's only use is to store one large STXXL disk file, we also recommend adding the following options to the mkfs.xfs command to gain maximum performance:

-d agcount=1 -l size=512b 

The following file systems have been reported not to support direct I/O: tmpfs, glusterfs. Since direct I/O is enabled by default, you may need to recompile STXXL with STXXL_DIRECT_IO_OFF defined to access files on these file systems.

Disk configuration file

You must define the disk configuration for an STXXL program in a file named '.stxxl', which must reside in the directory from which you execute the program. You can change the default file name for the configuration file by setting the environment variable STXXLCFG.
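For instance, a configuration file kept outside the working directory could be selected as follows. This is only a sketch; the path /tmp/stxxl-disks.cfg is a hypothetical placeholder.

```shell
# Hypothetical location of an alternative configuration file.
export STXXLCFG="/tmp/stxxl-disks.cfg"

# Show which configuration file a program started from this shell is
# expected to use: the value of STXXLCFG if set, otherwise '.stxxl'
# in the current directory.
echo "STXXL configuration file: ${STXXLCFG:-.stxxl}"
```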

Each line of the configuration file describes a disk. A disk description uses the following format:

Description of the parameters:

See also the example configuration file 'config_example' included in the tarball.

Log files

STXXL produces two kinds of log files, a message and an error log. By setting the environment variables STXXLLOGFILE and STXXLERRLOGFILE, you can configure the location of these files. The default values are stxxl.log and stxxl.errlog, respectively.
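As a sketch, both log files could be redirected to a dedicated directory before starting an STXXL program. The directory /tmp/stxxl-logs is a hypothetical choice; any writable location should serve.

```shell
# Hypothetical log directory.
log_dir="/tmp/stxxl-logs"
mkdir -p "$log_dir"

# Point the message log and the error log at that directory,
# keeping the default file names.
export STXXLLOGFILE="$log_dir/stxxl.log"
export STXXLERRLOGFILE="$log_dir/stxxl.errlog"
```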

Precreating external memory files

To get maximum performance, precreate the disk files described in the configuration file before running STXXL applications.

The precreation utility is included in the set of STXXL utilities (utils/createdisks.bin). Run this utility for each disk you have defined in the disk configuration file:

utils/createdisks.bin capacity full_disk_filename...
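A possible invocation for two disks is sketched below. The capacity value and the file paths are hypothetical placeholders: the unit of the capacity argument is not stated here, so check the utility's own usage message, and take the file names from your disk configuration file.

```shell
# Placeholder capacity; the unit is whatever the utility expects.
capacity=1000
# Hypothetical disk file paths, matching the configuration file.
disk_files="/mnt/disk0/stxxl /mnt/disk1/stxxl"

if [ -x utils/createdisks.bin ]; then
    # Word splitting of $disk_files is intentional: one argument per file.
    utils/createdisks.bin "$capacity" $disk_files
else
    echo "utils/createdisks.bin not found; run this from the STXXL root directory" >&2
fi
```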