Abstract:
Virtual memory, even on the largest and fastest contemporary computers, is neither large enough nor fast enough for all applications. Some data structures must be held in the file system, and some ‘performance hints’ must be given to the memory-management runtime routines. For these reasons, most large-memory application codes are littered with system-specific names, constants, file-migration policies, pragmas and hints. Such codes are very difficult to develop, maintain, and port. I propose a new paradigm for the design of large-memory codes, providing many of the performance advantages and few of the drawbacks of system-specific coding techniques. In my paradigm, programmers must organise their data structures into a series of (nesting) data blocks, with blocksizes Bh increasing in a fractal (power-law) progression [see formula in pdf]. Furthermore, the larger blocks must be referenced much less frequently than the smaller blocks. I argue that efficient algorithms for several important problems on workstations and PCs are found at ∂≈3/2 and R=8. I sketch a model of memory performance that explains why non-hierarchical large-memory codes require system-specific tuning for efficient execution.