Research Interests

The main areas of interest of ADG members are:

Algorithm Engineering

The process of transferring theoretically efficient algorithms into useful software is very often time-consuming and error-prone, since such algorithms are usually based on complex (abstract) data types and also build upon other algorithms. Experience has shown that it requires training in algorithmic methods to correctly and efficiently implement algorithms from descriptions in the literature. The newly-popular area of algorithm engineering includes the implementation, experimental testing, and fine-tuning of theoretically efficient algorithms to the point where they can be usefully applied in practice. It also includes producing software repositories and platforms that allow algorithm designers to easily implement and experimentally test their ideas. (Iliopoulos, Klasing, RadzikRahman)


 

Approximation Algorithms for Combinatorially Hard Problems

The goal of this research is the development of algorithms for the solution of hard (e.g. NP-hard) problems. The first focus lies in the development of approximative methods and in the proving of the non-existence of effective approximation algorithms. The second focus lies in the development of known approximation algorithms for various tasks from practice. Apart from classical difficult tasks, as constituted by different satisfiability problems, especially problems from molecular biology will be considered. (Gibbons, Klasing)


Architecture-aware Algorithms

The unifying theme in this research is that algorithms are usually designed in a model which offers a certain level of abstraction. As today's computer systems become ever more complicated, these high-level algorithms can be significantly outperformed by algorithms which are more tuned to the underlying hardware.

For instance, algorithms for sorting or data structures for ordered keys may use the comparison-based model, where the order of two keys can only be determined by comparing them. However, if we make reasonable assumptions about the keys, i.e., that they are either integers or floating-point numbers, then we can use the more concrete random-access machine (RAM) model. The RAM model postulates unit cost for simple operations on word-sized integers, and hence pre-supposes a certain amount of reudimentary parallelism within the CPU (to operate on all the bits in a word simultaneously). By exploiting this word-level parallelism (WLP), we have obtained faster sequential algorithms for fundamental problems such as sorting, searching and shortest paths: for instance, we can sort n integers or floating-point numbers in O(n log log n) time, which is considerably faster (in asymptotic terms) than comparison-based algorithms. The theoretical aspects of this research can be viewed as an offshoot of fundamental studies on the power of the RAM model and the related cell-probe model. In a current EPSRC-funded research project we will carry further the theoretical work on the use of WLP in core data structuring problems, and also to develop portable software which uses WLP to obtain significant performance gains (see algorithm engineering). Another strand of research is to use WLP in string algorithms (the so-called shift-OP algorithms), see string algorithmics. (Iliopoulos, Kurokawa)


Internal Memory Algorithms

Most algorithms and data structures are analysed on the RAM model which assumes that the CPU can read a memory location in the same time that it takes to add two word-sized integers. Due to the differences in speed between CPU and memory nowadays, this is a very bad assumption. Computer systems exploit locality of reference in programs to bridge this gap using caches. Unfortunately, RAM algorithms often have poor locality of reference; our work involves mathematically analysing the cache performance of specific algorithms as well as modifying them to improve their locality.

Using our analysis we have developed very fast cache-efficient sorting algorithms. One of the algorithms we have developed, EPLSB radix sort, appears to be one of the fastest sorting algorithms for sorting integer and floating-point keys. Our experiments suggest that it is two and a half times as fast as memory-optimised Quicksort. (Rahman)


Combinatorial Optimisation Algorithms

Our research in this field is concentrated on network optimization problems, which are among the classical combinatorial optimization problems. They arise from numerous applications in areas as different as manufacturing, design of communication networks and financial analysis. The main network optimization problems we investigate are: the minimum cost network flow problem, the generalized network flow problem, the multicommodity flow problem and various parametric network flow problems. We also study scheduling and related graph colouring problems. In the early 80's a new area of research in the network optimization, and more generally in combinatorial optimization, started with the designing of polynomial time algorithms for the linear programming problem by Khachian and Karmarkar and a strongly polynomial time algorithm for the minimum cost network flow problem by Tardos. The interplay of the ideas contained in these seminal works have already led to many interesting theoretical results concerning network optimization problems and to algorithms which are capable of solving much larger input instances than the previously known methods were. Following this general line of research, we work on developing even more efficient algorithms and on evaluating the existent ones. Our approach is to combine the algebraic tools contained in Karmarkar's interior point method with thorough investigation of the combinatorial structure of the underlying networks. Our work involves research on related combinatorial optimization problems and some relevant graph problems. (Gibbons, Radzik, Sant)


Communication Algorithms in Networks

The goal of this research is to investigate different network types as candidates for parallel architectures or communication networks. On the one hand, different networks are compared using graph-theoretic techniques, e.g. by constructing embeddings between networks. On the other hand, the quality of a network is measured by the level of efficiency of performing basic communication primitives, like broadcasting, routing, gossiping and accumulation. As a result of the work, we want to develop effective algorithms for communication in networks, prove their efficiency and optimality, if possible (by deriving upper and lower bounds for suitably chosen cost measures), and finally give recommendations for interconnection structures with outstanding communication properties. (Gibbons, Justo, Klasing)


Data Structures and Algorithms for Dynamic Problems

Dynamic Graph Algorithms

To meet the increasing demands for computing speed in a society pervaded with the use of computers in all aspects of life, efficient algorithms (and data structures) are needed to design and manage complex physical systems. Many such systems are modeled by combinatorial structures (e.g., graphs) and in many cases (real-time systems) the input data change over time, thus giving rise to dynamic problems and dynamic algorithm design. We are particularly interested in developing fast algorithms and their implementations for dynamic graph problems, including various graph connectiveity problems and minimum spanning tree problem. (Radzik)

RAM Data Structures

See architecture-aware algorithms.

Self-adjusting data structures

We investigate the use of self-adjusting data structures such as splay trees in dynamic graph algorithms. We have applied this approach successfully to obtain efficient software for the dynamic minimum spanning tree problem (see algorithm engineering). (Radzik)


Parallel and High Performance Computing

Our research ranges from developing algorithms for the abstract PRAM (parallel random-access machine) model, to developing realistic programmers' models for the current generation of massively-parallel machines, developing algorithms on these models and experimentally evaluating them on real parallel machines.

High-Performance Computing

High-performance computing (HPC) requires the design and development of efficient parallel algorithms which are well-matched to +particular classes of machine architecture. Recent work in this area has resulted in new and improved kernel algorithms for sorting, power series evaluation, 'direct' evaluation of linear recurrences, and 'fast' matrix multiplication.

The practical performance of these algorithms is being carefully studied in absolute, relative, and comparative terms, when implemented on a variety of high-performance parallel computers including the Convex C3860 and Cray Y-MP/416 vector multiprocessors, the Fujitsu VPX240/10 vector processor, the Thinking Machines CM-200 and MasPar MP-1 processor arrays, the Intel iPSC/860 hypercube multiprocessor, the KSR1 virtual shared memory multiprocessor, and

A third strand of this research involves theoretical performance modeling of such parallel computers. It includes a comparative study of the criteria for cost-effective use of hypercube architectures, and the development of parameters indicating the conditions under which a massively parallel processor (MPP) is likely to yield improved performance over a parallel vector computer (PVC) in the solution of a problem. (Overill)

PRAM Algorithms

The PRAM is an abstract model of parallel computation which postulates a potentially unbounded number of synchronous processors which communicate by accessing a shared memory. Its simplicity and power make it an ideal platform on which to conceptualise parallel algorithms, and to study the intrinsic parallel complexity of problems. Our research has concentrated on determining the parallel complexity of fundamental problems in ordering, graph and string problems. (Gibbons, Iliopoulos, Radzik)

Models and Algorithms for MPP

Massively-parallel computers that consist of hundreds or thousands of off-the-shelf processors connected by a communication network have become steadily easier to build and have much better price-performance ratios than traditional vector supercomputers. However, getting them to perform +anywhere near as fast as their nominal specification is tremendously difficult, especially for so-called irregularly structured problems; making the resulting finely-tuned code portable is even more difficult. The best solution to this parallel software crisis appears to be to develop bridging models of parallel computation and to develop algorithms for these models. Our efforts have concentrated on developing algorithms for important problems such as the JOIN operation for relational databases on some such bridging models that have been recently proposed. (Gibbons)


Random Structures and Random Algorithms

Random graph models for the web, peer-to-peer and ad-hoc networks. (Cooper)


 

String Algorithmics

This research concentrates on algorithmic and combinatorial problems arising in applications which involve extensive manipulations of strings. The proposed research has both commercial and scientific applications. Examples of commercial applications include large database problems (duplicate record removal, pattern detection, erroneous (misspelled) records), speech recognition, text editing and processing, data compression and encoding. (Christodoulakis, Crochemore, Iliopoulos, Kurokawa, Mohamed, Mouchard, Pinzon )

Computer Assisted Music Analysis

Musical analysis is commonly focused on identifying and discovering musical patterns that recur within a musical score – with or without variations. A musical score can be viewed as a string or collection of strings whereby each string is composed of symbols such as notes or musical intervals in the chromatic or diatonic notation. String processing algorithms are traditionally used both for pattern recognition, i.e. for finding all the instances of a given pattern in a given string, or for pattern induction, i.e. for extracting significant patterns that recur within a given string. Such algorithms are commonly employed on text strings, DNA strings and indeed can be applied on musical strings. Recent research conducted in the domain of Musical Pattern Processing is presented in papers such as (Crawford, Iliopoulos and Raman, 1998; Cambouropoulos, Crawford and Iliopoulos, 1999; Camboulopoulos, Crochemore, Iliopoulos, Mouchard and Pinzon, 1999; Rolland, 1999; Rolland, Raskinis and Ganascia, 1999).

We are aiming to design and implement efficient algorithms for several variants of monophonic and polyphonic motif matching in musical sequences:

  • Approximate motif matching
  • Compact motif matching
  • Rhythmic motif matching
  • Metrical congruity motif matching
  • Salient motif matching
  • Harmonic progression motif matching

(Gibbons, Iliopoulos, Kurokawa)

Algorithms for Molecular Sequences Analysis

The Human Genome project along with molecular biology technological improvements have resulted in vast numbers of longer sequences, hence any string manipulation algorithms need to be efficient. Two important goals in computational molecular biology are finding regularities in nucleic or protein sequences, and finding features that are common to a set of such sequences.

This research aims to develop combinatorial approaches to deal with series of pattern inference problems that concern: inferring very long patterns allowing high error rate – finding a set of patterns that covers exactly or possibly with some overlap, a set of sequences – introducing phylogenetic metric – inferring mixed sequences and structural patterns – improve algorithm and data structure (such as suffix tree or automata) – extends theoretical results concerning repeats and patterns known from combinatorics on words to the case of approximate models.(Iliopoulos, Mohamed, Pinzon)

Data Compression

This research will focus on ways to manipulate compressed data without decompressing it. The research is oriented also towards the design of efficient methods for compressing data, i.e., reducing the amount of space consumption in computer storage as well as minimizing the amount of information transmitted in communications systems, therefore saving time and space.

While the main thrust of the research is theoretical, its motivation is practical and directed towards the implementation of software. Hence it is important to verify the applicability of the methods developed in a practical context. We believe that the problem of pattern-matching in compressed files is a challenging one for which no completely satisfactory approach has yet been found and there is scope for further research on this issue.

The current proposal indicates how to pursue such research. Old methods are to re-examined and a variety of new ones should be designed. The methodology of the proposed work involves several streams of activity:

  • Searching for explicitly given uncompressed patterns in compressed texts
  • Searching for compressed patterns
  • Compression issues in music data
  • Compression based on combinatorics of text regularities
  • Space complexity problems
  • Burrows-Wheeler Transform and compression by sorting
  • Trees related to compression

(Christodoulakis, Gibbons, Iliopoulos)