Sergio De Agostino (Sapienza University)
Parallel Implementations of Dictionary Text Compression without Communication
Abstract:
Static and sliding dictionary text compression methods are efficiently parallelizable.
Approximation schemes for optimal compression with static and sliding
dictionaries which can run on a simple array of processors with
distributed memory and no interconnections are presented.
In virtue of their scalability, these approximation algorithms can be
implemented on both small and large scale parallel systems.
Due to its adaptiveness, the sliding dictionary method requires
large size files on large scale systems.
Andreas A Albrecht (Queen's University Belfast)
On the Number of Local Maxima in MAX-SAT Instances
Abstract:
When dealing with individual instances of hard problems, gathering information
about specific properties of instances in a pre-processing phase might
be helpful for an appropriate adjustment of local search-based procedures.
We analyse the approach in the context of landscapes induced by k-SAT instances,
where the objective function is defined by the number of satisfied k-clauses.
Firstly, we provide a method for obtaining upper bounds for the average
number of local maxima in k-SAT instances. The method allows us to analyse the
number of local maxima for different relations between k, n, and m, for m being the
total number of k-clauses. Secondly, we utilise a sampling method, which was devised
by Garnier and Kallel in 2002, for approximations of the number of local maxima in
landscapes generated by individual k-SAT instances and a simple neighbourhood relation.
The results are then compared to the predicted average number of local maxima.
Ljiljana Brankovic (University of Newcastle)
A Dynamic Programming Algorithm for Measuring Privacy
Abstract:
Data mining and statistical analysis are now becoming a matchless tool
for research and strategic planning and are used by companies,
governments and research institutions alike. They both depend on
massive databases often containing personal information and although
only aggregate values and patterns are normally made available to
users, it may still be possible for a clever intruder to disclose
confidential individual values. There are numerous disclosure control
techniques for alleviating a threat of such an unwanted disclosure. In
this talk we introduce a dynamic programming algorithm for evaluating
confidentiality of a dataset protected with a particular technique and
we use it to compare various disclosure control methods.
Colin Cooper (King's College London)
Multiple random walks in random regular graphs
In this talk we study problems arising from multiple random walks
under various assumptions about the interaction between the
particles. The random walks we consider make their transitions uniformly
to neighbouring vertex at discrete time steps.
On meeting at a vertex the walks may interact, but their
behaviour is otherwise independent.
Topics include:
Cover time of a graph with multiple random walks,
Expected time to broadcast a message
Expected time coalesce to a single walk.
To give precise results, we make our analysis on
random r-regular graphs, although this is mainly for convenience.
As an example:
If the particles coalesce on meeting, then if one particle starts at
each vertex, the expected time to coalesce to a single walk is
2n({r-1})/({r-2}), where r is the vertex degree.
By a time reversal argument, this is the expected time for the
completion of the voter problem, which is defined as follows:
Initially each vertex has a distinct opinion. At each time step each
vertex i contacts a random neighbour j and adopts the opinion
of j. Thus 2n({r-1})/({r-2}) is the expected number of steps
until a single opinion emerges.
Tomas Flouri (Czech Technical University in Prague)
Tree Pattern Matching by Pushdown Automata
Abstract:
Tree pattern matching is an important operation in Computer Science on
which a number of tasks such as mechanical theorem proving, term-rewriting,
symbolic computation and non-procedural programming languages are based on.
A systematic approach to the construction of tree pattern matchers by
deterministic pushdown automata which read subject trees in prefix notation,
is presented. The method is analogous to the construction of string pattern
matchers: for given patterns, a non-deterministic pushdown automaton is
created and then it is determinised. In addition, it is shown that the size
of the resulting deterministic pushdown automata directly corresponds to the
size of the existing string pattern matchers based on finite automata.
Leszek Gąsieniec (University of Liverpool)
Ondřej Guth (Czech Technical University in Prague)
Searching Approximate Covers of Strings using Finite Automata
Abstract:
Cover
is a type of regularity of a string. A restricted approximate cover w
of string T is a factor of T such that every position of T lies within
some approximate occurrence of w in T. We present algorithm for
searching all restricted approximate covers of a string and we show a
way how to extend it to searching all nonrestricted approximate covers.
The solution is based on finite automata approach, that provide common
formalism and a straightforward way to design algorithms to many
problems in stringology.
Jan Holub (Czech Technical University in Prague)
Tuning BNDM with q-Grams
We develop bit-parallel algorithms for exact string matching. Our
algorithms are variations of the BNDM and Shift-Or algorithms. At each
alignment the algorithms read a $q$-gram before testing the state
variable. In addition we apply reading a 2-gram in one instruction. Our
experiments show that many of the new variations are substantially faster
than any previous string matching algorithm on x86 processors for English
and DNA data.
Jan Janousek (Czech Technical University in Prague)
Subtree and Tree Pattern Pushdown Automata for Trees in Prefix Notation
Abstract:
Two new kinds of acyclic pushdown automata for trees in prefix notation
are presented. First, \emph{subtree pushdown automata} accept all subtrees
of a tree. Second, \emph{tree pattern pushdown automata} accept all tree
patterns which match a tree. The presented pushdown automata are
input--driven and therefore can be determinised. Given a tree of size $n$,
the deterministic subtree or the deterministic tree pattern pushdown
automaton can be constructed for this tree in time linear in $n$ and its
advantages are: First, the automaton represents a complete index of the
tree and the search phase of all occurences of a subtree or a tree
pattern of size $m$ is performed in time linear in $m$ and not depending
on $n$. This is faster than the time of the existing tree pattern matching
algorithms, which depends on $n$. Second, although the number of distinct
tree patterns matching the tree can be exponential in $n$, the total size
of the automaton is linear in $n$.
Dennis Komm (ETH Zurich)
Reoptimization of the TSP with Deadlines in Metric Graphs
Abstract:
The reoptimization version of an optimization problem deals with the
following scenario: Given an input instance together with an optimal
solution for it, the objective is to find a high-quality solution for a
locally modified instance.
In this talk, we investigate several reoptimization variants of the
traveling salesman problem with deadlines in metric graphs. The
objective here is to find a minimum-cost Hamiltonian cycle in a
complete undirected graph with a metric edge cost function which visits
some of its vertices before some prespecified deadlines. As types of
local modifications, we consider insertions and deletions of a vertex as
well as of a deadline.
We prove the hardness of all of these reoptimization variants and give
lower and upper bounds on the achievable approximation ratio which are
tight in most cases.
M. Oğuzhan Külekci (Turkey National Research Institute of Electronics & Cryptology)
A Method to Overcome Computer Word Size Limitation in Bit-Parallel Pattern Matching
Abstract:
The performance of the pattern matching algorithms based
on bit-parallelism degrades when the input pattern length exceeds the
computer word size. Although several divide-and-conquer methods have
been proposed to overcome that limitation, the resulting schemes are
not that much efficient and hard to implement. This study introduces
a new fast bit-parallel pattern matching algorithm that is capable of
searching patterns of any length in a common bit-parallel fashion. The
proposed bit-parallel length invariant matcher (BLIM) is compared with
the Shift-Or and bit-parallel non-deterministic matching (BNDM) algo-
rithms along with the standard Boyer-Moore and Sunday’s quick search,
which are known to be the very fast in general. Benchmarks have been
conducted on natural language, DNA sequence, and binary alphabet ran-
dom texts. Besides the length invariant architecture of the algorithm,
experimental results indicate that on the average BLIM is 18%, 44%,
and 6% faster than BNDM, which is accepted as one of the fastest al-
gorithms of this genre, on natural language, DNA sequence and binary
random texts respectively.
Borivoj Melichar (Czech Technical University in Prague)
Arbology, basic definitions
Abstract:
Definitions of tree, subtree, tree pattern, prefix and suffix notations.
Pushdown automaton, determinisation of nondeterministic input-driven
pushdown automaton.
Borivoj Melichar (Czech Technical University in Prague)
Repeats in trees
Abstract:
Prefix notation of a tree. Construction of subtree pushdown automata
and tree pattern pushdown automata. Use of information gained during
determinisation of nondeterministic pushdown automata for finding
repetitions.
Maria Polukarov (University of Southampton)
Congestion games with faulty or asynchronous resources.
Abstract:
We introduce the concepts of resource failures and
asynchronous task execution in congestion games. We present two
models -- congestion games with load-dependent failures (CGLFs) in which resources may fail to execute their assigned tasks with some (congestion-dependent) probability, and asynchronous congestion games (ACGs)
in which resources execute their assigned tasks not simultaneously but
in a randomly chosen order. The random order of task execution
reflects, for instance, a situation where players and resources are the
elements of an asynchronous distributed system, in which each process
has its own independent clock. As it turns out, these new settings lead
to interesting observations about the interplay between the need to
deal with failures or asynchronous nature of processes, and the
emergence of congestion in non-cooperative systems. Indeed, the
classical idea of using several resources in order to overcome the
possibility of failure or to decrease the expected time of task
completion, may result in a high congestion, hurting all agents in the
system. Although, as we show, CGLFs and ACGs do not admit a potential
function and therefore are not isomorphic to classic congestion games,
we prove the existence of a pure strategy Nash equilibrium in the above
classes of games. We also develop polynomial time algorithms for
computing a pure strategy equilibrium in these games.
This is a joint work with Michal Penn and Moshe Tennenholtz from the Technion.
Ida Pu (Goldsmiths, University of London)
Measuring energy-time efficiency of MANET routing protocols
Abstract:
In this talk, I will introduce two metrics for assessment of mobile ad hoc
network performance in terms of energy-time efficiency.
The combined effect of energy consumption and time delay is described as a
trade-off in performance analysis. The metrics have demonstrated a number of
advantages over the conventional ones in which the energy and time were often
considered separately. The proposed new metrics are simple, generic and
flexible. As an application, we have compared the energy-time efficiency of
the Blocking Expanding Ring Search (BERS) and Expanding Ring Search (ERS),
two similar Time to Live (TTL)-based expanding ring search algorithms using
our new metrics. The results show that the new metrics can be applied
efficiently in assessment of different protocols.
Mikaël Salson (University of Rouen)
Dynamic Extended Suffix Arrays
Abstract:
Full-text indexes, such as suffix trees or suffix arrays,
have been intensively studied and many algorithms have been
developed for these structures.
However, these indexes have two main drawbacks:
- they are space-consuming,
- they are not easily updatable.
The former has been adressed first by Kärkkäinen and Ukkonen
in 1996. They proposed a sublinear-size index based on
Lempel-Ziv parsing, since then a lot of work has been done in
the field of compressed indexes. Compressed indexes now achieve
a space close to the compressed text.
Ferragina and Grossi proposed a solution in 1995 for the latter.
This method was later adapted to compressed indexes but remains theoretical.
Considering the Burrows-Wheeler transform, particularly used in data
compression, we propose an algorithm for updating the suffix array to
take into account any edit operations performed in the original text.
We extend this approach to the inverse of the suffix array and the
table of the
longest common prefixes. Our algorithm is also of interest for compressed
full-text indexes based on the Burrows-Wheeler transform.
Bill Smyth (McMaster University)
The Prefix Array & the Prefix Graph
We describe work in progress on the prefix array of a string. We define
strings not on an alphabet Σ, but rather on all nonempty subsets of Σ. Thus
for us a string is what others (and we) have called an ”indeterminate string”. If
in fact the string is defined on subsets of Σ of size 1, we say that it is regular.
We define a feasible array y = y[1..n] of integers to be such that y[1] = n,
while for every i ∈ 2..n, y[i] ∈ 0..n+1−i. We then show that every feasible
array is the prefix array of some string.
In a prefix array y we define positions as normal or k-special. We also define
a prefix graph on positions 1..n in which the k-special positions are the
vertices of cliques. The largest such clique (the clique number of the graph)
determines the minimum alphabet size of the string x that corresponds to y,
in the case that x is regular. We describe an O(n)-time algorithm that
assigns letter k +1 to x[i] whenever i is k-special and also computes the
clique number of the prefix graph. We show that if x is regular, this
assignment yields a lexleast string x whose prefix array is y.
Alexander Tiskin (University of Warwick)
Fast distance multiplication of unit-Monge matrices
Abstract:
A matrix is called Monge, if its density matrix is nonnegative. Monge
matrices play an important role in optimisation. Distance multiplication
(also known as min-plus or tropical multiplication) of two Monge
matrices of size $n$ can be performed in time $O(n^2)$. Motivated by
applications to string comparison, we introduced in a previous work the
following subclass of Monge matrices. A matrix is called unit-Monge, if
its density matrix is a permutation matrix; we further restrict our
attention to a subclass that we call simple unit-Monge matrices. Our
previous algorithm for distance multiplication of simple unit-Monge
matrices runs in time $O(n^{3/2})$. Landau conjectured in 2006 that this
problem can be solved in linear time. In this work, we give an algorithm
running in time $O(n \log^4 n)$, thus approaching Landau's conjecture
within a polylogarithmic factor. The new algorithm implies immediate
improvements in running time for a number of string comparison and graph
algorithms: semi-local longest common subsequences between permutations;
longest increasing subsequence in a cyclic permutation; longest
pattern-avoiding subsequence in a permutation; longest piecewise
monotone subsequence; maximum clique in a circle graph; subsequence
recognition in a grammar-compressed string; sparse spliced alignment of
genome strings.
Gabriel Valiente (Technical University of Catalonia)
Combinatorial Pattern Matching Algorithms in Computational Biology
Abstract:
Combinatorial pattern matching is one of the main sources of
algorithmic solutions to the problems that arise in the analysis of
genomic, transcriptomic, proteomic, metabolomic, and interactomic
data. This talk provides an organized and comprehensive view of the
whole field of combinatorial pattern matching from a computational
biology perspective, and addresses specific pattern matching problems
that arise when dealing with biological sequences (such as DNA and
protein sequences), trees (such as phylogenetic trees and RNA
structures), and graphs (such as phylogenetic networks, metabolic
pathways, protein interaction networks, and signaling pathways). A
clear distinction is made between the problems that arise in the
analysis of one structure (finding patterns within a structure) and
in the comparative analysis of two structures (finding patterns
common to two structures and aligning these structures).