Department of Computer Science
The Strand, London WC2R 2LS, UK
  Undergraduate enquiries:
Postgraduate enquiries:
Fax:
+44 (0)20 7848 2005
+44 (0)20 7848 2588
+44 (0)20 7848 2851
 
 



2009/10 BSc/MSci individual projects

Supervisor: Maxime Crochemore


Project titles

The project development platforms are to be discussed with the supervisor.


Detailed Descriptions

Compact display of strings

The aim of this project is to develop a software able to display strings of symbols in a compact manner. For example, string "aaaaaaa" should appear as "a^7", string "abbbbbcebbbbbc" should appear as "ab^5ceb^5c", and string "abbbbbcabbbbbc" should appear as "(ab^5c)^2". The principle is to emphasize repetitions or runs occurring in the input string. The question is somehow related to run-length encoding used in data compression systems.

Computing Longest Previous reverse Factors

The aim of this project is to implement an algorithm that reports the previous occurrence of reverse words in a text. It builds a table called LPrF, on a string of symbols x, defined by: for each position i on x, LPrF[i] is the maximal length of strings occurring at i as well as in reverse x[0..i-1]. The question is related to text compression systems since the second occurrence of a reverse word can be replaced by just a pointer on its previous occurrence, saving then large portions of text. It is also an important element to locate certain types of palindromes in sequences. An article describes the algorithm, and simpler methods are to be implemented to compare the efficiency of all of them.

Best ranking

The aim of this project is to test heuristics to find the center of three permutations, called the Best sorting. When we have three permutations, associated to ranking made by three persons for example, we want to have the median ranking defined according to some criteria. A paper presenting the problem is provided.

Largest similarities between two texts

The aim of this project is to develop a system which would report the largest segments appearing in two different texts written in natural language. Several parameters can be considered according to a threshold length of segments or to the similarity between segments. A preprocessing of the texts can transform them into sequences of integers (by hashing) in order to simplify further word comparisons.

Your own project idea

You can submit your own project ideas in the following domains: algorithms and data structures, data compression, natural language analysis, bioinformatics, automata and formal languages, pattern matching, games, etc.


Computer Science Homepage     Local Homepage