|
|
|||||||||
|
|||||||||
|
|
|||||||||
The project development platforms are to be discussed with the supervisor.
The aim of this project is to develop a software able to display strings of symbols in a compact manner. For example, string "aaaaaaa" should appear as "a^7", string "abbbbbcebbbbbc" should appear as "ab^5ceb^5c", and string "abbbbbcabbbbbc" should appear as "(ab^5c)^2". The principle is to emphasize repetitions or runs occurring in the input string. The question is somehow related to run-length encoding used in data compression systems.
The aim of this project is to implement an algorithm that reports the previous occurrence of reverse words in a text. It builds a table called LPrF, on a string of symbols x, defined by: for each position i on x, LPrF[i] is the maximal length of strings occurring at i as well as in reverse x[0..i-1]. The question is related to text compression systems since the second occurrence of a reverse word can be replaced by just a pointer on its previous occurrence, saving then large portions of text. It is also an important element to locate certain types of palindromes in sequences. An article describes the algorithm, and simpler methods are to be implemented to compare the efficiency of all of them.
The aim of this project is to test heuristics to find the center of three permutations, called the Best sorting. When we have three permutations, associated to ranking made by three persons for example, we want to have the median ranking defined according to some criteria. A paper presenting the problem is provided.
The aim of this project is to develop a system which would report the largest segments appearing in two different texts written in natural language. Several parameters can be considered according to a threshold length of segments or to the similarity between segments. A preprocessing of the texts can transform them into sequences of integers (by hashing) in order to simplify further word comparisons.
You can submit your own project ideas in the following domains: algorithms and data structures, data compression, natural language analysis, bioinformatics, automata and formal languages, pattern matching, games, etc.