Computer Science Talks


PDF slides for talks from the past few years covering topics from the design of algorithms and data structures to natual language processing and bioinformatics.

Text Analysis with Enhanced Annotated Suffix Trees

Text Analysis with Enhanced Annotated Suffix Trees

Annotated suffix trees is an extension of the classic suffix tree data structure. It has a range of interesting applications in natural language processing, such as language-independent information retrieval. We present an enhanced implementation of this data structure where its internal representation is based on suffix arrays instead of the traditional but heavyweight suffix trees. This makes it possible to achieve significant improvements in performance with respect to both memory usage and runtime. See the paper for more details.


Sublinear Geometric Algorithms

Sublinear Geometric Algorithms

We start by discussing the main approaches to sublinear algorithm design as well as the basic types of randomized algorithms. We then go into more detail about three randomized Las Vegas algorithms with no extra preprocessing for three geometric problems: a 1D problem (successor search), a 2D problem (polygonal intersection), and a 3D problem (polyhedral intersection).


String Processing in NLP, Biology and Finance

String Processing in NLP, Biology and Finance

An overview of classic string processing algorithms (full-text indexing, sequence alignment, word embeddings) with a few example applications in natural language processing, biology and finance.


Refal

Refal: Recursive Functions' Algorithmic Language

An introduction to Refal, a unique functional programming language based on Markov algorithms and term rewriting (in Russian). See the corresponding GitHub repository for a few code snippets.


Burrows-Wheeler Transform

Burrows-Wheeler Transform

The Burrows-Wheeler Transform is a transformation on words widely used in bioinformatics as a way of building practical full-text indexes for genomes. We give a description of the transform and go into some of its most importart properties.


Tree Automata

Tree Automata

Automata on infinite objects, in particular trees, play an important role in computer science: they allow to model nonterminating systems and are more suitable than words when nondeterminism needs to be modelled. We give an introduction to tree automata and present one of the proofs of a theorem stating that the class of languages recognized by finite-state tree automata is closed under complementation.


Cylindrical Algebraic Decomposition

Cylindrical Algebraic Decomposition

Cylindrical Algebraic Decomposition (CAD) is a general tool for dealing with semilagebraic sets that allows to solve a wide rage of problems: quantifier elimination over the reals, tests for emptiness, finiteness, connectedness of semialgebraic sets, computing connected components of a given semialgebraic set etc. We introduce the concept and then describe the first algorithm for computing the CADs, the Collins algorithm.