UE Fundamentals of Data Processing and Distributed Knowledge

Degrees incorporating this pedagocial element :

Description

Modern computing increasingly takes advantage of large amounts of distributed data and knowledge. This is grounded on theoretical principles borrowing to several fields of computer science such as programming languages, data bases, structured documentation, logic and artificial intelligence. The goal of this course is to present some of them, the problems that they solve and those that they uncover. The course considers two perspectives on data and knowledge: interpretation (what they mean), analysis (what they reveal) and processing (how can they be traversed efficiently and transformed safely).

The course offers a semantic perspective on distributed knowledge. Distributed knowledge may come from data sources using different ontologies on the semantic web, autonomous software agents learning knowledge or social robots interacting with different interlocutors. The course adopts a synthetic view on these. It first presents principles of the semantics of knowledge representation (RDF, OWL). Ontology alignments are then introduced to reduce the heterogeneity between distributed knowledge and their exploitation for answering federated queries is presented. A practical way for cooperating agents to evolve their knowledge is cultural knowledge evolution that is then illustrated. Finally, the course defines dynamic epistemic logics as a way to model the communication of knowledge and beliefs.

The course also introduces a perspective on programming language foundations, algorithms and tools for processing structured information, and in particular tree-shaped data. It consists in an introduction to relevant theoretical tools with an application to NoSQL (not only SQL) and XML technologies in particular. Theories and algorithmic toolboxes such as fundamentals of tree automata and tree logics are introduced, with applications to practical problems found for extracting information. Applications include efficient query evaluation, memory-efficient validation of document streams, robust type-safe processing of documents, static analysis of expressive queries, and static type-checking of programs manipulating structured information. The course also aims at presenting challenges, important results, and open issues in the area.

Recommended prerequisite

Ability to logical reasoning (logic knowledge may help) Some knowledge on discrete mathematics (tree, graphs)