IP Paris

Data AI master

A computer science master of the Institut Polytechnique de Paris

Group "Softskills"

Softskills seminar (M2 only) (TPT-DATAAI941)

Go to course webpage. This course is taught by Fabian Suchanek.

Students learn how to give good presentations, and present scientific papers. This is an obligatory course of the M2 DataAI.


Group "Ethics"

AI Ethics (TPT-DATAAI951)

This course is taught by Maxwell Winston, Sophie Chabridon, Ada Diaconescu, Fabian Suchanek.

Algorithmic fairness, ethical issues, privacy and security. This course is scheduled on Tuesday afternoons in P2 (between 22/11/21 and 11/02/22).


Group "Data AI basics"

Data AI basics (TPT-DATAAI900)

Go to course webpage. This course is taught by Angelos Anadiotis, Tiphaine Viard, Louis Jachiet, Fabian Suchanek, Jean-Louis Dessalles.

This is an introductory course to many subjects in math/CS. There are no exams for this course, but it also carries no ECTS credits.


Group "Machine Learning"

Machine Learning: Shallow & Deep Learning (TPT-DATAAI901)

This course is taught by Mounim El Yacoubi.

Deep learning/neural networks, hidden Markov models (HMM), Restricted Boltzmann Machines, Unsupervised Learning, Supervised Learning, data analysis (PCA, LDA), Support Vector Machines (SVM), Decision Trees, Transfer Learning, Adversarial Models, Deep Reinforcement Learning

If you take this course you cannot take Machine & Deep Learning Introduction


Machine & Deep Learning Introduction (X-INF554)

This course is taught by M. Vazirgiannis.

The machine learning pipeline, data preprocessing and exploration, feature selection/engineering and dimensionality reduction, supervised learning, deep and reinforcement learning, unsupervised learning. This course will *probably* be scheduled on Mondays on the X calendar (spanning across P1 and P2).

If you take this course you cannot take Machine Learning: Shallow & Deep Learning


Group "Logics"

Logics and Symbolic AI (TPT-IA301)

Go to course webpage. This course is taught by Isabelle Bloch.

This course aims at providing the bases of symbolic AI, along with a few selected advanced topics. It includes courses on formal logics, ontologies, symbolic learning, typical AI topics such as revision, merging, etc., with illustrations on preference modeling and image understanding.

Prerequisites: Basic knowledge in algebra

If you take this course you cannot take Logic & Knowledge representation


Logic & Knowledge representation (TPT-SD206)

Go to course webpage. This course is taught by Jean-Louis Dessalles.

Prolog (recursion, backtracking, unification); formal logic (propositions, predicates, proof by refutation); natural language processing (DCG, parsing through unification); symbolic machine learning (symbolic induction, complexity minimum); knowledge representation; problem solving

If you take this course you cannot take Logics and Symbolic AI


Group "Big Data Systems"

Systems for Big Data (X-INF583)

Go to course webpage. This course is taught by Angelos Anadiotis / Yanlei Diao.

The course follows X-INF553 and covers concurrency control, parallel and distributed data processing, storage layouts, and execution models. It takes a systems-oriented approach and first explains the effect of different parts of the computer architecture on query execution. It then moves to the fundamentals of parallelisation, including threads, processes and low-level synchronisation, and on to the coordination of transactions in both scale-up and scale-out settings. After synchronisation, the course focuses on data analytics, starting from low-level issues such as memory layouts and their interaction with query operators, and finally moves towards well-established scale-out platforms such as Hadoop and Spark. This course will be scheduled on Friday afternoons on X's calendar from January to March.

Prerequisites: The main background knowledge required for INF583 includes relational operators, SQL, storage, and transaction processing. Students are also expected to be familiar with notions such as query plans and query optimisation, even though only a very small part of INF583 builds on them. These prerequisites can be fulfilled by a typical database class such as INF553 (more in depth) or SD202 (more lightweight).

If you take this course you cannot take Big data infrastructures


Architectures for Big Data (TPT-DATAAI921)

This course is taught by Ioana Manolescu.

The course presents the main architectures used for data management at a very large scale. Going beyond a single-site database, the course will present distributed data management architectures: mediator systems, peer-to-peer systems, and structured data management in massively parallel settings such as MapReduce and Spark. Finally, the course presents the main architectures and platforms for cloud-based data management, such as Amazon Web Services, Azure and Google Cloud.

Prerequisites: A previous introduction to the SQL query language is necessary. More advanced knowledge of query optimization would be a plus although it is not required. This course is intended for M2.

If you take this course you cannot take Big data infrastructures


Group "Databases"

Database management systems (X-INF553)

Go to course webpage. This course is taught by Ioana Manolescu.

Relational databases: ER modeling, SQL, query execution, query optimization, schema refinement, application programming

If you take this course you cannot take Databases or Databases (slot D)


Databases (TPT-SD202)

This course is taught by Louis Jachiet, Antoine Amarilli.

Relational databases: ER modeling, SQL, query execution, query optimization, schema refinement, application programming

If you take this course you cannot take Database management systems or Databases (slot D)


Databases (slot D) (TPT-SD202D)

This course is taught by Louis Jachiet, Antoine Amarilli.

Relational databases: ER modeling, SQL, query execution, query optimization, schema refinement, application programming. Note that this is exactly the same content as TPT-SD202, just at a different date: P4 (mid-April to mid-June) on Tuesday mornings.

If you take this course you cannot take Database management systems or Databases


Group "Fully optional courses"

Advanced Machine Learning and Autonomous Agents (X-INF581-1)

This course is taught by Jesse Read.

This course selects a number of advanced topics to explore in machine learning and autonomous agents, in particular: probabilistic graphical models (Bayesian networks), multi-output and structured-output prediction problems, deep-learning architectures, methods of search and optimization (beam search, Monte Carlo methods), and sequential prediction and decision making (HMMs, Sequential Monte Carlo, Bayesian filtering, MDPs, ...), leading on to a strong component of reinforcement learning (Q-Learning, Deep Q-Learning, Policy-Gradient Methods, Actor-Critic Methods). Although these topics are diverse and extensive, this course is developed around a common thread connecting them together. The first part of the course studies and extends a selection of advanced concepts and topics in machine learning. The second part moves towards reinforcement learning and autonomous agents in general. This course will probably be scheduled on Wednesdays on the X calendar between January and March.

Prerequisites: This course builds on any course(s) providing introductory concepts of Machine Learning (regression, classification, unsupervised learning, neural networks), and scientific programming in Python.


Machine Learning in High Dimension (TPT-IA317)

Go to course webpage. This course is taught by Thomas Bonald.

Sparse data, dimensionality reduction, sketching and projection techniques, nearest-neighbor methods


Reinforcement Learning (TPT-IA318)

Go to course webpage. This course is taught by Thomas Bonald, Claire Vernade, Till Wolhfarth.

Reinforcement learning: bandit algorithms, Q-learning, deep Q-learning, Monte-Carlo tree search.


Probabilistic Models and Machine Learning (TPT-IA304)

This course is taught by Wojciech Pieczynski.

Bayes networks, hidden Markov models, theory of evidence, segmentation, filtering, smoothing. Examples of applications to image, finance, digital communications.


Programming with GPU for Deep Learning (TPT-IA307)

This course is taught by Elisabeth Brunet, Goran Frehse.

This course gives an introduction to GPU programming techniques used for deep learning. Starting from the ground up with basic matrix operations, students will develop code to implement classifiers based on gradient descent. Programs are written in C and use the CUDA API from Nvidia to access the GPU.


Algorithmic information and artificial intelligence (TPT-IA325)

Go to course webpage. This course is taught by J.-L. Dessalles.

Algorithmic information is a great conceptual tool to understand Artificial Intelligence. It describes what AI actually does, and it can help make optimal choices. Machine learning, decision making, randomness, probability, anomaly, analogy, interest, even the very act of understanding: all these things make sense in the light of Algorithmic Information Theory. Content: Kolmogorov complexity applied to ML, to AI problems (meaning similarity, relevant descriptions), to maths (algorithmic probability, randomness, Gödel's theorem), and to cognitive science (relevance, interest, aesthetics...). For 2022, this course is scheduled on Thursday mornings in P3 (February and March).

Prerequisites: Basic knowledge of Python. Some knowledge of ML (e.g. k-means). Some math background (log, rational numbers, series...) and basic concepts of computer science (binary coding, Turing machines...).


Multimodal Dialogue (TPT-DATAAI966)

This course is taught by C. Clavel, G. Varni.

Introduction to multimodal human-agent dialogue: emotion recognition, gesture recognition, speech synthesis, multimodal dialogue systems, interaction analysis. This course provides students with foundational conceptual knowledge, methodologies, and tools for designing, implementing, and evaluating intelligent machines able to engage users in a multimodal dialogue. This requires students to know and apply computational methods for capturing, representing, and automatically analyzing the behavior of the users, and for generating the behavior of the machines. At the end of the course, the student will: understand the principles of multimodal communication and its open challenges; know and understand the motivations for using multimodality for designing intelligent machines; know and understand computational methods for managing the dialogue through the following communication modalities: speech, movement, and facial expressions; know and understand the foundations of conversational analysis.


Emergence in Complex Systems (TPT-AthensTPT-09)

Go to course webpage. This course is taught by J.-L. Dessalles.

The course will cover several social phenomena, including: evolution theory, collective decision, the hawk-dove dilemma, cooperation, emergence of segregationism, altruism, the "tragedy of the commons", the "green-beard" effect, social coordination, suicide "for the group", honest communication, charity and competitive helping. Several theoretical models will be studied, including preferential attachment, kin selection, the Prisoner’s dilemma, the handicap principle, social signaling.

Prerequisites: Some basic knowledge of Python & object-oriented programming.


Self-Organising Multi-Agent Systems (TPT-DATAAI961)

This course is taught by Ada Diaconescu.

Self-adaptation, self-organisation, autonomic control, multi-agent systems: architectures, design patterns, service-oriented platforms, practical project based on smart-home simulator

Prerequisites: Good programming skills (any imperative language, like Prolog, C, C++, Java, etc.); notions that may help: control theory (also including robotics, automata, autonomous systems), AI (both symbolic and data-oriented); system modelling.


Data Stream Mining (TPT-DATAAI962)

This course is taught by Albert Bifet & Jesse Read.

Real-time machine learning for data streams, or data stream mining, relies on and develops new incremental algorithms that process streams under strict resource limitations. This course focuses on, and extends, the methods implemented in open-source tools such as MOA and River. This course will *probably* be scheduled on Tuesday afternoons of P3.


Mining of Large Datasets (TPT-SD201)

Go to course webpage. This course is taught by Mauro Sozio, Tiphaine Viard.

The course will provide an introduction to data mining and will cover the following topics: clustering, decision trees, ranking, association rules, recommendation systems, introduction to MapReduce and Spark. Students will work on a project where they will implement some of the previously mentioned algorithms in Python or in Spark.


Graph Mining (TPT-SD212)

Go to course webpage. This course is taught by Thomas Bonald.

Graph analysis, sparse data, clustering, PageRank, classification, graph embedding, spectral methods, diffusion methods


Navigation for autonomous systems (2X-INF657G)

Go to course webpage. This course is taught by D. Filliat.

We will give an overview of algorithmic aspects of mobile robotics and autonomous vehicles. We will cover the most common robotics platforms and sensors (vision, 3D ultrasound, accelerometers, odometry) and the various navigation components: control, obstacle avoidance, localization, mapping (SLAM) and planning, along with the filtering (Kalman filter, particle filtering, etc.) and optimisation techniques used in these areas. Dates for 2022: all day on all Mondays of January: 09/01, 16/01, 23/01, 30/01. BEWARE: despite being hosted at X, this course is only 24h, so 2.5 ECTS!

Knowledge Base Construction (TPT-DATAAI964)

Go to course webpage. This course is taught by Fabian Suchanek.

This course will discuss the automated construction of large knowledge bases. For this, we will cover the basics of knowledge representation, natural language processing (POS tagging, dependency parsing), information extraction (fact extraction, named entity recognition), and rule mining and disambiguation. We will see both classical/symbolic methods and deep learning methods for these tasks. https://suchanek.name/work/teaching/kbc-2021/


Text Mining and NLP (X-INF582)

This course is taught by M. Vazirgiannis, Buscaldi.

Text preprocessing and Information Retrieval, graph-of-words, keyword extraction, Text categorization, topic modeling, supervised document classification, Word and document embeddings, unsupervised document classification with the Word Mover's Distance, Advanced deep learning architectures for NLP seq to seq tasks (HAN, ELMO, BERT/Transformer...), Lexical statistics and n-gram models, Sequence Labeling: Named Entity Recognition, POS-tagging, Introduction to Parsing, elements of Machine Translation, Semantics - Knowledge Bases, Relation Extraction


Natural Language Processing (was Machine Learning for Text Mining) (TPT-IA312)

This course is taught by Chloé Clavel.

Text mining is an evolving and challenging domain. For example, much effort has recently been dedicated to the development of methods able to analyze opinion data available on the social Web. The first objective of this course is to tackle the different methods of language processing and machine learning underlying text and opinion mining. During this course, the students will acquire theoretical and technical skills in advanced machine learning methods and natural language processing. This course is designed for students who will be attending classes and labs. The techniques and concepts that will be studied include:
- natural language pre-processing: tokenization, part-of-speech tagging, document representation and word embedding techniques
- natural language resources: lexicons, WordNet and FrameNet
- text clustering and text categorization: advanced machine learning methods such as deep learning, hidden Markov models, etc.

Prerequisites: Basic knowledge in machine learning


Cognitive approach to NLP (TPT-SD213)

Go to course webpage. This course is taught by Jean-Louis Dessalles.

This course explores possible future avenues for Natural Language Processing that are inspired by human cognitive processes. Content: basic parsing methods; knowledge representation, meaning representation, procedural semantics, aspect; relevance, argumentation. THIS COURSE IS *NOT* FOR STUDENTS WHO WANT TO ACQUIRE STATE-OF-THE-ART OPERATIONAL SKILLS.

Prerequisites: Best if you have taken SD206. Otherwise: some knowledge of logic programming, formal grammars, parsing.


Data Visualization (X-INF552)

Go to course webpage. This course is taught by Emmanuel Pietriga (INRIA).

This course first gives an overview of the field of data visualization. It then discusses fundamental principles of human visual perception, focusing on how they help inform the design of visualizations. The following sessions focus on visualization techniques for specific data structures, and discuss them in depth from both design and implementation perspectives, including: multi-variate data, hierarchical structures, networks, time-series, statistical data and geographical data. All exercises are based on Web technologies, including the D3 software library (Data-Driven Documents) and the Vega-lite interactive graphics grammar. While positioned at different levels of abstraction, both enable developers to create a wide range of interactive, Web-based visualizations that run on a variety of platforms, ranging from desktop workstations to mobile devices.


Basics of image processing and analysis (TPT-DATAAI965)

Go to course webpage. This course is taught by Pietro Gori.

This is an introductory course to image processing intended for students who have a solid background in linear algebra, geometry, real and complex analysis and programming in Python. Throughout the course students will learn about:
- Specificities of image data and image acquisition
- Image sampling and quantization
- Filtering and morphological image processing
- Noise reduction and restoration
- Segmentation
- Image transformation and registration

Prerequisites: A solid background in linear algebra, geometry, real and complex analysis and programming in Python


Topological Data Analysis (X-INF556)

Go to course webpage. This course is taught by Steve Oudot.

Objectives: Topological Data Analysis is an emerging trend in exploratory data analysis and data mining. It has attracted growing interest and achieved some notable successes in recent years. The idea is to use topological tools to tackle challenging data sets, in particular data sets for which the observations lie on or close to non-trivial geometric structures that can fool classical techniques. Topological methods are indeed able to extract useful information about these geometric structures from the data, and to exploit that information to enhance the analysis pipeline. The objective of this course is to familiarize the students with this new topic lying at the confluence of pure mathematics, applied mathematics, and computer science. Emphasis is put on the methods and on their theoretical guarantees. Meanwhile, the lab sessions focus on challenging data sets, primarily multimedia data sets such as collections of images or 3D shapes.

Content: The course is divided into nine lectures and nine exercise or lab sessions. These cover the main mathematical concepts and algorithmic tools involved in topological data analysis. The topics covered include: dimensionality reduction and its limitations, hierarchical versus density-based clustering, simplicial and singular homology, persistence theory, topological inference for data exploration, topological signatures for data classification, Reeb graphs and Mapper.

Suggested readings: Gunnar Carlsson, Topology and Data, Bulletin of the American Mathematical Society; Herbert Edelsbrunner and John Harer, Computational Topology: An Introduction, AMS Press.


Image mining and content-based retrieval (TPT-DATAAI903)

Go to course webpage. This course is taught by Antoine Manzanera (ENSTA), David Filliat (ENSTA), Isabelle Bloch (TP), Henri Maître (TP).

This course deals with visual data (images and videos), and covers image representation, processing and indexing for content-based retrieval purposes.
- It starts from image data and their different models, from mathematical and algorithmic viewpoints, by exploring the different models: frequency-, discrete-, or set-based, differential, or statistical...
- It presents segmentation and feature extraction techniques, i.e. how to reduce the representation support, and what local and global representations can be used to describe the image content.
- Practical Work #1 deals with salient point detection, description and matching.
- Approximately one third of the course is dedicated to classification, detection and image recognition techniques based on machine learning, using CNN (one session) and other unsupervised and supervised techniques (one session).
- Practical Work #2 deals with image classification using CNN.
- One session is dedicated to a significant use case: satellite image mining.
- One session is on video analysis and the importance of motion in video mining, with an emphasis on object tracking methods.
- Practical Work #3 is on object tracking in videos.
The practical works use Python, OpenCV and PyTorch. The evaluation is based on the 3 reports on the practical works (weight 0.6) and a theoretical exam (weight 0.4). This course is scheduled on Wednesday mornings of P1; see the schedule at https://perso.ensta-paris.fr/~manzaner/Cours/Masters_ParisSaclay/Image_Mining/


Recent Trends in Deep Learning (TPT-DATAAI902)

This course is taught by Mounim El Yacoubi.

Transfer Learning; Adversarial Learning; Explainability and Interpretability of Neural Networks; Verification of the Robustness of Neural Networks

Prerequisites: Knowledge of Machine Learning and the basic concepts of Deep learning


Data Science in Practice (TPT-DATAAI967)

This course is taught by Mariam Barry.

The course aims to provide a set of fundamental topics aligned with the practical skills required to master data science, with a strong focus on applications (Lab/TP), real-world case studies, and complex data types (streaming or heterogeneous). The first part covers how to build and deploy models from semi-structured and graph data using APIs and GraphQL. The second part covers data pipelines, automation using Docker-based technology and the reproducibility of research experiments. The last part is about implementing a project using AI technologies (models or tools) to address a complete data science case study with a fully connected data pipeline. The course is relevant both for students looking for a data science (or data engineering) position in industry and for future PhD students looking to acquire practical skills useful for their thesis and for research projects with open data.


Introduction to the verification of neural networks (2X-INF641)

This course is taught by Sylvie Putot, Eric Goubault.

Neural networks are widely used in numerous applications, including safety-critical ones such as control and planning for autonomous systems. A central question is how to verify that they are correct with respect to some specification. Beyond correctness or robustness, we are also interested in questions such as explainability and fairness, which can in turn be specified as formal verification problems. In this course, we will see how formal methods introduced in the context of program verification can be leveraged to address the verification of neural networks. Dates for 2022: all day on all Mondays of January: 09/01, 16/01, 23/01, 30/01. BEWARE: despite being hosted at X, this course is only 24h, so 2.5 ECTS!