IP Paris

Data AI master

A Computer Science master's program of the Institut Polytechnique de Paris


M1 students are invited to follow introductory courses, while M2 students are invited to follow more advanced courses. Unless explicitly specified in the course description, however, all courses are open to both M1 and M2 students.

Group "Softskills"

Softskills seminar (M2 only) (TPT-DATAAI941)

Go to course webpage. This course is taught by Fabian Suchanek.

Students learn how to give good presentations, and present scientific papers. This is an obligatory course of the M2 DataAI.

Prerequisites: 0

Program:

Group "Ethics"

AI Ethics (TPT-DATAAI951)

This course is taught by Maxwell Winston, Sophie Chabridon, Ada Diaconescu, Fabian Suchanek.

Algorithmic fairness, intro to the AI Act, ethical issues/fundamental rights, explainability, privacy and security. This course is scheduled on Tuesday afternoon in P2 (between 21/11/23 and 30/01/24, with no class on 19/12/23, 26/12/23 and 02/01/24).

Prerequisites: 0

Program:

Group "Data AI basics"

Data AI basics (TPT-DATAAI900)

Go to course webpage. This course is taught by Tiphaine Viard, Louis Jachiet, Nils Holzenberger, Jean-Louis Dessalles.

This is an introductory course to many subjects in math/CS. There will be no exams for this course but also no ECTS.

Prerequisites: 0

Program:

Group "Logics"

Neuro-Symbolic Artificial Intelligence (TPT-IA206)

This course is taught by Nils Holzenberger.

Topics will include:
- Prolog (recursivity, backtracking, unification) and DeepProbLog
- Formal Logic (propositions, predicates, proof by refutation)
- Natural language processing (DCG, parsing through unification)
- Symbolic machine learning (symbolic induction, complexity minimum)
- Knowledge representation (description logics, ontologies, semantic Web)
- Probabilistic programming, binary and sentential decision diagrams, Boolean formulas

Prerequisites: 0

Program:

Logics and Symbolic AI (TPT-IA301)

Go to course webpage. This course is taught by Isabelle Bloch.

This course aims at providing the bases of symbolic AI, along with a few selected advanced topics. It includes courses on formal logics, ontologies, symbolic learning, typical AI topics such as revision, merging, etc., with illustrations on preference modeling and image understanding.

Prerequisites: Basic knowledge in algebra

Program:

Group "Databases"

Databases (TPT-SD202)

This course is taught by Mehwish Alam.

Relational databases: ER modeling, SQL, query execution, query optimization, schema refinement, application programming

Prerequisites: 0

Program:

Database management systems (X-INF553)

Go to course webpage. This course is taught by Ioana Manolescu.

Relational databases: ER modeling, SQL, query execution, query optimization, schema refinement, application programming

Prerequisites: 0

Program:

Group "Big Data Systems"

Architectures for Big Data (TPT-DATAAI921)

Go to course webpage. This course is taught by Ioana Manolescu.

The course presents the main architectures used for data management at a very large scale. Going beyond a single-site database, the course will present distributed data management architectures: mediator systems, peer-to-peer systems, structured data management in massively parallel settings such as MapReduce and Spark. Finally, the course presents the main architectures and platforms for cloud-based data management on platforms such as Amazon Web Services, Azure and Google Cloud.

Prerequisites: A previous introduction to the SQL query language is necessary. More advanced knowledge of query optimization would be a plus although it is not required. This course is intended for M2.

Program:

Big Data Infrastructures (TSP-CSC5003-1)

Go to course webpage. This course is taught by Julien Romero, Amel Bouzeghoub.

CSC 5003-1 (Big Data Infrastructure) is a third-year course in an engineering school (Master 2 level) given at Télécom SudParis. At the end of this course, a student will be able to set up a big data infrastructure using tools from the Hadoop ecosystem. In detail, a student will know how to:
1. program in functional style using Scala
2. use the MapReduce framework to parallelize computations
3. explore and manipulate the Hadoop Distributed File System
4. process a data stream using Kafka and Spark Streaming
5. choose the right tools from the Hadoop ecosystem to solve a given problem

Prerequisites: Programming; Prior knowledge of functional programming can help

Program:

Systems for Big Data (X-INF583)

Go to course webpage. This course is taught by Yanlei Diao.

INF641

Prerequisites: The main background knowledge required for INF583 includes the relational operators, SQL, storage, and transaction processing. We also expect students to be aware of notions such as query plans and query optimisation, even though only a very small part of INF583 builds on them. These prerequisites can be fulfilled by a typical database class such as INF553 (more in depth) or SD202 (more lightweight).

Program:

Group "Machine Learning"

Machine Learning: Shallow & Deep Learning (TPT-DATAAI901)

This course is taught by Mounim El Yacoubi.

Statistical Data Analysis (PCA, LDA), Unsupervised Learning, Clustering, Supervised Learning, Neural Networks / Deep Learning, Hidden Markov Models (HMM), Restricted Boltzmann Machines, Support Vector Machines (SVM), Decision Trees, Random Forest, Boosting, Transfer Learning, Deep Reinforcement Learning, Introduction to LLM/ChatGPT

Prerequisites: Basics of Probability and Statistics; Basics of Algebra and Calculus

Program:

Explainable and Trustworthy AI (TPT-DATAAI902)

This course is taught by Mounim A. El Yacoubi.

Explainability and Interpretability of Machine / Deep Learning Models; Explanation Methods of Machine Learning models as black boxes: LIME, Shapley Values, SHAP, Counterfactual Explanations; Interpretation of Neural Networks as white boxes: Sensitivity Analysis, Layer-wise Relevance Propagation (LRP), The RETAIN architecture; Adversarial Learning, Targeted and Non-Targeted Adversarial Attacks, Defense against Adversarial Attacks; Verification of the Robustness of Neural Networks.

Prerequisites: Knowledge of the basic concepts of Machine Learning and Deep Learning

Program:

Basics of image processing and analysis (TPT-DATAAI965)

Go to course webpage. This course is taught by Pietro Gori.

This is an introductory course to image processing intended for students who have a solid background in linear algebra, geometry, real and complex analysis and programming in Python. Throughout the course students will learn about:
- Specificities of image data and image acquisition
- Image sampling and quantization
- Filtering and morphological image processing
- Noise reduction and restoration
- Segmentation
- Image transformation and registration

Prerequisites: A solid background in linear algebra, geometry, real and complex analysis and programming in Python

Program:

Deep Learning for Computer Vision (TPT-DATAAI968)

This course is taught by Stéphane Lathuilière, Jhony Giraldo.

The course focuses on various advanced topics in the field. Students will delve into areas such as few-shot learning and domain adaptation, exploring techniques that enable models to learn from limited labeled data and adapt to new domains. The course also covers advanced methods for image and video generation and editing, allowing students to gain insights into cutting-edge approaches for creating and manipulating visual content. Classical vision tasks, including object detection and human pose estimation, are extensively studied, providing students with a strong foundation in fundamental computer vision techniques. Additionally, the course delves into video understanding, equipping students with the necessary tools to extract meaningful information from video data. Lastly, students will explore the integration of vision with other sensors, delving into the fusion of visual information with data from other sensing modalities, opening up new possibilities for perception and analysis. The course will be composed of five lectures and two practical sessions.

Prerequisites: Knowledge of the basic concepts of Machine Learning and Deep Learning

Program:

Representation Learning for Computer Vision and Medical Imaging (TPT-DATAAI969)

This course is taught by Pietro Gori (TP), Loic le Folgoc (TP).

Good and expressive data representations can improve the accuracy of machine learning models and ease interpretability and transfer. For vision tasks, handcrafting good data representations, a.k.a. feature engineering, was traditionally hard. Deep Learning has changed this paradigm by making it possible to automatically discover good representations from data. This is known as representation learning. The objective of this course is to provide an introduction to representation learning in computer vision and medical imaging applications. Standard approaches to representation learning exploit the inductive bias of Convolutional Neural Networks and the supervision of labeled data. Since labeled data is scarce compared to raw data, recent work has turned to unsupervised and self-supervised techniques to boost the expressive power of representations. Furthermore, alternatives to CNNs inspired by advances in NLP have been proposed, such as vision transformers. In a different development, causal representations, which leverage causal relationships in the data, make it possible to answer additional queries (causal effects, interventions, counterfactuals) compared to standard statistical models. All of these developments will be covered in the course. Each lecture is followed by a practical lab on the corresponding content where students learn to implement these techniques using the PyTorch framework.

Prerequisites: Basic concepts of Deep Learning, Linear Algebra, Calculus, Probability, Statistics, Image processing, Python

Program:

Probabilistic Models and Machine Learning (TPT-IA304)

This course is taught by Wojciech Pieczynski.

Bayesian networks, hidden Markov models, theory of evidence, segmentation, filtering, smoothing. Examples of applications to image, finance, digital communications. REMOVED FOR 2023 because the course is taught in French.

Prerequisites: 0

Program:

Group "Fully optional courses"

Efficient resolution of logical models (ENSTA-IA303)

This course is taught by Alexandre Chapoutot.

In AI or in Software Verification, logical formulas play a crucial role in representing knowledge or modeling a system. This course will present the main algorithms used to check the satisfiability or the non-satisfiability of formulas of Boolean logic. Extensions of these algorithms to deal with more expressive logics will also be presented. Two applications in AI (Logical Knowledge-based agent) and in Software Verification will be presented to illustrate the use of logical formulation. For example, tasks such as path planning, task planning or bounded model checking will be used to illustrate theoretical notions and practical implementation of algorithms.

Learning for robotics (ENSTA-IA305)

This course is taught by S.M. Nguyen.

Learning methods used in robotics and applications to human / robot interaction, learning by demonstration or autonomous learning: imitation learning, reinforcement learning, human motion analysis

Prerequisites: 0

Program:

Decision Procedures for Artificial Intelligence (ENSTA-INF656L)

Go to course webpage. This course is taught by Alexandre Chapoutot, Sergio Mover.

Reasoning automatically about logical formulas is crucial in solving problems in Artificial Intelligence (e.g., path and task planning) and Formal Methods (e.g., software verification). This course will present the modern, efficient algorithms (decision procedures) used to check the satisfiability (SAT) of formulas in propositional logics (e.g., Conflict Driven Clause Learning, CDCL) and the extensions of these algorithms to check more expressive first-order-logic formulas (Satisfiability Modulo Theory, SMT). The course will also present how logical modeling and satisfiability can solve problems in AI (Logical Knowledge-based agent) and formal methods (software verification). In detail, the tutorial will cover problems such as path planning, task planning, and bounded model checking to illustrate theoretical notions and practical implementation of algorithms.

Prerequisites: Linear algebra, Python programming

Program:

Computer vision (ENSTA- {IA323, Rob313})

Go to course webpage. This course is taught by Antoine Manzanera (ENSTA), Gianni Franchi (ENSTA), Marwane HARIAT (ENSTA), Rémi KAZMIERCZAK (ENSTA).

In today's digital age, computer vision plays a crucial role in numerous applications, ranging from image and video recognition to autonomous vehicles and augmented reality. This course aims to equip students with the knowledge and skills required to tackle complex visual tasks using cutting-edge techniques and models. The Advanced Computer Vision course is designed to provide students with a comprehensive understanding of state-of-the-art techniques and methodologies in computer vision. Through a combination of theoretical concepts and hands-on practical assignments, students will gain expertise in deep neural networks, generative models, uncertainty modeling, tracking, semi-supervised learning, and self-supervised learning. Throughout the course, students will work on hands-on projects and assignments to reinforce their understanding of the concepts covered. By the end of the course, students will be equipped with the skills to design, implement, and deploy advanced computer vision systems using deep neural networks.

Prerequisites: Linear Algebra, Differential Calculus, Probability and Statistics, Signal Processing

Program:

Emergence in Complex Systems (TPT-AthensTPT-09)

This course is taught by J.-L. Dessalles.

The course will cover several social phenomena, including: evolution theory, collective decision, the hawk-dove dilemma, cooperation, emergence of segregationism, altruism, the "tragedy of the commons", the "green-beard" effect, social coordination, suicide "for the group", honest communication, charity and competitive helping. Several theoretical models will be studied, including preferential attachment, kin selection, the Prisoner’s dilemma, the handicap principle, social signaling.

Prerequisites: Some basic knowledge of Python & object-oriented programming.

Program:

Image mining and content-based retrieval (TPT-DATAAI903)

Go to course webpage. This course is taught by Antoine Manzanera (ENSTA), Gianni Franchi (ENSTA), Flora Weissgerber (Onera), Henri Maître (TP).

This course deals with visual data (images and videos), and covers image representation, processing and indexing, for content-based retrieval purposes.
- It starts from image data and their different models, from mathematical and algorithmic viewpoints, by exploring the different models: frequency-, discrete-, or set-based, differential, or statistical...
- It presents segmentation and feature extraction techniques, i.e. how to reduce the representation support, and what local and global representations can be used to describe the image content.
- Practical Work #1 deals with salient point detection, description and matching.
- Approximately one third of the course is dedicated to classification, detection and image recognition techniques based on machine learning, using CNN (one session) and other unsupervised and supervised techniques (one session).
- Practical Work #2 deals with image classification using CNN.
- One session is dedicated to a significant use case: satellite image mining.
- One session is on video analysis and the importance of motion in video mining, with an emphasis on object tracking methods.
- Practical Work #3 is on object tracking in videos.
The practical works use Python, OpenCV and PyTorch. The evaluation is based on the 3 reports on the practical works (Weight 0.6), and a theoretical exam (Weight 0.4). This course is scheduled on Wednesday of P2 (November-December).

Prerequisites: Linear Algebra, Differential Calculus, Probability and Statistics, Signal Processing

Program:

Self-Organising Multi-Agent Systems (TPT-DATAAI961)

Go to course webpage. This course is taught by Ada Diaconescu.

The course provides an introduction to decentralised / collective intelligence, including concepts of: system self-adaptation, self-organisation, autonomic control, multi-scale feedbacks and agent-based modelling (ABM). Evaluation will rely on a practical project developed using a multi-agent simulation platform.

Prerequisites: Good programming skills (any imperative language, like Prolog, C, C++, Java, etc.); notions that may help: control theory (also including robotics, automata, autonomous systems), AI (both symbolic and data-oriented); system modelling.

Program:

Knowledge Base Construction (TPT-DATAAI964)

Go to course webpage. This course is taught by Fabian Suchanek.

This course will discuss the automated construction of large knowledge bases. For this, we will cover the basics of knowledge representation, natural language processing (POS tagging, dependency parsing), information extraction (fact extraction, named entity recognition), and rule mining and disambiguation. We will see both classical/symbolic methods and deep learning methods for these tasks. https://suchanek.name/work/teaching/kbc-2021/

Prerequisites: 0

Program:

Programming with GPU for Deep Learning (TPT-IA307)

This course is taught by Elisabeth Brunet, Goran Frehse.

This course gives an introduction to GPU programming techniques used for deep learning. Starting from the ground up with basic matrix operations, students will develop code to implement classifiers based on gradient descent. Programs are written in C and use the CUDA API from Nvidia to access the GPU.

Prerequisites: 0

Program:

Natural Language Processing (was Machine Learning for Text Mining) (TPT-IA312)

This course is taught by Chloé Clavel, Matthieu Labeau.

Text mining is an evolving and challenging domain. For example, much effort has recently been dedicated to the development of methods able to analyze opinion data available on the social Web. The first objective of this course is to tackle the different methods of language processing and machine learning underlying text and opinion mining. During this course, the students will acquire theoretical and technical skills in advanced machine learning methods and natural language processing. This course is designed for students who will be attending classes and labs. The techniques and concepts that will be studied include:
- natural language pre-processing: tokenization, part-of-speech tagging, document representation and word embedding techniques
- natural language resources: lexicons, WordNet and FrameNet
- text clustering and text categorization: advanced machine learning methods such as deep learning, hidden Markov models, etc.

Prerequisites: Basic knowledge in machine learning

Program:

Machine Learning in High Dimension (TPT-IA317)

Go to course webpage. This course is taught by Thomas Bonald.

This course presents various techniques for learning from high-dimensional data: dimensionality reduction, locality-sensitive hashing, nearest neighbors, Bayesian algorithms, ensemble methods, sparse regression, anomaly detection.

Prerequisites: Probability theory, linear algebra, Python programming

Program:

Reinforcement Learning (TPT-IA318)

Go to course webpage. This course is taught by Thomas BONALD.

This is an introduction to reinforcement learning: Markov Decision Process, Bellman's equation, bandit algorithms, Q-learning, TD-learning, Monte-Carlo tree search. Applications to games and to recommender systems will be presented.

Prerequisites: Probability theory, Python programming

Program:

Sequence-to-Sequence Models for NLP and Speech Processing (TPT-IA327)

This course is taught by Nils Holzenberger, Mehwish Alam.

Natural language processing has given rise to innumerable industrial applications. While many new tasks have emerged in NLP and speech processing over the last decades, methods to solve them have increasingly converged towards a unified modeling paradigm. In this course, we will use sequence-to-sequence modeling to delve into state-of-the-art statistical machine learning methods (convolutional neural networks, recurrent neural networks, attention, transformers) and apply them to major NLP and speech processing tasks (language modeling, machine translation, speech recognition, information extraction). Students should expect to get an in-depth understanding of these methods, through theoretical analysis and hands-on lab sessions. Grading will involve a project, to be carried out over the course of the class. Topics to be covered:
1. Recurrent Neural Networks
2. Hidden Markov models
3. Attention Mechanisms
4. Transformers
5. Convolutional Neural Networks
6. Language Modeling

Program:

Mining of Large Datasets (TPT-SD201)

Go to course webpage. This course is taught by Mauro Sozio.

The course will provide an introduction to data mining and will cover the following topics: clustering, decision trees, ranking, association rules, recommendation systems, introduction to MapReduce and Spark. Students will work on a project where they will implement some of the previously mentioned algorithms in Python or in Spark.

Prerequisites: 0

Program:

Graph Learning (TPT-SD212)

Go to course webpage. This course is taught by Thomas Bonald.

The focus of this course is on the analysis of large graphs. You will learn how to represent graphs efficiently as sparse matrices. You will apply some key algorithms to real graphs, for clustering, ranking, classifying and embedding nodes, including graph neural networks.

Prerequisites: Basics on graphs, probability theory, linear algebra, Python programming.

Program:

Semantic Networks (TSP-CSC5003-2)

This course is taught by Amel Bouzeghoub.

Semantic networks, logic (predicate logic, description logic, ...), reasoning, ontologies, discovering Semantic Web languages (RDF, RDFS, OWL, SPARQL), practical sessions (Protégé, Jena).

Prerequisites: Java Programming

Program:

Data Visualization (X-INF552)

Go to course webpage. This course is taught by Emmanuel Pietriga (INRIA).

This course first gives an overview of the field of data visualization. It then discusses fundamental principles of human visual perception, focusing on how they help inform the design of visualizations. The following sessions focus on visualization techniques for specific data structures, and discuss them in depth from both design and implementation perspectives, including: multi-variate data, hierarchical structures, networks, time-series, statistical data and geographical data. All exercises are based on Web technologies, including the D3 software library (Data-Driven Documents) and the Vega-lite interactive graphics grammar. While positioned at different levels of abstraction, both enable developers to create a wide range of interactive, Web-based visualizations that run on a variety of platforms, ranging from desktop workstations to mobile devices.

Prerequisites: Basic knowledge of Web programming tech is a plus but not a requirement

Program:

Machine & Deep Learning Introduction (X-INF554)

Go to course webpage. This course is taught by M. Vazirgiannis.

The Machine Learning Pipeline; Data Preprocessing and Exploration; Feature Selection/Engineering & Dimensionality Reduction; Supervised Learning, Deep and Reinforcement Learning, Unsupervised Learning. This course will *probably* be scheduled on Monday on the X calendar (spanning across P1 and P2).

Prerequisites: 0

Program:

Topological Data Analysis (X-INF556)

Go to course webpage. This course is taught by Steve Oudot.

Objectives: Topological Data Analysis is an emerging trend in exploratory data analysis and data mining. It has attracted growing interest and achieved some notable successes in recent years. The idea is to use topological tools to tackle challenging data sets, in particular data sets for which the observations lie on or close to non-trivial geometric structures that can fool classical techniques. Topological methods are indeed able to extract useful information about these geometric structures from the data, and to exploit that information to enhance the analysis pipeline. The objective of this course is to familiarize the students with this new topic lying at the confluence of pure mathematics, applied mathematics, and computer science. Emphasis is put on the methods and on their theoretical guarantees. Meanwhile, the lab sessions focus on challenging data sets, primarily multimedia data sets such as collections of images or 3D shapes.

Content: The course is divided into nine lectures and nine exercise or lab sessions. These cover the main mathematical concepts and algorithmic tools involved in topological data analysis. The topics covered include: dimensionality reduction and its limitations, hierarchical versus density-based clustering, simplicial and singular homology, persistence theory, topological inference for data exploration, topological signatures for data classification, Reeb graphs and Mapper.

Suggested readings:
- Gunnar Carlsson, Topology and Data, Bulletin of the American Mathematical Society
- Herbert Edelsbrunner and John Harer, Computational Topology: An Introduction, AMS Press

Prerequisites: 0

Program:

Advanced Machine Learning and Autonomous Agents (X-INF581)

This course is taught by Jesse Read.

Probabilistic graphical models (Bayesian networks), Probabilistic inference, Deep-learning architectures, Sequential prediction and decision making, Bandits, Reinforcement learning (Q-Learning, Deep Q-Learning, Policy-Gradient Methods, Actor-Critic Methods). Although these topics are diverse and extensive, this course is developed around a common thread connecting them together. This course will probably be scheduled on Wednesday on the X calendar between January and March.

Prerequisites: This course builds on any course(s) providing introductory concepts of Machine Learning (regression, classification, unsupervised learning, neural networks) and scientific programming in Python.

Program:

Advanced Deep Learning (X-INF581A)

Go to course webpage. This course is taught by Vicky Kalogeiton, Johannes Lutzeyer, Michalis Vazirgiannis (LIX).

The primary goal of this course is to introduce students to advanced principles of deep learning, including mathematical foundations, architecture design, and practical applications. This course is particularly relevant given the current state of the job market, where deep learning skills are in high demand in many industries, including tech, finance, healthcare, and entertainment. ECTS: 5. Language: English.

Prerequisites: Basic concepts of Deep Learning

Program:

Text Mining and NLP (X-INF582)

Go to course webpage. This course is taught by M. Vazirgiannis, Buscaldi.

Text preprocessing and Information Retrieval, graph-of-words, keyword extraction, Text categorization, topic modeling, supervised document classification, Word and document embeddings, unsupervised document classification with the Word Mover's Distance, Advanced deep learning architectures for NLP seq to seq tasks (HAN, ELMO, BERT/Transformer...), Lexical statistics and n-gram models, Sequence Labeling: Named Entity Recognition, POS-tagging, Introduction to Parsing, elements of Machine Translation, Semantics - Knowledge Bases, Relation Extraction

Prerequisites: 0

Program:

Introduction to the verification of neural networks (X-INF641)

This course is taught by Sylvie Putot, Eric Goubault.

Neural networks are widely used in numerous applications including safety-critical ones such as control and planning for autonomous systems. A central question is how to verify that they are correct with respect to some specification. Beyond correctness or robustness, we are also interested in questions such as explainability and fairness, that can in turn be specified as formal verification problems. In this course, we will see how formal methods approaches introduced in the context of program verification can be leveraged to address the verification of neural networks. BEWARE, despite being hosted at X, this course is only 24h so 2.5 ECTS!

Prerequisites: 0

Program:

Navigation for autonomous systems (X-INF657G)

Go to course webpage. This course is taught by D. Filliat.

We will give an overview of algorithmic aspects of Mobile Robotics and autonomous vehicles. We will cover the most common robotics platforms and sensors (vision, 3D ultrasound, accelerometers, odometry) and the various navigation components: control; obstacle avoidance; localization; mapping (SLAM) and planning, along with filtering (Kalman filter, particle filtering, etc.) and optimisation techniques used in these areas. BEWARE, despite being hosted at X, this course is only 24h so 2.5 ECTS!

Prerequisites: 0

Program: