Matteo Pirotta

Research Scientist at Facebook AI Research

About Me

I am a research scientist at Facebook AI Research in Paris. Previously, I was a postdoc at INRIA Lille - Nord Europe in the SequeL team for almost two years. Before that, I was a postdoc at Politecnico di Milano. I received my PhD in computer science from Politecnico di Milano, under the supervision of Luca Bascetta and Marcello Restelli.

My research interest is machine learning. In particular, I am interested in reinforcement learning, transfer learning, and online learning.

More details are in my CV.

Contacts:
Email: matteo DOT pirotta AT gmail.com

Github:
https://github.com/teopir

News

  • Jean Tarbouriech will present our recent paper "Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret" at the RL Theory Seminar. 25.06.2021
  • 1 paper accepted at AISTATS'21, 1 at ALT'21, and 2 at ICML'21. 25.06.2021
  • Happy to announce that I will be talking about exploration-exploitation in Deep RL at the RLVS-ANITI virtual school. Here is a draft of the slides. 02.04.2021
  • It's been a long time since the last update! I will be a guest host of the RL Theory Seminar organized by Gergely Neu, Ciara Pike-Burke and Csaba Szepesvari. 01.02.2021
  • Busy February! I spent the last two weeks in Ghana teaching Reinforcement Learning at AIMS AMMI. It has been a wonderful and enriching experience. Please visit the AMMI website to learn more about this very nice initiative. 01.03.2020
  • I gave a tutorial on exploration-exploitation in RL with M. Ghavamzadeh and A. Lazaric at AAAI'20. You can find the material at rlgammazero.github.io. 20.02.2020
  • 1 paper accepted at AAAI'20 and 1 at AISTATS'20. 16.01.2020
  • 2 papers accepted at NeurIPS'19. 10.9.2019
  • I gave a tutorial on policy gradient and actor-critic methods at the Reinforcement Learning Summer School (RLSS) in Lille. It is always nice to be back in Lille and meet the amazing people in SequeL! A very well organized summer school. 15.7.2019
  • Heading to Chicago where, together with Ronan and Alessandro, I will give a tutorial on regret minimization in reinforcement learning at ALT'19. Visit our website for more info: rlgammazero.github.io 20.3.2019
  • I've been invited to give a talk at ARWL'18 in Beijing, China. I will talk about regret minimization (exploration-exploitation) in RL with prior knowledge (slides). I have also been invited to give the same talk at MSRA in Beijing. 6.11.2018
  • Going to NeurIPS! I've received a free registration as one of the "top" reviewers. Moreover, I have one paper accepted at NeurIPS'18. 29.9.2018
  • I'm really happy to announce that I've been selected for a research position (CR) at INRIA - Lille (link). I'm even happier to announce that I will join Facebook AI Research (Paris) in October 2018. 30.7.2018
  • Busy April! I gave several talks on exploration-exploitation in RL: Politecnico di Milano (Apr 03), Facebook Paris (Apr 17) and Google Zurich (Apr 27). 1.6.2018
  • 3 papers accepted at ICML'18.
  • I am organizing the 14th European Workshop on Reinforcement Learning (EWRL 2018).
  • ICML/IJCAI workshop on Prediction and Generative Modeling in Reinforcement Learning (PGMRL).
    Organizers: myself, Roberto Calandra (UC Berkeley), Sergey Levine (UC Berkeley), Martin Riedmiller (DeepMind), Alessandro Lazaric (Facebook).
  • Ronan Fruit and I are developing a Python library for Exploration-Exploitation in Reinforcement Learning.
    It is available on GitHub.
  • I'm going to visit Berlin and I'll give a talk at Amazon (Mar 19, 2018) on Efficient Exploration-Exploitation in RL. 2.3.2018
  • 3 papers accepted at NIPS 2017.
  • I'm going to spend two weeks in California. I will visit UC Berkeley and I'll give a talk on Regret Minimization in MDPs with Options (Jul 14, 2017). I will then spend one week at Stanford University. 1.6.2017

Publications

Preprints
  • Andrea Tirinzoni, Matteo Pirotta, Alessandro Lazaric:
    A Fully Problem-Dependent Regret Lower Bound for Finite-Horizon MDPs. arXiv:2106.13013, 2021. [arXiv]
  • Jean Tarbouriech, Runlong Zhou, Simon S. Du, Matteo Pirotta, Michal Valko, Alessandro Lazaric:
    Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret. arXiv:2104.11186, 2021. [arXiv]
  • Yunchang Yang, Tianhao Wu, Han Zhong, Evrard Garcelon, Matteo Pirotta, Alessandro Lazaric, Liwei Wang, Simon S. Du:
    A Unified Framework for Conservative Exploration. arXiv:2106.11692, 2021. [arXiv]
  • Evrard Garcelon, Vianney Perchet, Matteo Pirotta:
    Homomorphically Encrypted Linear Contextual Bandit. arXiv:2103.09927, 2021. [arXiv]
  • Evrard Garcelon, Vianney Perchet, Ciara Pike-Burke and Matteo Pirotta:
    Local Differentially Private Regret Minimization in Reinforcement Learning. arXiv:2010.07778, 2020. [arXiv]
  • Jean Tarbouriech, Matteo Pirotta, Michal Valko and Alessandro Lazaric:
    A Provably Efficient Sample Collection Strategy for Reinforcement Learning. arXiv:2007.06437, 2020. [arXiv]
  • Yonathan Efroni, Shie Mannor and Matteo Pirotta:
    Exploration-Exploitation in Constrained MDPs. arXiv:2003.02189, 2020. [arXiv]
Conference Papers
  • Matteo Papini, Andrea Tirinzoni, Marcello Restelli, Alessandro Lazaric, Matteo Pirotta:
    Leveraging Good Representations in Linear Contextual Bandits. ICML 2021, Virtual. [arXiv]
  • Omar Darwiche Domingues, Pierre Ménard, Matteo Pirotta, Emilie Kaufmann and Michal Valko:
    Kernel-Based Reinforcement Learning: A Finite-Time Analysis. ICML 2021, Virtual. [arXiv]
  • Jean Tarbouriech, Matteo Pirotta, Michal Valko and Alessandro Lazaric:
    Sample Complexity Bounds for Stochastic Shortest Path with a Generative Model. ALT 2021, Virtual. [paper]
  • Omar Darwiche Domingues, Pierre Ménard, Matteo Pirotta, Emilie Kaufmann and Michal Valko:
    A Kernel-Based Approach to Non-Stationary Reinforcement Learning in Metric Spaces. AISTATS 2021, Virtual. [arXiv]
  • Andrea Tirinzoni, Matteo Pirotta, Marcello Restelli and Alessandro Lazaric:
    An Asymptotically Optimal Primal-Dual Incremental Algorithm for Linear Contextual Bandits. NeurIPS 2020, Virtual. [arXiv]
  • Jean Tarbouriech, Matteo Pirotta, Michal Valko and Alessandro Lazaric:
    Improved Sample Complexity for Incremental Autonomous Exploration in MDPs. NeurIPS 2020, Virtual. [arXiv]
  • Evrard Garcelon, Baptiste Roziere, Laurent Meunier, Jean Tarbouriech, Olivier Teytaud, Alessandro Lazaric and Matteo Pirotta:
    Adversarial Attacks on Linear Contextual Bandits. NeurIPS 2020, Virtual. [arXiv]
  • Jean Tarbouriech, Shubhanshu Shekhar, Matteo Pirotta, Mohammad Ghavamzadeh, Alessandro Lazaric:
    Active Model Estimation in Markov Decision Processes. UAI 2020, Virtual. [arXiv][paper]
  • Evrard Garcelon, Mohammad Ghavamzadeh, Alessandro Lazaric and Matteo Pirotta:
    Conservative Exploration in Reinforcement Learning. AISTATS 2020, Palermo, Italy. [arXiv]
  • Andrea Zanette, David Brandfonbrener, Emma Brunskill, Matteo Pirotta and Alessandro Lazaric:
    Frequentist Regret Bounds for Randomized Least-Squares Value Iteration. AISTATS 2020, Palermo, Italy. [arXiv]
  • Evrard Garcelon, Mohammad Ghavamzadeh, Alessandro Lazaric and Matteo Pirotta:
    Improved Algorithms for Conservative Exploration in Bandits. AAAI 2020, New York, USA. [arXiv]
  • Ronald Ortner, Matteo Pirotta, Alessandro Lazaric, Ronan Fruit and Odalric-Ambrym Maillard:
    Regret Bounds for Learning State Representations in Reinforcement Learning. NeurIPS 2019, Vancouver, Canada.
  • Jian Qian, Ronan Fruit, Matteo Pirotta and Alessandro Lazaric:
    Exploration Bonus for Regret Minimization in Discrete and Continuous Average Reward MDPs. NeurIPS 2019, Vancouver, Canada. [arXiv] [Paper]
  • Ronan Fruit, Matteo Pirotta and Alessandro Lazaric:
    Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes. NeurIPS 2018, Montréal, Canada. [arXiv] [Paper]
  • Ronan Fruit, Matteo Pirotta, Alessandro Lazaric and Ronald Ortner:
    Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning. ICML 2018, Stockholm, Sweden. [arXiv]
  • Matteo Papini, Damiano Binaghi, Giuseppe Canonaco, Matteo Pirotta and Marcello Restelli:
    Stochastic Variance-Reduced Policy Gradient. ICML 2018, Stockholm, Sweden. [arXiv] [Paper]
  • Andrea Tirinzoni, Andrea Sessa, Matteo Pirotta and Marcello Restelli:
    Importance Weighted Transfer of Samples in Reinforcement Learning. ICML 2018, Stockholm, Sweden. [arXiv] [Paper]
  • Davide Di Febbo, Emilia Ambrosini, Matteo Pirotta, Eric Rojas, Marcello Restelli, Alessandra Pedrocchi and Simona Ferrante:
    Does Reinforcement Learning Outperform PID in the Control of FES Induced Elbow Flex-Extension? MeMeA 2018, Rome, Italy.
  • Ronan Fruit, Matteo Pirotta, Alessandro Lazaric, and Emma Brunskill:
    Regret Minimization in MDPs with Options without Prior Knowledge. NIPS 2017, Long Beach, California, USA. [Poster] [Full Paper]
  • Alberto Metelli, Matteo Pirotta, and Marcello Restelli:
    Compatible Reward Inverse Reinforcement Learning. NIPS 2017, Long Beach, California, USA. [Poster] [Paper]
  • Matteo Papini, Matteo Pirotta, and Marcello Restelli:
    Adaptive Batch Size for Safe Policy Gradients. NIPS 2017, Long Beach, California, USA. [Poster] [Paper]
  • Davide Tateo, Matteo Pirotta, Andrea Bonarini and Marcello Restelli:
    Gradient-Based Minimization for Multi-Expert Inverse Reinforcement Learning. IEEE SSCI 2017, Hawaii, USA.
  • Samuele Tosatto, Matteo Pirotta, Carlo D'Eramo, and Marcello Restelli:
    Boosted Fitted Q-Iteration. ICML 2017, Sydney, New South Wales, Australia.
  • Carlo D'Eramo, Alessandro Nuara, Matteo Pirotta, and Marcello Restelli:
    Estimating the Maximum Expected Value in Continuous Reinforcement Learning Problems. AAAI 2017, San Francisco, California, USA.
  • Matteo Pirotta, and Marcello Restelli:
    Inverse Reinforcement Learning through Policy Gradient Minimization. AAAI 2016, Phoenix, Arizona, USA.
  • Matteo Pirotta, Simone Parisi, and Marcello Restelli:
    Multi-Objective Reinforcement Learning with Continuous Pareto Frontier Approximation. AAAI 2015, Austin, Texas, USA.
  • Danilo Caporale, Luca Deori, Roberto Mura, Alessandro Falsone, Riccardo Vignali, Luca Giulioni, Matteo Pirotta and Giorgio Manganini:
    Optimal Control to Reduce Emissions in Gasoline Engines: An Iterative Learning Control Approach for ECU Calibration Maps Improvement. ECC 2015, Linz, Austria.
  • Giorgio Manganini, Matteo Pirotta, Marcello Restelli, Luca Bascetta:
    Following Newton Direction in Policy Gradient with Parameter Exploration. IJCNN 2015, Killarney, Ireland.
  • Simone Parisi, Matteo Pirotta, Nicola Smacchia, Luca Bascetta, Marcello Restelli:
    Policy Gradient Approaches for Multi-Objective Sequential Decision Making: A Comparison. ADPRL 2014, Orlando, Florida, USA.
  • Simone Parisi, Matteo Pirotta, Nicola Smacchia, Luca Bascetta and Marcello Restelli:
    Policy Gradient Approaches for Multi-Objective Sequential Decision Making. IJCNN 2014, Beijing, China.
  • Matteo Pirotta, Giorgio Manganini, Luigi Piroddi, Maria Prandini and Marcello Restelli:
    A Particle-Based Policy for the Optimal Control of Markov Decision Processes. IFAC 2014, Cape Town, South Africa.
  • Matteo Pirotta, Marcello Restelli, Luca Bascetta:
    Adaptive Step-Size for Policy Gradient Methods. NIPS 2013, Lake Tahoe, Nevada, USA.
  • Matteo Pirotta, Marcello Restelli, Alessio Pecorino, and Daniele Calandriello:
    Safe policy iteration. ICML 2013, Atlanta, Georgia, USA. [Paper]
  • Martino Migliavacca, Alessio Pecorino, Matteo Pirotta, Marcello Restelli, and Andrea Bonarini:
    Fitted Policy Search. ADPRL 2011, Paris, France.
  • Martino Migliavacca, Alessio Pecorino, Matteo Pirotta, Marcello Restelli, and Andrea Bonarini:
    Fitted Policy Search: Direct Policy Search Using a Batch Reinforcement Learning Approach. ERLARS 2010, Lisbon, Portugal.
Journal Papers
  • Alberto Maria Metelli, Matteo Pirotta, Daniele Calandriello, Marcello Restelli:
    Safe Policy Iteration: A Monotonically Improving Approximate Policy Iteration Approach. JMLR 22(97), 2021. [Paper]
  • Simone Parisi, Matteo Pirotta and Jan Peters:
    Manifold-based Multi-objective Policy Search with Sample Reuse. Neurocomputing 263, 2017. [Paper]
  • Giorgio Manganini, Matteo Pirotta, Marcello Restelli, Luigi Piroddi, and Maria Prandini:
    Policy search for the optimal control of Markov decision processes: a novel particle-based iterative scheme. IEEE Transactions on Cybernetics 46, 2016. [Paper]
  • Simone Parisi, Matteo Pirotta and Marcello Restelli:
    Multi-objective Reinforcement Learning through Continuous Pareto Manifold Approximation. Journal of Artificial Intelligence Research 57, 2016. [Paper]
  • Matteo Pirotta, Marcello Restelli and Luca Bascetta:
    Policy Gradient in Lipschitz Markov Decision Processes. Machine Learning 100, 2015. [Paper]
Technical Reports
  • Pierre-Alexandre Kamienny, Matteo Pirotta, Alessandro Lazaric, Thibault Lavril, Nicolas Usunier, Ludovic Denoyer:
    Learning Adaptive Exploration Strategies in Dynamic Environments Through Informed Policy Regularization. arXiv:2005.02934, 2020. [arXiv]
  • Ronan Fruit, Matteo Pirotta and Alessandro Lazaric:
    Improved Analysis of UCRL2 with empirical Bernstein bounds. ALT Tutorial, 2019. [arXiv]
  • Jian Qian, Ronan Fruit, Matteo Pirotta and Alessandro Lazaric:
    Concentration Inequalities for Multinoulli Random Variables. ALT Tutorial, 2019. [arXiv]
  • Matteo Pirotta and Marcello Restelli:
    Cost-Sensitive Approach to Batch Size Adaptation for Gradient Descent. Optimizing the optimizers, NIPS 2016 Workshop, Barcelona, Spain. [arXiv]

Teaching

Reinforcement Learning - Fall 2020 - MVA - ENS Paris-Saclay
  • Piazza: registration and online class discussion on Piazza
Previous Classes
  • Reinforcement Learning - Spring 2020 - African Master’s in Machine Intelligence (AMMI) - Ghana
  • Reinforcement Learning - Fall 2019 - MVA - ENS Paris-Saclay
  • Reinforcement Learning - Fall 2018 - MVA - ENS Paris-Saclay
  • Reinforcement Learning - Fall 2017 - MVA - ENS Paris-Saclay