30th POC Seminar

SPOC 30 "Integer Optimization for Machine Learning"

Tuesday, April 7th, 2026

 

At the CNAM, Paris
2 rue du Conté, Amphitheater Gaston Planté (access 35, 1st floor)

Admission is free, but please register by clicking here: Registration

See the list of participants here


09h00 - 09h30  Welcome


09h30 - 10h15  Diego Delle Donne (ESSEC Business School)

Optimization models and algorithms for hyper-rectangular clustering problems

Machine learning techniques are widely used to analyze large datasets and support prediction and decision-making. However, many models lack interpretability, as their underlying rules are difficult to explain. This has motivated increasing interest in explainable learning models. In unsupervised learning, clustering methods partition data into groups but rarely provide explicit reasons for point membership. Hyper-rectangle clustering addresses this limitation by defining each cluster as the smallest axis-aligned hyper-rectangle containing its points, thus providing clear geometric rules (coordinate-wise bounds) that explain cluster assignments. Given a set of points in d-dimensional space, the goal is to find a hyper-rectangle clustering of minimum size. We first propose mixed-integer programming (MIP) formulations for the problem: a compact formulation and an extended one, which is solved by a branch-and-price algorithm. Since these formulations become computationally limited for larger instances, we develop an incremental exact strategy that exploits their ability to solve small instances to optimality. The method starts with a subset of points and iteratively adds points until all are covered; we prove that once full coverage is achieved, the solution is optimal for the original problem. Finally, computational experiments show that the proposed approach significantly extends the size of instances that can be solved to optimality.
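As a rough illustration of the incremental idea described in this abstract, the following Python sketch may help. It is not the speaker's method: a brute-force enumeration stands in for the MIP and branch-and-price solvers, the "size" of a clustering is assumed to be the sum of the side lengths of its hyper-rectangles, the starting subset is simply the first k points, and one uncovered point is added per iteration. All function names (bounding_box, solve_exact, incremental, ...) are hypothetical.

```python
from itertools import product


def bounding_box(points):
    """Smallest axis-aligned hyper-rectangle containing the points."""
    dims = range(len(points[0]))
    lower = tuple(min(p[d] for p in points) for d in dims)
    upper = tuple(max(p[d] for p in points) for d in dims)
    return lower, upper


def box_size(box):
    # ASSUMPTION: size of a box = sum of its side lengths.
    lower, upper = box
    return sum(u - l for l, u in zip(lower, upper))


def covers(box, point):
    """Coordinate-wise bounds: the explicit rule explaining membership."""
    lower, upper = box
    return all(l <= x <= u for l, x, u in zip(lower, point, upper))


def solve_exact(points, k):
    """Brute-force stand-in for an exact solver (only for tiny inputs).

    Enumerates every assignment of points to at most k clusters and
    keeps the one whose bounding boxes have minimum total size.
    """
    best_cost, best_boxes = float("inf"), None
    for labels in product(range(k), repeat=len(points)):
        groups = {}
        for p, c in zip(points, labels):
            groups.setdefault(c, []).append(p)
        boxes = [bounding_box(g) for g in groups.values()]
        cost = sum(box_size(b) for b in boxes)
        if cost < best_cost:
            best_cost, best_boxes = cost, boxes
    return best_cost, best_boxes


def incremental(points, k):
    """Solve on a growing subset until the boxes cover every point."""
    subset = list(points[:k])  # ASSUMPTION: arbitrary starting subset
    while True:
        cost, boxes = solve_exact(subset, k)
        uncovered = [p for p in points
                     if not any(covers(b, p) for b in boxes)]
        if not uncovered:
            # The subset optimum is a lower bound for the full problem,
            # and the boxes are feasible for it, so they are optimal.
            return cost, boxes
        subset.append(uncovered[0])


pts = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 12)]
cost, boxes = incremental(pts, 2)
print(cost, boxes)
```

In this toy setting the loop stops as soon as the boxes found for the subset contain every input point. The subset problem is a relaxation of the full one, so its optimal value cannot exceed the full optimum, which is why full coverage certifies optimality under the assumptions above.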


10h15 - 10h45  Coffee break


10h45 - 11h30  Margot Boyer (CEDRIC laboratory, CNAM)

Fast SDP certification of neural networks: towards large multiclass datasets


11h30 - 12h15  Mohamed Siala (LAAS laboratory)

Trustworthy Machine Learning via Combinatorial Optimization and Automated Reasoning

As machine learning (ML) is increasingly deployed in high-stakes decision-making, and as legal frameworks for trustworthy ML such as the European AI Act continue to emerge, the need for ML systems with formal guarantees is becoming more pressing. At the same time, these systems must remain computationally efficient and maintain high predictive quality. In this talk, we will discuss key challenges and opportunities in trustworthy ML, with a particular focus on problems at the intersection of combinatorial optimisation and automated reasoning.


12h15 - 14h00  Lunch


14h00 - 14h45  Farnaz Farzadnia (Copenhagen Business School)

Cluster Analysis of Bicycle Lane Safety: An Explainable Approach


14h45 - 15h15  Coffee break


15h15 - 16h00  Ilaria Ciocci (Sapienza University of Rome)

Margin Optimal Trees for Nonlinear Regression

Interpretable machine learning models have recently attracted growing interest due to their transparent decision-making process. Among these, decision trees are widely studied thanks to their intuitive structure and inherent interpretability. Advances in mixed-integer programming over the last few decades have led to the development of several formulations for building Optimal Decision Trees, offering an alternative to traditional greedy heuristics. Along this research line, we extend to regression tasks the Margin Optimal Classification Tree approach recently introduced by Monaci et al. (2024), which employs margin-based multivariate hyperplanes nested in the binary tree structure. This leads to a quadratic mixed-integer formulation for building Margin Optimal Regression Trees, designed to perform robust nonlinear regression. In particular, the model embeds Support Vector Regression (SVR) models in the leaf nodes to define prediction hyperplanes, exploiting the robustness and generalization capabilities of SVRs within an optimal tree framework. Moreover, symmetry-breaking inequalities are introduced to reduce equivalent solutions. To assess the effectiveness of the proposed approach, we conduct computational experiments on benchmark datasets, comparing it with baseline optimal tree methods. Finally, we investigate a Benders-like decomposition scheme to tackle the computational difficulties of the formulation, exploiting its naturally decomposable structure.


16h00 - 16h45  Thomas Halskov (Copenhagen Business School)

Collective LIME: A Global View of Local Surrogate Models


Organizers: Zacharie Ales and Sebastien Martin