PhD Thesis Defence - Christopher Cameron

Name: Christopher Cameron
Date: Wednesday, March 12th, 2025
Time: 11:30 am - 2:30 pm
Location: X836
Zoom: https://ubc.zoom.us/j/61327469711?pwd=k30CQZ15uglJLOeLE4jEVMAPKUxb5H.1
Supervisor's name: Kevin Leyton-Brown


Thesis Title: ML-Empowered Combinatorial Search
Abstract:
Combinatorial search problems underlie a wide range of computational tasks, from resource allocation in 5G networks and spectrum auctions to hardware verification and power-grid scheduling. Although the best-known techniques have worst-case exponential complexity, many real-world instances exhibit rich structural patterns that specialized solvers can exploit. No single solver excels across all problem distributions, and tuning solvers to specific distributions is crucial for best performance.

When done by hand, this application-specific algorithm-design process is tedious and time-consuming. This thesis explores how machine learning (ML) can empower combinatorial search by integrating data-driven methods with symbolic solvers, maintaining correctness guarantees while adapting to specific instance distributions. We begin by exploring the evaluation and applications of two classic ML approaches for algorithm design: algorithm selection and algorithm configuration. We then relax two limitations of these classic approaches: (1) reliance on hand-crafted features to represent problem instances; and (2) a black-box view of one or more solvers. We relax the first constraint by learning an end-to-end neural representation of problem instances that captures the right invariances for reasoning about Boolean logic. We relax the second constraint by integrating ML into the solver’s internal decisions, proposing a new reinforcement learning approach, Monte Carlo Forest Search (MCFS), to learn improved branching policies in tree search.
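To give a concrete flavour of the first of these classic approaches, the following is a minimal sketch of per-instance algorithm selection, not the pipeline used in the thesis: one runtime-regression model is trained per solver on hand-crafted instance features, and the predicted-fastest solver is chosen for each new instance. The solver names, feature dimensions, and training data below are all placeholders.

```python
# Sketch of per-instance algorithm selection (illustrative only):
# learn predicted runtimes per solver from hand-crafted features,
# then dispatch each instance to the predicted-fastest solver.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

solvers = ["solver_a", "solver_b", "solver_c"]               # hypothetical portfolio
X_train = np.random.rand(200, 10)                            # 10 hand-crafted features per instance (stand-in)
runtimes = {s: np.random.rand(200) * 100 for s in solvers}   # observed runtimes per solver (stand-in)

models = {}
for s in solvers:
    m = RandomForestRegressor(n_estimators=100, random_state=0)
    m.fit(X_train, np.log1p(runtimes[s]))                    # regress on log-runtimes
    models[s] = m

def select_solver(features):
    """Return the solver with the smallest predicted runtime for one instance."""
    preds = {s: models[s].predict(features.reshape(1, -1))[0] for s in solvers}
    return min(preds, key=preds.get)

print(select_solver(np.random.rand(10)))
```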

Even with these advances, learning a model for each new application typically demands extensive training data. We show that it is possible to train foundation models that dramatically reduce the overhead of learning new tasks and distributions. We trained a single model on a large dataset spanning a broad range of combinatorial tasks and found that it generalized to both held-out tasks and held-out distributions. We also found that models fine-tuned from the foundation models were substantially more sample-efficient than models trained from scratch.
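The fine-tuning comparison above can be illustrated with a short, hedged sketch (the model architecture, checkpoint, and data below are placeholders, not the thesis's artifacts): warm-start a new model from pretrained "foundation" weights and continue training on a small dataset from the new distribution, rather than training from scratch.

```python
# Illustrative fine-tuning from foundation weights (placeholders throughout).
import torch
import torch.nn as nn

def make_model():
    return nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 1))

foundation = make_model()
# foundation.load_state_dict(torch.load("foundation_weights.pt"))  # hypothetical checkpoint

finetuned = make_model()
finetuned.load_state_dict(foundation.state_dict())        # warm-start from foundation weights

opt = torch.optim.Adam(finetuned.parameters(), lr=1e-4)   # small learning rate for fine-tuning
loss_fn = nn.BCEWithLogitsLoss()

# A few labelled instances from the new distribution (stand-in data).
x_small = torch.randn(64, 128)
y_small = torch.randint(0, 2, (64, 1)).float()

for _ in range(50):
    opt.zero_grad()
    loss = loss_fn(finetuned(x_small), y_small)
    loss.backward()
    opt.step()
```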

Finally, we consider the predict-then-optimize paradigm, showing that coupling predictive models with the solver objective can prevent large degradations in solution quality. We formalize this in the setting where forecasts are inherently stochastic and multiple correlated forecasts drive downstream decisions.
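As a toy illustration of this idea (a sketch under assumed data, not the thesis's formulation), the snippet below generates noisy forecasts of unknown quantities, makes a simple downstream selection decision against those forecasts, and scores the forecasts by decision regret rather than forecast error alone.

```python
# Toy predict-then-optimize example: forecasts drive a top-k selection,
# and quality is measured by the regret of the induced decision.
import numpy as np

rng = np.random.default_rng(0)
true_values = rng.normal(size=20)                          # unknown quantities the decision depends on
forecast = true_values + rng.normal(scale=0.5, size=20)    # noisy forecasts (stand-in for a learned model)

k = 5
chosen = np.argsort(forecast)[-k:]                         # optimize against the forecast: pick top-k
best = np.argsort(true_values)[-k:]                        # oracle decision under the true values

decision_regret = true_values[best].sum() - true_values[chosen].sum()
forecast_mse = np.mean((forecast - true_values) ** 2)
print(f"forecast MSE: {forecast_mse:.3f}, decision regret: {decision_regret:.3f}")
```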