Blurring the Line between Structure and Learning for Adaptive Local Recognition - Evan Shelhamer (UC Berkeley)

Host: Mark Schmidt

Title: Blurring the Line between Structure and Learning for Adaptive Local Recognition

Abstract: The visual world is vast and varied, but there is nevertheless ubiquitous structure. In this talk I will focus on incorporating locality and scale structure into deep networks for image-to-image tasks, and then examine how dynamic inference that adapts these structures to each input helps cope with variability. I will look at these directions through the lens of local recognition tasks that require inference of what and where. By composing structured Gaussian filters with free-form filters, and learning both, our approach optimizes and adapts receptive field size. In effect this controls the degree of locality during learning and inference: changes in our parameters would require changes in architecture for standard networks. Multi-step adaptivity, through gradient optimization of scale during inference, further improves accuracy and robustness. This kind of factorization points to a reconciliation of structure and learning, through which known visual structure is respected and unknown visual detail is learned freely.
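
To make the idea of composing a structured Gaussian filter with a free-form filter concrete, here is a minimal 1-D sketch in plain Python. It is an illustration under stated assumptions, not the speaker's implementation: the function names, the three-tap free-form weights, and the truncation at three standard deviations are all hypothetical choices. In the approach described in the abstract, both the Gaussian scale and the free-form weights are learned; here sigma is simply set by hand to show that one continuous parameter changes the effective receptive field size without changing the architecture.

```python
import math

def gaussian_kernel(sigma, truncate=3.0):
    """Sampled, normalized 1-D Gaussian whose support grows with sigma.

    The truncation at `truncate` standard deviations is an assumption
    made for this sketch.
    """
    radius = max(1, int(math.ceil(truncate * sigma)))
    weights = [math.exp(-0.5 * (x / sigma) ** 2) for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def convolve_full(a, b):
    """Full discrete convolution of two 1-D lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def composed_filter(free_form, sigma):
    """Compose a structured Gaussian filter with a free-form filter."""
    return convolve_full(gaussian_kernel(sigma), free_form)

# Hypothetical free-form weights; in a network these would be learned.
free_form = [0.25, 0.5, 0.25]

# Changing only sigma changes the extent of the composed filter.
small = composed_filter(free_form, sigma=1.0)
large = composed_filter(free_form, sigma=2.0)
print(len(small), len(large))  # prints: 9 15
```

The free-form filter keeps the same three parameters in both cases; only the Gaussian scale differs. A standard convolutional layer would need a larger kernel, dilation, or extra layers to cover the wider extent.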

Bio: Evan Shelhamer is a hot-off-the-press PhD from UC Berkeley, advised by Trevor Darrell. His research focuses on computer vision and machine learning, in particular making visual structure differentiable and inference adaptive. His joint work on fully convolutional networks won a best paper honorable mention at CVPR'15. He was the lead developer of the Caffe deep learning framework from version 0.1 to 1.0, and shared the Mark Everingham service award for Caffe at ICCV'17. Before Berkeley, he studied computer science (AI concentration) and psychology at the University of Massachusetts Amherst, advised by Erik Learned-Miller. He takes his coffee black.