Vision and Autonomous Systems Seminar
Assistant Professor of Computer Science and Engineering
University of Michigan
Virtual Presentation - ET
The long-term goal of my research is to help computers understand the physical world from images, including both its 3D properties and how humans or robots could interact with it. This talk will summarize two recent directions aimed at enabling this goal.
I will begin with learning to reconstruct full 3D scenes, including invisible surfaces, from a single RGB image, and present work that can be trained with the ordinary, unstructured 3D scans that sensors usually collect. Our method uses implicit functions, in which a distance is predicted densely throughout a scene. Such implicit functions have shown strong results when trained with supervision from watertight meshes, or when fit to one particular scene, but results are scant for generalizing to new scenes while training on non-watertight meshes. Our results suggest that the conventional setup in the non-watertight setting actually incentivizes neural networks to make systematically distorted predictions. We propose a simple solution: a distance-like function that is more amenable to prediction. Networks trained to estimate this function produce strong full-scene reconstructions on Matterport3D and other datasets.
I will then focus on understanding what humans are doing with their hands. Hands are a primary means by which humans manipulate the world, yet fairly basic information about what they are doing often remains out of reach for computers, at least on challenging data. I will describe some of our efforts toward understanding hand state, including work on learning to segment hands and hand-held objects in images via a system that learns from large-scale video data.
David Fouhey is an assistant professor at the University of Michigan. He received a Ph.D. in robotics from Carnegie Mellon University and was then a postdoctoral fellow at UC Berkeley. His work has been recognized by an NSF CAREER award and by NSF and NDSEG fellowships. He has spent time at the University of Oxford's Visual Geometry Group, INRIA Paris, and Microsoft Research.

Sponsored in part by Facebook Reality Labs Pittsburgh.

Zoom participation: see announcement.