HCII PhD Thesis Proposal: David Lin
Intuitive and Controllable AI Steering Interfaces
David Chuan-En Lin
HCII PhD Thesis Proposal
Date & Time: Wednesday, October 29th at 1:00 PM ET
Location: GHC 6115
Zoom: https://cmu.zoom.us/j/93693703552?pwd=FbuP7pqJ80bCNrYH8P7lfmM9YxZsCJ.1 (Meeting ID: 936 9370 3552 | Passcode: 631129)
Committee:
Nikolas Martelaro (Chair), Carnegie Mellon University
Aniket (Niki) Kittur, Carnegie Mellon University
David Lindlbauer, Carnegie Mellon University
Michael Terry, Google DeepMind
Abstract:
A longstanding challenge in computer science is transforming AI from passive automation tools into interactive systems that users can actively steer and control. AI steering interfaces are a key pillar in realizing this goal, giving users a controllable interface layer for guiding AI behavior. Current AI interaction methods, such as text prompts and basic parameter sliders, offer initial glimpses of this potential. While these rudimentary approaches already deliver value to millions of users worldwide, substantial opportunities remain for more expressive control mechanisms. Yet designing effective steering interfaces faces a fundamental trade-off between intuitiveness and controllability. As interfaces become more intuitive (one-click solutions, text boxes), control diminishes; interfaces that maximize control (programming, complex editors) become expert-only tools. Most conventional approaches fall somewhere along this spectrum, forcing users to choose between ease of use and expressive control.
My research aims to design AI steering interfaces that break this intuitiveness-controllability trade-off. By systematically mapping how people interact (HCI interface paradigms) to how models can be controlled (control mechanisms of AI models), I identify promising combinations that achieve high intuitiveness and high controllability simultaneously. I introduce six systems demonstrating this mapping: Soundify pairs object-centric control with class activation maps, VideoMap combines map-based overviews with latent space navigation, Jigsaw connects flow-based interfaces to model orchestration, PseudoClient links example-based interaction to few-shot meta-learning, Inkspire blends sketch-based iteration with structural conditioning, and StyleGenome (proposed work) integrates evolutionary interfaces with parameter-efficient adaptation. Together, these systems demonstrate that expressive AI control can be made both intuitive for novices and powerful for experts, enabling graphic designers to personalize AI using their own work, video editors to navigate large-scale video collections spatially, and product designers to explore broad design spaces by sketching.
