Language Technologies Thesis Proposal
Language Technologies Institute
Carnegie Mellon University
Virtual Presentation - ET
Despite their growing ubiquity, learned neural models remain opaque: researchers rely on esoteric, specialized knowledge and skills to understand what the models are learning and how their internal workings affect learning outcomes. Understanding what these models learn is of utmost importance as more and more production systems rely on neural models to provide high-impact utilities. This work proposes a shift in design philosophy that redefines the unit of analysis for neural models from individual neurons to sets of interconnected functional components, which we call neural pathways. These functional components, which are a consequence of the architecture, data, and training scheme, can cut across structural boundaries. This enables human-in-the-loop model selection through increased transparency, encouraging a dialogue between the models and the researchers. Over the course of this proposal and the proposed work for this thesis, we contribute to the literature in four ways. First, we provide a method for neural model interpretability at the subtask level, rigorously validating it against a suite of synthetic datasets. Second, we extend the method with a framework for aligning learned functional components to causal structures. This enables comparison of the functions a neural model has learned with a theoretical causal structure, allowing rapid validation of our understanding of how the model approaches a task. Third, we expand the method to compare and align functional components across models with differing architectures or training procedures. Lastly, we demonstrate the capabilities of the neural pathways approach in the domains of education technologies and fairness in machine learning.
This includes automatic essay feedback via rhetorical structure analysis, group formation via transactivity detection, and fair distribution of resources for educational crowdfunding. This last contribution comprises three facets, distinguished by domain and focus. First, neural pathways are employed to scaffold a neural discourse parser so that it generalizes more easily to student writing. Next, we demonstrate that neural pathways can serve as a method for error analysis by exploring the discrepancy in performance between models trained to detect transactivity in different domains. Lastly, we propose to enhance researchers' ability to make informed model selection choices that avoid harmful biases by determining the importance of these biases to individual models. Given the broad applicability of the neural pathways approach, we are optimistic that the method can have a wide impact on the design and development of neural models, and we aim to provide a foundational work that can be extended far beyond the scope of the thesis.
Thesis Committee:
Carolyn Rose (Chair)
Emma Strubell
Rayid Ghani
Yonatan Belinkov (Technion - Israel Institute of Technology)
Additional Information
Zoom Participation. See announcement.