Modular Deep Learning
Overview of the project
The Modular Deep Learning project explores deep learning, a class of machine learning algorithms that has gained significant attention in academia and industry. Traditional deep learning models are typically monolithic, with layers arranged sequentially to transform input data into high-level, interpretable output. This monolithic nature of deep neural networks (DNNs) limits their fine-grained reusability and adaptability. This project adopts a groundbreaking approach by decomposing DNN models into self-contained, independent modules. Such modularization enables fine-grained reuse and replacement of components, potentially eliminating the need for complete retraining or coarse-grained reuse when building new AI solutions or repairing faulty parts of a system. Beyond the cost savings of reduced retraining, this approach translates into energy efficiency, which is especially relevant in the era of ultra-large AI models such as large language models (LLMs), which are reported to consume excessive amounts of water, leave a large carbon footprint, and draw vast amounts of energy to operate. The modular approach envisioned by this project has the potential to streamline the use of such large-scale AI systems, paving the way for greater energy efficiency in their creation and upkeep.
Preliminary Investigation
Our preliminary work has shown that it is possible to decompose various neural architectures effectively and conceptualize the notion of modules. Specifically, we found that fully connected neural networks, CNNs, and RNNs can be decomposed into modules, allowing for the reuse of existing components and potentially avoiding costly retraining in certain scenarios. We explored:
- How AI models can be constructed by combining existing modules and
- How large, costly-to-train AI models can be repaired rather than discarded and retrained from scratch.
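The idea of decomposition and reuse can be illustrated with a minimal sketch. The class names and weights below are purely illustrative assumptions, not taken from the project's actual decomposition technique: each "module" is a self-contained layer with its own parameters, a model is a composition of modules, and a trained module can be reused under a different head without retraining it.

```python
import numpy as np

# Illustrative sketch: a "module" is a self-contained dense layer (ReLU)
# that owns its weights and can be reused or swapped independently.
class DenseModule:
    def __init__(self, w, b):
        self.w = np.asarray(w, dtype=float)
        self.b = np.asarray(b, dtype=float)

    def __call__(self, x):
        return np.maximum(0.0, x @ self.w + self.b)

def compose(modules):
    # A model is just a pipeline of independent modules.
    def model(x):
        for m in modules:
            x = m(x)
        return x
    return model

# Two "pretrained" modules (weights are made up for illustration).
feature = DenseModule([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])  # feature extractor
head_a  = DenseModule([[1.0], [1.0]], [0.0])                 # sums the features
model_a = compose([feature, head_a])

# Fine-grained reuse: keep `feature` as-is, swap in a new head
# instead of retraining the whole network from scratch.
head_b  = DenseModule([[2.0], [0.0]], [0.0])
model_b = compose([feature, head_b])

x = np.array([[3.0, 4.0]])
print(model_a(x))  # [[7.]]
print(model_b(x))  # [[6.]]
```

The same swap-a-module pattern applies conceptually to repairing a faulty part of a large model: only the defective module is replaced, while the rest of the trained network is reused verbatim.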
Current Focus
The main goal of this project is to further expand these initial findings along three dimensions.
- First, we aim to explore the role of modularity in the dual context of AI for Energy and Energy for AI—examining how AI can optimize energy systems and how energy constraints shape AI development and deployment. By applying modular design principles, we aim to reduce both the costs and energy demands of training deep neural networks, making AI technologies more sustainable and efficient.
- Second, we aim to develop standardized communication interfaces for seamlessly integrating AI modules or models to facilitate their independent evolution. Interfaces, as they do in traditional software engineering, define a boundary or contract that encapsulates functionality and promotes the decoupling of system components. With carefully defined interfaces, DNN modules can evolve independently as long as they adhere to the established interface contract. Such a methodology would dramatically reduce the overhead associated with training and integrating new DNN models, much like how libraries and APIs have streamlined software development.
- Third, we aim to introduce contracts or specifications into the modular AI landscape, which could facilitate better error diagnosis, blame assignment, and modular validation, allowing for the creation of more reliable and robust DNN models. Contracts, in a software context, set clear expectations for the behavior of a module, dictating what it should accomplish given specific inputs. In the DNN realm, this idea can be applied to individual DNN modules within a model, defining what kind of output should be produced given certain inputs.
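How interfaces and contracts might look for DNN modules can be sketched as follows. This is a speculative illustration, not the project's actual design: the `NeuralModule` protocol, its attribute names, and the checked conditions are all assumptions. The interface fixes the boundary (declared input/output dimensions), while the contract wrapper checks pre- and postconditions at runtime, so a violation can be blamed on a specific module.

```python
from typing import Protocol
import numpy as np

# Hypothetical interface: any module declares its input/output dimensions
# and implements forward(). Modules can evolve independently as long as
# they honor this boundary.
class NeuralModule(Protocol):
    in_dim: int
    out_dim: int
    def forward(self, x: np.ndarray) -> np.ndarray: ...

def with_contract(module):
    # Contract wrapper: check preconditions on the input and
    # postconditions on the output, so failures point at this module.
    def forward(x):
        assert x.shape[-1] == module.in_dim, "precondition: input dimension"
        y = module.forward(x)
        assert y.shape[-1] == module.out_dim, "postcondition: output dimension"
        assert np.all(np.isfinite(y)), "postcondition: finite outputs"
        return y
    return forward

class Normalizer:
    # Structurally satisfies NeuralModule without inheriting from it.
    in_dim = out_dim = 3
    def forward(self, x):
        return x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-9)

checked = with_contract(Normalizer())
y = checked(np.array([[3.0, 0.0, 4.0]]))
print(np.round(y, 2))  # [[0.6 0.  0.8]]
```

In this sketch the contract plays the same role as design-by-contract assertions in traditional software engineering: it does not change what the module computes, but it localizes errors for diagnosis and supports modular validation.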
Relevant Publications
The following research papers document progress on this project:

The Modular Deep Learning project has been supported in part by the following grants: