The Fabric
The three parts of NDIF
Developed by Northeastern University in partnership with the NSF DeltaAI high-performance computing cluster at the National Center for Supercomputing Applications (NCSA) at the University of Illinois Urbana-Champaign, NDIF consists of three major components.
A nationwide high-performance computing fabric
hosting the largest open pretrained machine learning models for transparent deep inference.
This National Deep Inference Fabric is a unique combination of GPU hardware and deep-network AI inference software that provides a remotely accessible computing resource for scientists to perform detailed and reproducible experiments on large AI systems.
The fabric is designed so that many scientists can simultaneously share the same AI computing capacity, making efficient use of scarce hardware resources.
A novel open-source research software library
that enables scientists to develop and deploy new research methods
on AI models by creating intervention code that inspects, modifies, and customizes AI model computations. Our library, called NNsight,
enables reproducible scientific experiments to be defined and executed
on both the shared large-scale fabric and on a scientist’s own smaller-scale computers.
A nationwide training program
to equip researchers and students in every part of the country to utilize NDIF to
unlock critical research problems in every field impacted by large-scale AI.
Developed together with the Public Interest Technology University Network,
a consortium of 63 universities and colleges, the NDIF training program will consist of online modules,
course materials, and in-person workshops hosted at multiple sites throughout the United States.
It will create a network of experts in a range of fields impacted by AI,
who will provide embedded expertise within their own institutions,
and it will help create a next-generation workforce equipped to understand and harness the mechanisms
and capabilities of the systems at the forefront of artificial intelligence.
What is DeltaAI?
NDIF is powered by the high-performance computing capacity of DeltaAI, a computing cluster created at NCSA in 2024.
Taking advantage of next-generation NVIDIA graphics processors, DeltaAI is part of Delta, the highest-performance GPU computing resource in the National Science Foundation portfolio. DeltaAI is tailored for artificial intelligence workloads such as large language models.
What is NNsight?
The
NNsight library
is an open-source toolkit developed by NDIF to support
research methods on AI models.
Building on the popular PyTorch ecosystem,
NNsight allows researchers to create code that inspects, modifies,
and customizes AI model computations. NNsight enables reproducible experiments both on a scientist's own smaller-scale computer
and remotely on the shared large-scale fabric.
What is PIT-UN?
NDIF has partnered with the
Public Interest Technology University Network
(PIT-UN),
a consortium of 63 universities and colleges, to conduct needs-gathering workshops and tutorials
open to all fields affected by AI.
We are seeking participation of not only computer scientists,
but also all researchers in science and social science who wish to investigate the mechanisms
of large-scale AI within their fields.
FAQ
When can I start using NDIF?
You can start using NDIF today! NDIF is a four-year project running from 2024 to 2028, and many capabilities are still to be developed, but an early version is already available for you to use.
Visit our Getting Started page to learn how to participate now.
Why should I get involved now?
Your participation is an essential part of the project. By getting involved early, you get a jump start on using leading-edge AI research methods, and you can also help the NDIF team learn how to design the Fabric to be helpful for your research. We are also seeking users from outside of computer science, so if your work is touched by AI in any way, do not hesitate to get involved as an early adopter.
How is NDIF different from commercial AI services?
Commercial AI inference services such as ChatGPT, Claude, and Gemini only provide black-box access to large AI models. That is, you can send inputs to the services and they will give you outputs, but they do not give you access to observe or alter any of the internal computations.
In contrast, NDIF provides full transparency for AI inference, allowing users to fully examine and modify every step of the internal computation of large AI models. NDIF provides this access by allowing scientists to write programs using the NNsight library that can intervene in each layer and step of AI inference.
NDIF provides access to all activations and gradients on large-scale open AI models, along with interventions, model editing, custom optimizations, and many other research methods. These essential scientific capabilities are not available through commercial inference services.
How should I cite NNsight and NDIF?
If you use NNsight or NDIF resources in your research, please cite the following:
Jaden Fried Fiotto-Kaufman, Alexander Russell Loftus, Eric Todd, Jannik Brinkmann, Koyena Pal, Dmitrii Troitskii, Michael Ripa, Adam Belfki, Can Rager, Caden Juang, Aaron Mueller, Samuel Marks, Arnab Sen Sharma, Francesca Lucchetti, Nikhil Prakash, Carla E. Brodley, Arjun Guha, Jonathan Bell, Byron C. Wallace, and David Bau. "NNsight and NDIF: Democratizing Access to Foundation Model Internals," ICLR 2025. Available at https://openreview.net/forum?id=MxbEiFRf39.
In addition, when you publish work using NNsight or NDIF resources, we'd love you to email us directly at info@ndif.us to tell us about your work. This helps us track our impact and supports our continued efforts to provide open-source resources for reproducible and transparent research on large-scale AI systems.
How is NDIF different from existing HPC and cloud services?
Supercomputing, high-performance computing, and cloud computing infrastructure services are designed to support coarse-grained computing jobs, and they do not natively support fine-grained sharing of pretrained AI models. For example, if you wish to study a large AI model using an HPC system, you need to allocate a large amount of GPU capacity for your own exclusive use, which is costly and can put scarce resources out of reach for most users.
NDIF provides a shared deep inference fabric, which allows many users to access shared AI models in a fine-grained manner, making economical use of precious hardware resources. Instead of submitting an entire high-performance computing job, NDIF users submit specialized deep inference tasks that may run for as briefly as a fraction of a second, exploiting the same preloaded AI models that may be studied by other users simultaneously.
NDIF will provide an autoscaling scheduler that provides fair and efficient access to the most in-demand open models, allowing individual users to customize every aspect of inference while protecting and isolating different users from each other.
What does programming on NDIF look like?
NDIF's API, NNsight, is built on PyTorch, so using NDIF will be familiar to any PyTorch user. For example, you can use it to load, run, inspect, and modify any PyTorch neural network, such as open-source models from HuggingFace.
However, instead of running models only locally, NNsight defines Python contexts in which models can be run with interventions and customizations that are defined locally, but which can be executed either locally or remotely.
This enables a workflow where you can learn about and develop experimental methods at a small scale, using your own local hardware resources, and then deploy your experiments using the very same code at large scale, probing massive neural networks using NDIF.
To learn more, check out the NNsight website.
Can I use NDIF outside the United States?
NNsight, the open-source software underlying NDIF, is available to be used both inside and outside the U.S., and it can be used with your own computing hardware.
The U.S. NSF is providing computing resources to support large-scale scientific studies. These resources will be made available to educational and research users with a U.S. affiliation after account creation via the CILogon system. There will also eventually be a way to allocate and reserve high-capacity blocks for very large-scale projects.
How can I get involved?
NDIF is still in active development. To get involved as an early alpha-testing user, read our Getting Started resources and join our community.