The NDIF Engineering Fellowship

Summer 2024

The National Deep Inference Fabric (NDIF) is awarding Summer Engineering Fellowships. Fellows will participate in a high-priority summer project to develop and rapidly scale AI research infrastructure, keeping pace with the precipitous increase in the scale of state-of-the-art AI.

NDIF is an NSF-funded project to create an innovative, highly transparent, large-scale AI inference infrastructure to enable scientific research on the largest open AI models. Read more about the NDIF project and team here, and read about the NNsight API architecture here. This Twitter thread also provides more context.

NDIF has recently received NSF support, including sufficient GPU resources to create a scalable service. This summer the project is embarking on a rapid initiative to implement:

  • High-throughput parallelism for a uniquely transparent large-scale LLM inference service.
  • Service autodeployment for public scientific use on NCSA HPC GPU clusters.
  • Robustness and monitoring for reliable serving for research users.

Open LLMs five to ten times larger than the current state of the art are anticipated before Fall 2024. Scientific infrastructure to support research at that scale will therefore be urgently needed. The NDIF Summer Engineering Fellowship is an intensive program for systems- and ML-focused students to contribute to an active, state-of-the-art AI research engineering project in the public interest.

During your fellowship you will work under the supervision of a team of leading professors in machine learning and software engineering systems. Your work will be essential for scaling up the innovative NDIF inference infrastructure, laying the groundwork for interpretability, safety auditing, and transparent research at a scale that matches the ongoing increase in AI model complexity.

The program will be held in-person in Boston during Summer 2024.


The NDIF Engineering Fellowship accepts applications from current and recent PhD, Master's, and undergraduate students in computer science and related fields. The ideal applicant will have knowledge of machine learning systems, such as quantized and low-precision machine learning, ML compilers, and architectures for low-latency transformer inference.

Both systems-focused and machine-learning-focused applicants will be considered.

The deadline to apply is May 26, 2024. We are unable to accept applicants who will be under the age of 18 on June 1, 2024.

How to apply

To apply, submit your contact information, CV, current transcript, and an explanation of your interest and potential contributions to the project using this form.