UC San Diego Leverages NVIDIA DGX B200 for Advanced AI Research

Iris Coleman   Dec 18, 2025, 17:02 UTC


The University of California San Diego's Hao AI Lab has recently integrated the NVIDIA DGX B200 system into its research arsenal, significantly advancing its efforts in large language model (LLM) inference, according to NVIDIA's blog. The acquisition places the lab at the forefront of AI innovation, enhancing the speed and efficiency of its projects.

Enhancing Research Capabilities with DGX B200

The introduction of the DGX B200 system at UC San Diego's School of Computing, Information and Data Sciences, including the San Diego Supercomputer Center, opens up expansive research opportunities. Hao Zhang, an assistant professor at the Halıcıoğlu Data Science Institute, highlighted the system's world-class performance, allowing for faster prototyping and experimentation compared to previous technologies.

Among the projects benefiting from the new system are FastVideo and Lmgame-Bench. FastVideo aims to train video generation models capable of producing video content from text prompts in seconds, while the Lmgame-Bench suite evaluates LLMs through popular games such as Tetris and Super Mario Bros., enabling performance comparisons across models.

Innovative Approaches to AI Inference

The Hao AI Lab is also pioneering low-latency LLM serving, using the DGX B200 to explore new frontiers in real-time responsiveness. Junda Chen, a doctoral candidate at UC San Diego, emphasized that the system's hardware makes this exploration of low-latency serving possible.

DistServe's Impact on Disaggregated Serving

Disaggregated inference, a concept developed by the Hao AI Lab, optimizes system throughput and latency, crucial for large-scale LLM-serving engines. The lab introduced the metric of "goodput," which balances throughput and user latency, ensuring efficient and high-quality AI model performance.
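To make the goodput idea concrete, here is a minimal, hypothetical sketch: it counts only those requests that satisfy both latency targets (time to first token and time per output token), then divides by the measurement window. The `Request` class, the SLO thresholds, and the function name are all illustrative assumptions, not the lab's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Request:
    ttft: float  # time to first token, in seconds
    tpot: float  # average time per output token, in seconds

def goodput(requests, window_s, ttft_slo=0.2, tpot_slo=0.05):
    """Requests per second that meet BOTH latency SLOs (illustrative)."""
    ok = sum(1 for r in requests if r.ttft <= ttft_slo and r.tpot <= tpot_slo)
    return ok / window_s

# Three requests observed over one second; only the first meets both SLOs.
reqs = [Request(0.15, 0.04), Request(0.30, 0.04), Request(0.18, 0.06)]
print(goodput(reqs, window_s=1.0))  # 1.0
```

The point of the metric is that raw throughput would count all three requests, while goodput counts only the one that delivered an acceptable user experience.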

The process involves separating prefill and decode tasks onto different GPUs, reducing resource competition and enhancing speed. This innovation, termed prefill/decode disaggregation, allows for workload scaling without sacrificing quality or latency.
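The two phases can be sketched in a toy simulation. In a real disaggregated system the prefill GPU computes the prompt's KV cache and ships it to a decode GPU; here both phases are plain functions and the "KV cache" is a list of strings, so everything below is an illustrative assumption rather than actual serving code.

```python
# Toy sketch of prefill/decode disaggregation. Real systems transfer the
# KV cache between GPUs; here it is just a Python list.

def prefill(prompt_tokens):
    # Prefill: process the whole prompt in one compute-bound pass,
    # producing a KV cache entry per prompt token.
    return [f"kv({t})" for t in prompt_tokens]

def decode(kv_cache, max_new_tokens):
    # Decode: generate tokens one at a time (memory-bound), appending
    # each new token's entry to the KV cache.
    out = []
    for i in range(max_new_tokens):
        tok = f"tok{i}"  # placeholder for real sampling
        kv_cache.append(f"kv({tok})")
        out.append(tok)
    return out

# A "prefill worker" handles the prompt pass, then hands the cache
# to a "decode worker" for generation.
cache = prefill(["The", "DGX", "B200"])
print(decode(cache, max_new_tokens=3))  # ['tok0', 'tok1', 'tok2']
```

Because the two phases have such different hardware profiles, running them on separate GPUs lets each be batched and scaled independently, which is the intuition behind the throughput and latency gains described above.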

Cross-Disciplinary Collaborations

Beyond these projects, UC San Diego is engaging in cross-departmental collaborations across fields such as healthcare and biology, utilizing the DGX B200 to optimize diverse research initiatives. These efforts underscore the university's commitment to leveraging AI platforms for groundbreaking innovation.

For more detailed insights into the NVIDIA DGX B200 system, visit the NVIDIA blog.
