NVIDIA has introduced AI Foundry, a service designed to help enterprises create and deploy custom generative AI models tailored to their specific needs. According to the NVIDIA Blog, the service brings together enterprise data, accelerated computing, and advanced software tools.
Industry Pioneers Drive AI Innovation
Leading companies such as Amdocs, Capital One, Getty Images, KT, Hyundai Motor Company, SAP, ServiceNow, and Snowflake are early adopters of NVIDIA AI Foundry. These industry pioneers are setting the stage for a new era of AI-driven innovation in enterprise software, technology, communications, and media.
Jeremy Barnes, Vice President of AI Product at ServiceNow, emphasized the competitive edge that custom models provide. “Organizations deploying AI can gain a competitive edge with custom models that incorporate industry and business knowledge,” Barnes stated. “ServiceNow is using NVIDIA AI Foundry to fine-tune and deploy models that can integrate easily within customers’ existing workflows.”
The Pillars of NVIDIA AI Foundry
NVIDIA AI Foundry is built on several key pillars: foundation models, enterprise software, accelerated computing, expert support, and a broad partner ecosystem. The service includes AI foundation models from NVIDIA and the AI community, as well as the complete NVIDIA NeMo software platform for rapid model development.
The computing backbone of NVIDIA AI Foundry is the NVIDIA DGX Cloud, a network of accelerated compute resources co-engineered with leading public clouds like Amazon Web Services, Google Cloud, and Oracle Cloud Infrastructure. This setup allows AI Foundry customers to develop and fine-tune custom generative AI applications efficiently and scale their AI initiatives without significant upfront investments in hardware.
Additionally, NVIDIA AI Enterprise experts are available to assist customers through each step of building, fine-tuning, and deploying their models with proprietary data, ensuring alignment with business requirements.
Global Ecosystem and Partner Support
NVIDIA AI Foundry customers benefit from a global ecosystem of partners offering comprehensive support. Consulting services from partners like Accenture, Deloitte, Infosys, and Wipro include design, implementation, and management of AI-driven digital transformation projects. For example, Accenture has introduced its own AI Foundry-based offering, the Accenture AI Refinery framework.
Service delivery partners such as Data Monsters, Quantiphi, Slalom, and SoftServe help enterprises navigate the complexities of integrating AI into their existing IT landscapes, ensuring that AI applications are scalable, secure, and aligned with business objectives.
Customers can develop NVIDIA AI Foundry models for production using AIOps and MLOps platforms from partners like Cleanlab, DataDog, Dataiku, Dataloop, DataRobot, Domino Data Lab, Fiddler AI, New Relic, Scale, and Weights & Biases. These models can be deployed as NVIDIA NIM inference microservices, which include the custom model, optimized engines, and a standard API to run on preferred accelerated infrastructure.
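Since NIM microservices expose a standard API, a deployed custom model can be called like any chat-completion endpoint. The sketch below, in Python, assumes an OpenAI-compatible NIM deployment; the endpoint URL and model name are illustrative placeholders, not values from this article.

```python
import json
import urllib.request

# Hypothetical NIM endpoint and model name -- substitute your own deployment.
NIM_URL = "http://localhost:8000/v1/chat/completions"
MODEL = "my-org/custom-llama"


def build_chat_request(prompt: str, model: str = MODEL, max_tokens: int = 128) -> dict:
    """Assemble an OpenAI-compatible chat-completion payload for a NIM service."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def call_nim(prompt: str) -> str:
    """POST the payload to the NIM microservice and return the generated text."""
    payload = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        NIM_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Because the API surface is standard, the same client code works whether the custom model runs on DGX Cloud or on a customer's preferred accelerated infrastructure.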
Inference solutions such as NVIDIA TensorRT-LLM improve efficiency for Llama 3.1 models by minimizing latency and maximizing throughput. This allows enterprises to generate tokens faster while reducing the total cost of running models in production, supported by the NVIDIA AI Enterprise software suite.
Moreover, Together AI announced that it will enable its ecosystem of over 100,000 developers and enterprises to use its NVIDIA GPU-accelerated inference stack to deploy Llama 3.1 endpoints and other open models on DGX Cloud.
“Every enterprise running generative AI applications wants a faster user experience, with greater efficiency and lower cost,” said Vipul Ved Prakash, founder and CEO of Together AI. “Now, developers and enterprises using the Together Inference Engine can maximize performance, scalability, and security on NVIDIA DGX Cloud.”
NVIDIA NeMo Simplifies Custom Model Development
NVIDIA NeMo, integrated into AI Foundry, provides developers with tools to curate data, customize foundation models, and evaluate performance. NeMo technologies include:
- NeMo Curator: A GPU-accelerated data-curation library that enhances generative AI model performance by preparing large-scale, high-quality datasets for pretraining and fine-tuning.
- NeMo Customizer: A scalable microservice that simplifies fine-tuning and alignment of large language models (LLMs) for domain-specific use cases.
- NeMo Evaluator: Automatically assesses generative AI models across academic and custom benchmarks on any accelerated cloud or data center.
- NeMo Guardrails: Orchestrates dialog management, helping ensure accuracy, appropriateness, and security in applications built on large language models.
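To illustrate the Guardrails piece, a minimal NeMo Guardrails setup is typically driven by a YAML configuration that names the backing model and the input/output rails to apply. The fragment below is a hedged sketch: the engine and model name are placeholders, and the "self check" flows are built-in rails from the NeMo Guardrails toolkit, not details from the Foundry announcement.

```yaml
# config.yml -- illustrative NeMo Guardrails configuration
# (engine and model are placeholders for your own deployment)
models:
  - type: main
    engine: openai
    model: my-org/custom-llama

rails:
  input:
    flows:
      - self check input   # screen user messages before they reach the model
  output:
    flows:
      - self check output  # screen model responses before they reach the user
```

At runtime, the toolkit loads this configuration and interposes the named rails around every model call, so moderation policy lives in configuration rather than application code.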
With these tools, businesses can create custom AI models that are precisely tailored to their needs, improving alignment with strategic objectives, accuracy in decision-making, and operational efficiency.
Philipp Herzig, Chief AI Officer at SAP, noted, “As a next step of our partnership, SAP plans to use NVIDIA’s NeMo platform to help businesses accelerate AI-driven productivity powered by SAP Business AI.”
Custom Models Drive Competitive Advantage
NVIDIA AI Foundry addresses the unique challenges enterprises face in adopting AI. While generic AI models may fall short of meeting specific business needs and data security requirements, custom AI models offer superior flexibility, adaptability, and performance. This makes them ideal for enterprises seeking a competitive edge.
“Safe, trustworthy AI is a non-negotiable for enterprises harnessing generative AI, with retrieval accuracy directly impacting the relevance and quality of generated responses in RAG systems,” said Baris Gultekin, Head of AI at Snowflake. “Snowflake Cortex AI leverages NeMo Retriever, a component of NVIDIA AI Foundry, to further provide enterprises with easy, efficient, and trusted answers using their custom data.”
For more information on how NVIDIA AI Foundry can boost enterprise productivity and innovation, visit NVIDIA AI Foundry.