Search Results for "ai infrastructure"
NVIDIA Dynamo 1.0 Ships With 7x Inference Boost for AI Data Centers
NVIDIA releases Dynamo 1.0, an open-source inference OS adopted by AWS, Azure, Google Cloud, and major AI companies. The company claims 7x performance gains on Blackwell GPUs.
NVIDIA DGX Spark Now Scales to 4 Nodes for 700B Parameter AI Agents
NVIDIA expands DGX Spark to support 4-node configurations, enabling local inference of 700B parameter models and near-linear fine-tuning performance scaling.
NVIDIA Unveils AI Grid Architecture for Distributed Edge Inference at GTC 2026
NVIDIA's AI Grid reference design enables telcos to cut inference costs by 76% and meet sub-500ms latency targets through distributed edge computing.
Together AI Upgrades Fine-Tuning Platform With Vision and Reasoning Support
Together AI adds tool calling, reasoning traces, and vision-language fine-tuning to its platform, with 6x throughput gains for 100B+ parameter models.
NVIDIA Advances AI Infrastructure With Disaggregated LLM Inference on Kubernetes
NVIDIA details new Kubernetes deployment patterns for disaggregated LLM inference using Dynamo and Grove, promising better GPU utilization for AI workloads.
NVIDIA Donates GPU Resource Driver to Kubernetes Open Source Project
NVIDIA transfers critical GPU allocation software to the CNCF at KubeCon Europe, marking a major shift toward community-governed AI infrastructure.
Ray Serve Upgrade Delivers 88% Lower Latency for AI Inference at Scale
Anyscale announces major Ray Serve optimizations with HAProxy and gRPC, achieving 11.1x throughput gains for LLM inference workloads in enterprise deployments.
NVIDIA Claims 1 Million X Efficiency Gains Across Six GPU Generations
NVIDIA details how the Vera Rubin platform delivers 10x higher inference throughput per megawatt, reshaping AI data center economics and token factory revenue models.
NVIDIA MIG Boosts AI Infrastructure ROI by 33% Over Time-Slicing
New NVIDIA benchmarks show Multi-Instance GPU partitioning achieves 1.00 req/s per GPU versus 0.76 for time-slicing in production AI workloads.
Filecoin (FIL) Onchain Cloud Hits Mainnet With 49 TiB Already Stored
Filecoin (FIL) launches programmable cloud storage for AI agents with onchain proofs, automatic payments, and two-copy replication at $2.50/TiB monthly.
Oracle Brings NVIDIA B300 GPUs and xAI Grok to Government Cloud Regions
Oracle expands AI infrastructure for U.S. government customers with NVIDIA Blackwell Ultra GPUs and xAI Grok models in secure cloud regions.
Bitfarms Becomes Keel Infrastructure, Completes Delaware Move Amid Bitcoin Exit
Former Bitcoin miner Bitfarms officially rebrands as Keel Infrastructure, completing its U.S. redomiciliation as it pivots to a 2.2 GW AI data center business.