Search Results for "ai infrastructure"
NVIDIA Integrates CUDA Tile Backend for OpenAI Triton GPU Programming
NVIDIA's new CUDA Tile IR backend for OpenAI Triton enables Python developers to access Tensor Core performance without CUDA expertise. Requires Blackwell GPUs.
Together AI Opens Evaluations to OpenAI, Anthropic, Google Models
Together Evaluations now benchmarks proprietary AI models from OpenAI, Anthropic, and Google against open-source alternatives, claiming 10x cost savings.
Mistral AI Launches Voxtral Transcribe 2 With Sub-200ms Latency
Mistral releases Voxtral Transcribe 2 with real-time streaming at $0.003/min, undercutting competitors while matching accuracy. Open weights under Apache 2.0.
CleanSpark CLSK Posts $181M Q1 Revenue But Swings to $379M Net Loss
CleanSpark reports Q1 fiscal 2026 revenue of $181.2M with $378.7M net loss as Bitcoin fair value swings hit earnings. Stock down 8.8% ahead of results.
VLA Models Reshape Robotics as $94B Market Embraces AI Infrastructure
Vision-Language-Action models are driving robotics teams to Ray and Anyscale for distributed training. Market projected to hit $94.38B by 2031.
Together AI Achieves 40% Faster LLM Inference With Cache-Aware Architecture
Together AI's new CPD system separates warm and cold inference workloads, delivering 35-40% higher throughput for long-context AI applications on NVIDIA B200 GPUs.
NVIDIA Blackwell Ultra GB300 Delivers 50x Performance Boost for AI Agents
NVIDIA's GB300 NVL72 systems show 50x better throughput per megawatt and 35x lower token costs versus Hopper, with Microsoft and CoreWeave deploying at scale.
NVIDIA Secures Massive Meta AI Deal for Millions of Blackwell and Rubin GPUs
Meta commits to a multiyear NVIDIA partnership, deploying millions of GPUs, Grace CPUs, and Spectrum-X networking across hyperscale AI data centers.
India Deploys 20,000 NVIDIA Blackwell GPUs in $1B AI Infrastructure Push
India partners with NVIDIA to build sovereign AI infrastructure with 20,000+ Blackwell Ultra GPUs, targeting a $27.7B market by 2032 under the IndiaAI Mission.
NVIDIA Run:ai GPU Fractioning Delivers 77% Throughput at Half Allocation
NVIDIA and Nebius benchmarks show that GPU fractioning achieves 86% user capacity on a 0.5 GPU allocation, enabling 3x more concurrent users for mixed AI workloads.
NVIDIA MIG Tech Delivers 2.25x Speedups for Power-Constrained AI Workloads
NVIDIA's Multi-Instance GPU technology shows up to 2.25x performance gains for data center workloads under power limits, with implications for AI infrastructure costs.
Together AI's CDLM Achieves 14.5x Faster AI Inference Without Quality Loss
Consistency Diffusion Language Models solve two critical bottlenecks in AI inference, delivering up to 14.5x latency improvements while maintaining accuracy on coding and math tasks.
