Search Results for "ai infrastructure"
NVIDIA cuTile Python Guide Shows 90% cuBLAS Performance for Matrix Ops
NVIDIA releases detailed cuTile Python tutorial for Blackwell GPUs, demonstrating matrix multiplication achieving over 90% of cuBLAS performance with simplified code.
OpenAI and SoftBank Pour $1B Into SB Energy for Stargate Data Centers
OpenAI and SoftBank each invest $500M in SB Energy to build a 1.2 GW Texas data center, advancing the $500 billion Stargate AI infrastructure initiative.
Harvey AI Overhauls Design System as Legal Tech Unicorn Scales Operations
Harvey AI rebuilt its design infrastructure from scratch to support rapid product expansion across web, mobile, and Microsoft Office platforms.
OpenAI Brings Ads to ChatGPT Free Tiers as $1.4T Infrastructure Bill Looms
OpenAI will test advertisements in ChatGPT's free and $8/month Go tiers in the coming weeks, keeping premium subscriptions ad-free amid massive infrastructure costs.
GPU Waste Crisis Hits AI Production as Utilization Drops Below 50%
New analysis reveals production AI workloads achieve under 50% GPU utilization, with CPU-centric architectures blamed for billions in wasted compute resources.
AI Inference Costs Drop 40% With New GPU Optimization Tactics
Together AI reveals production-tested techniques cutting inference latency by 50-100 ms while reducing per-token costs by up to 5x through quantization and smart decoding.
FlashAttention-4 Hits 1,605 TFLOPS on NVIDIA Blackwell GPUs
NVIDIA's FlashAttention-4 achieves 71% hardware efficiency on Blackwell chips, delivering 3.6x speedup over FA2 for AI training workloads.
Oracle Confirms OpenAI as Tenant for $165B New Mexico AI Data Center Project
Oracle reveals it will deploy AI infrastructure for OpenAI at Project Jupiter, a massive data center campus backed by $165B in industrial revenue bonds.
Oracle Ramps Up AI Data Center Push With $50B CapEx Plan
Oracle outlines 2026 AI infrastructure expansion across Texas, New Mexico, Wisconsin, and Michigan with commitments to local power and hiring.
NVIDIA Megatron Core Gets Dynamic-CP Update With 48% Training Speedups
NVIDIA releases Dynamic Context Parallelism for Megatron Core, achieving up to 1.48x faster LLM training and 35% gains in industrial deployments.
NVIDIA Run:ai v2.24 Tackles GPU Scheduling Fairness for AI Workloads
NVIDIA's new time-based fairshare scheduling prevents GPU resource hogging in Kubernetes clusters, addressing a critical bottleneck for enterprise AI deployments.
NVIDIA Unveils Universal Sparse Tensor Framework for AI Efficiency
NVIDIA introduces Universal Sparse Tensor (UST) technology to standardize sparse data handling across deep learning and scientific computing applications.