Search Results for "llm"
AI21 Labs Unveils Jamba 1.5 LLMs with Hybrid Architecture for Enhanced Reasoning
AI21 Labs introduces Jamba 1.5, a new family of large language models leveraging hybrid architecture for superior reasoning and long context handling.
NVIDIA Introduces Efficient Fine-Tuning with NeMo Curator for Custom LLM Datasets
NVIDIA's NeMo Curator offers a streamlined method for fine-tuning large language models (LLMs) with custom datasets, enhancing machine learning workflows.
Character.AI Enters Agreement with Google, Announces Leadership Changes
Character.AI announces a strategic agreement with Google and key leadership changes to accelerate the development of personalized AI products.
NVIDIA NIM Microservices Enhance LLM Inference Efficiency at Scale
NVIDIA NIM microservices optimize throughput and latency for large language models, improving efficiency and user experience for AI applications.
MIT Research Unveils AI's Potential in Safeguarding Critical Infrastructure
MIT's new study reveals how large language models (LLMs) can efficiently detect anomalies in critical infrastructure systems, offering a plug-and-play solution.
AMD Radeon PRO GPUs and ROCm Software Expand LLM Inference Capabilities
AMD's Radeon PRO GPUs and ROCm software enable small enterprises to leverage advanced AI tools, including Meta's Llama models, for various business applications.
TEAL Introduces Training-Free Activation Sparsity to Boost LLM Efficiency
TEAL offers a training-free approach to activation sparsity, significantly enhancing the efficiency of large language models (LLMs) with minimal performance degradation.
NVIDIA's Blackwell Platform Breaks New Records in MLPerf Inference v4.1
NVIDIA's Blackwell architecture sets new benchmarks in MLPerf Inference v4.1, showcasing significant performance improvements in LLM inference.
LangGraph.js v0.2 Enhances JavaScript Agents with Cloud and Studio Support
LangChain releases LangGraph.js v0.2 with new features for building and deploying JavaScript agents, including support for LangGraph Cloud and LangGraph Studio.
NVIDIA GH200 NVL32: Revolutionizing Time-to-First-Token Performance with NVLink Switch
NVIDIA's GH200 NVL32 system shows significant improvements in time-to-first-token performance for large language models, enhancing real-time AI applications.
Ollama Enables Local Running of Llama 3.2 on AMD GPUs
Ollama makes it easier to run Meta's Llama 3.2 model locally on AMD GPUs, offering support for both Linux and Windows systems.
Innovative LoLCATs Method Enhances LLM Efficiency and Quality
Together.ai introduces LoLCATs, a novel method for linearizing LLMs that improves both efficiency and quality, promising significant gains in AI model development.