DEEPSEEK
NVIDIA NIM Microservices Revolutionize Scientific Literature Reviews
NVIDIA's NIM microservices for LLMs are transforming the process of scientific literature reviews, offering enhanced speed and accuracy in information extraction and classification.
Efficient Meeting Summaries with LLMs Using Python
Learn how to create detailed meeting summaries using AssemblyAI's LeMUR framework and large language models (LLMs) with just five lines of Python code.
Exploring the Impact of LLM Integration on Conversation Intelligence Platforms
Discover how integrating Large Language Models (LLMs) revolutionizes Conversation Intelligence platforms, enhancing user experience, customer understanding, and decision-making processes.
Enhancing LLMs for Domain-Specific Multi-Turn Conversations
Explore the challenges and solutions in fine-tuning Large Language Models (LLMs) for effective domain-specific multi-turn conversations, as detailed by together.ai.
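A common ingredient in multi-turn fine-tuning (not necessarily together.ai's exact pipeline) is flattening a conversation into one token stream while masking every non-assistant token out of the loss. A minimal sketch, using fake word-level tokenization and the conventional `-100` ignore index:

```python
def build_training_example(turns, ignore_index=-100):
    """Flatten a multi-turn conversation into (tokens, labels) where only
    assistant tokens carry a training signal. Tokenization here is a toy
    word-level stand-in for a real tokenizer."""
    tokens, labels = [], []
    for role, text in turns:
        turn_tokens = [f"<{role}>"] + text.split() + ["</s>"]
        tokens.extend(turn_tokens)
        if role == "assistant":
            labels.extend(turn_tokens)  # train on assistant replies
        else:
            # user/system turns are context only: masked from the loss
            labels.extend([ignore_index] * len(turn_tokens))
    return tokens, labels

turns = [("user", "book a table"),
         ("assistant", "for how many people"),
         ("user", "four"),
         ("assistant", "done")]
tokens, labels = build_training_example(turns)
```

Masking keeps the model from learning to imitate user text while still conditioning each assistant turn on the full prior dialogue.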
Exploring Model Merging Techniques for Large Language Models (LLMs)
Discover how model merging enhances the efficiency of large language models by reusing existing fine-tuned checkpoints instead of training from scratch, improving task-specific performance, according to NVIDIA's insights.
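The simplest merging strategy is a uniform weighted average of parameters across fine-tuned checkpoints of the same base model (sometimes called a "model soup"). A toy sketch, with illustrative names and list-of-floats parameters standing in for real tensors:

```python
def merge_checkpoints(checkpoints, weights=None):
    """Merge checkpoints (dicts of parameter name -> list of floats)
    by taking an element-wise weighted average of each parameter."""
    if weights is None:
        # Default: uniform average across all checkpoints
        weights = [1.0 / len(checkpoints)] * len(checkpoints)
    merged = {}
    for name in checkpoints[0]:
        merged[name] = [
            sum(w * ckpt[name][i] for w, ckpt in zip(weights, checkpoints))
            for i in range(len(checkpoints[0][name]))
        ]
    return merged

# Two hypothetical task-specific fine-tunes of the same base model
math_ft = {"layer0.weight": [0.2, 0.4]}
code_ft = {"layer0.weight": [0.6, 0.0]}
soup = merge_checkpoints([math_ft, code_ft])  # ~[0.4, 0.2]
```

Because merging only averages weights, it needs no training data or GPU hours, which is the resource-reuse argument made for the technique.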
Innovative LoLCATs Method Enhances LLM Efficiency and Quality
Together.ai introduces LoLCATs, a method for linearizing LLMs by replacing quadratic softmax attention with trained linear attention, improving efficiency while preserving model quality.
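The efficiency gain of linearization comes from replacing the O(n²) softmax score matrix with per-head running sums that each query reads in O(d²). A minimal pure-Python sketch of generic (non-causal) linear attention with a fixed ELU+1 feature map; LoLCATs itself learns the feature map to mimic the original softmax attention, which this toy does not do:

```python
import math

def feature_map(x):
    # ELU(x) + 1: a simple positive feature map used in linear-attention work
    return [v + 1.0 if v >= 0 else math.exp(v) for v in x]

def linear_attention(Q, K, V):
    """Attention without the n x n score matrix: accumulate K/V statistics
    once, then each query attends via those sums."""
    d, d_v = len(K[0]), len(V[0])
    fK = [feature_map(k) for k in K]
    # S[i][j] = sum over positions of phi(k)[i] * v[j]; z[i] = sum of phi(k)[i]
    S = [[sum(fk[i] * v[j] for fk, v in zip(fK, V)) for j in range(d_v)]
         for i in range(d)]
    z = [sum(fk[i] for fk in fK) for i in range(d)]
    out = []
    for q in Q:
        fq = feature_map(q)
        denom = sum(fi * zi for fi, zi in zip(fq, z))
        out.append([sum(fq[i] * S[i][j] for i in range(d)) / denom
                    for j in range(d_v)])
    return out

# Tiny example: 3 tokens, head dim 2; outputs are convex mixes of V rows
Q = K = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5]]
V = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = linear_attention(Q, K, V)
```

Since the feature map is non-negative, each output row is a convex combination of the value rows, mirroring what softmax attention produces.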
Llama 3.1 405B Achieves 1.5x Throughput Boost with NVIDIA H200 GPUs and NVLink
NVIDIA's latest advancements in parallelism techniques deliver a 1.5x throughput boost for Llama 3.1 405B using NVIDIA H200 Tensor Core GPUs and the NVLink Switch, improving AI inference performance.
NVIDIA GH200 NVL32: Revolutionizing Time-to-First-Token Performance with NVLink Switch
NVIDIA's GH200 NVL32 system shows significant improvements in time-to-first-token performance for large language models, enhancing real-time AI applications.
AI21 Labs Unveils Jamba 1.5 LLMs with Hybrid Architecture for Enhanced Reasoning
AI21 Labs introduces Jamba 1.5, a new family of large language models leveraging hybrid architecture for superior reasoning and long context handling.
Anyscale Explores Direct Preference Optimization Using Synthetic Data
Anyscale's latest blog post delves into Direct Preference Optimization (DPO) with synthetic data, highlighting its methodology and applications in tuning language models.
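At its core, DPO scores a (chosen, rejected) answer pair by how much more the policy prefers the chosen answer than a frozen reference model does. A minimal sketch of the per-pair loss, -log σ(β·margin), with illustrative log-probability inputs (not Anyscale's code):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair:
    -log sigmoid(beta * ((logp_c - ref_c) - (logp_r - ref_r)))."""
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Policy prefers the chosen answer more than the reference does
# (positive margin), so the loss falls below log(2) ~ 0.693:
loss = dpo_loss(policy_chosen_logp=-10.0, policy_rejected_logp=-14.0,
                ref_chosen_logp=-12.0, ref_rejected_logp=-13.0, beta=0.1)
```

With synthetic data, the chosen/rejected pairs are generated and ranked automatically (e.g., by a judge model) rather than collected from human annotators; the loss itself is unchanged.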