Friday, January 31, 2025
News

DeepSeek looks fantastic but is no miracle and was not built for USD 5 million; panic seems overblown: Bernstein report


New Delhi | January 29, 2025 10:12:31 AM IST
As social media platforms and stock markets buzz over the new AI company DeepSeek, a report by Bernstein stated that DeepSeek looks fantastic but is not a miracle and was not built for USD 5 million.

The report addressed the buzz around DeepSeek's models, particularly the idea that the company built something comparable to OpenAI for just USD 5 million. According to the report, this claim is misleading and doesn't reflect the full picture.

It stated: "We believe that DeepSeek DID NOT 'build OpenAI for USD 5M'; the models look fantastic but we don't think they are miracles; and the resulting Twitter-verse panic over the weekend seems overblown."

The Bernstein report stated that DeepSeek has developed two main families of AI models: 'DeepSeek-V3' and 'DeepSeek R1'. The V3 model is a large language model that uses a Mixture-of-Experts (MoE) architecture.

This approach splits the model into multiple smaller "expert" sub-networks and activates only a few of them for each input, resulting in high performance while using significantly fewer computing resources compared to other large models. The V3 model has 671 billion parameters in total, of which 37 billion are active for any given token.
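To put the two parameter counts in proportion, the short Python sketch below computes the share of the model that is active at once. It uses only the 671 billion and 37 billion figures quoted above; the function and variable names are illustrative and do not come from the report.

```python
# Parameter counts quoted in the article for DeepSeek-V3.
TOTAL_PARAMS = 671e9   # all parameters across every expert
ACTIVE_PARAMS = 37e9   # parameters actually used at any given time


def active_fraction(active: float, total: float) -> float:
    """Return the share of parameters that are active (0.0 to 1.0)."""
    return active / total


share = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"Active share of parameters: {share:.1%}")
```

On these figures, roughly one parameter in eighteen does work at any given time, which is the source of the compute savings the report describes.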

It also incorporates innovative techniques like Multi-Head Latent Attention (MHLA), which reduces memory usage, and mixed-precision training using FP8 computation, which improves efficiency.

To train the V3 model, DeepSeek used a cluster of 2,048 NVIDIA H800 GPUs for about two months, totalling approximately 2.7 million GPU hours for pre-training and 2.8 million GPU hours including post-training.

While some have estimated the cost of this training at around USD 5 million based on a USD 2 per GPU hour rental rate, the report points out that this figure doesn't account for the extensive research, experimentation, and other costs involved in developing the model.
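The arithmetic behind the headline number is easy to reproduce. The Python sketch below uses only the figures quoted above (2,048 GPUs, 2.7 and 2.8 million GPU hours, USD 2 per GPU hour); the helper names are illustrative, and the result covers the rental-equivalent cost of the final training run only, not the research and experimentation the report says are excluded.

```python
# Figures quoted in the article.
GPUS = 2048                  # NVIDIA H800 GPUs in the cluster
GPU_HOURS_PRETRAIN = 2.7e6   # pre-training only
GPU_HOURS_TOTAL = 2.8e6      # including post-training
RATE_USD_PER_GPU_HOUR = 2.0  # assumed rental rate behind the estimate


def wall_clock_days(gpu_hours: float, gpus: int) -> float:
    """Days the cluster runs if all GPUs are busy the whole time."""
    return gpu_hours / gpus / 24


def rental_cost(gpu_hours: float, rate: float) -> float:
    """Rental-equivalent cost in USD for the given GPU hours."""
    return gpu_hours * rate


days = wall_clock_days(GPU_HOURS_TOTAL, GPUS)
cost_pretrain = rental_cost(GPU_HOURS_PRETRAIN, RATE_USD_PER_GPU_HOUR)
cost_total = rental_cost(GPU_HOURS_TOTAL, RATE_USD_PER_GPU_HOUR)

print(f"Wall-clock time: about {days:.0f} days")
print(f"Pre-training cost: USD {cost_pretrain / 1e6:.1f} million")
print(f"Total training cost: USD {cost_total / 1e6:.1f} million")
```

The run time works out to just under two months, consistent with the report, and the cost lands between USD 5 million and USD 6 million, which is where the widely quoted figure comes from.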

The second model, 'DeepSeek R1', builds on the V3 foundation but uses Reinforcement Learning (RL) and other techniques to significantly improve reasoning capabilities. The R1 model has been particularly impressive, performing competitively against OpenAI's models in reasoning tasks.

However, the report noted that the additional resources required to develop R1 were likely substantial, though not quantified in the company's research paper.

Despite the hype, the report emphasized that DeepSeek's models are indeed impressive. The V3 model, for instance, performs as well as or better than other large models on language, coding, and math benchmarks while using only a fraction of the computing resources.

For example, pre-training V3 required about 2.7 million GPU hours, which is just 9 per cent of the compute resources needed to train some other leading models.
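The 9 per cent comparison implies a baseline that the article does not state outright. As a rough check, the Python sketch below backs it out from the two quoted numbers; the article does not name the models being compared, so the result is only the implied order of magnitude, and the function name is illustrative.

```python
# Figures quoted in the article.
V3_PRETRAIN_GPU_HOURS = 2.7e6  # DeepSeek-V3 pre-training
SHARE_OF_BASELINE = 0.09       # "just 9 per cent" of the comparison models


def implied_baseline(gpu_hours: float, share: float) -> float:
    """GPU hours of the comparison model implied by the quoted share."""
    return gpu_hours / share


baseline = implied_baseline(V3_PRETRAIN_GPU_HOURS, SHARE_OF_BASELINE)
print(f"Implied baseline: about {baseline / 1e6:.0f} million GPU hours")
```

In other words, the comparison models would have needed on the order of 30 million GPU hours, roughly eleven times the V3 figure.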

In conclusion, the report outlined that while DeepSeek's achievements are remarkable, the panic and exaggerated claims about building an OpenAI competitor for USD 5 million are overblown. (ANI)

 