
DeepSeek R1 Revolutionizes AI - Open Model Lowers Costs

DeepSeek’s R1 launched on January 20, 2025, shaking up AI with 95% lower training costs than OpenAI’s o1. An open model, it challenges industry norms and even sent NVIDIA’s stock tumbling. Here’s my take.

The AI world is still reeling from the unexpected release of DeepSeek's R1 model on January 20, 2025. This Chinese startup, founded in July 2023, has achieved something remarkable. R1 has shaken not just the AI research community but also markets and assumptions across the board.

So many people have written about it online, from so many angles, that I thought there was nothing more I could add. But the temptation was just too great to miss this boat, so I sat down and wrote my thoughts in bullet points below. A year from now, we can all come back to this post and see how much has changed.

Here’s a breakdown of what’s happening, why R1 is different, and what it means for the future of AI.



Two Chinese men in matching traditional dress. Nothing to do with AI, other than the fact that this image was not AI-generated.

First Encounter: How I Heard About R1

  • Release Date: R1 officially launched on January 20, 2025.
  • When I Found Out:
    • I first noticed evaluation results online on January 21.
    • On January 22, I asked my team to conduct our own evaluation.
  • Initial Impressions: R1 isn’t groundbreaking in terms of what it can do compared to OpenAI’s state-of-the-art publicly accessible o1, but it’s revolutionary in cost efficiency and accessibility.

How Is R1 Different?

Cost Efficiency

  • Training Costs: DeepSeek has achieved 95% lower training costs compared to OpenAI’s o1.
    • According to this VentureBeat article, this achievement is a game-changer.
    • AI model development is no longer limited to those with massive budgets, i.e., it no longer has to be centered in Silicon Valley.

Open Model

  • Accessibility: R1 is an open model, meaning:
    • You can download it.
    • You can fine-tune it on your own hardware and data.
    • You can’t do this with OpenAI’s o1.
  • Why This Matters: While it’s not fully "open-source" (the training data is not available, so we can't fully know the model's biases or restrictions; more on that below), this level of access allows smaller players to innovate and customize the model.
  • Open Alternatives: It's worth noting that there are other open models to choose from, such as Meta's Llama 3 and Mistral 7B.

Technical Innovations

  • Smarter Parameter Use:
    • R1 has 671 billion parameters in total.
    • Only about 37 billion are activated for any given token, during both training and inference, which makes it far more efficient to run than a dense model of the same size. The possible drawback is that the model might be less customizable, because only a portion of it is used at a time.
  • GPU Usage:
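To make the "Smarter Parameter Use" point concrete, here is a minimal sketch of the general idea behind sparse activation: a router scores a set of "experts" and only the top few run for each token. The 671 and 37 billion figures come from this post; the expert count, the `route_top_k` helper, and the random scores are illustrative assumptions, not R1's actual configuration or code.

```python
import math
import random

TOTAL_PARAMS_B = 671   # total parameters in billions (figure from this post)
ACTIVE_PARAMS_B = 37   # parameters activated per token in billions (figure from this post)


def active_fraction(active_b, total_b):
    """Share of the model's parameters that do work for a single token."""
    return active_b / total_b


def softmax(scores):
    """Turn raw router scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Keep the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}


# Toy setup: 8 experts, 2 used per token (hypothetical numbers for illustration).
NUM_EXPERTS, TOP_K = 8, 2
rng = random.Random(0)
scores = [rng.gauss(0, 1) for _ in range(NUM_EXPERTS)]
chosen = route_top_k(scores, TOP_K)

print(f"Active share of parameters: {active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B):.1%}")
print(f"Experts used for this token: {sorted(chosen)} out of {NUM_EXPERTS}")
```

The takeaway is the ratio: 37 out of 671 billion is roughly 5.5% of the model doing work for any single token, while the rest sits idle until the router picks it.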

Why Is R1 So Inexpensive?

  • Training Method:
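    • DeepSeek leans heavily on reinforcement learning; its R1-Zero variant was trained with "pure reinforcement learning," with no supervised fine-tuning beforehand.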
    • This is like learning to ride a bike without manuals—just trial, error, and experience.
    • It reduces dependency on expensive, curated supervised data.
  • Architecture Choices:
    • Fewer parameters active per token = lower compute costs.
    • Uses low-precision number formats. "Quantization" here refers to how many bits are used to store each number during training. R1 reportedly uses a less precise but balanced (8-bit) format for much of its training, compared to the more conventional precision (maybe 16-bit or 32-bit?) used for GPT-4. Lower precision reduces model size and requires less memory and compute to train on the same data.
    • Efficient use of GPUs without sacrificing model performance.
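Here is a rough sketch of why precision matters, under two loud assumptions: the size estimate counts only the raw weights (no activations, optimizer state, or overhead), and the quantizer is a simple symmetric 8-bit integer scheme that I swapped in for illustration. It is not the format DeepSeek actually uses, and the sample weights are made up.

```python
TOTAL_PARAMS = 671e9  # total parameter count (figure from this post)
BYTES_PER_PARAM = {"32-bit": 4, "16-bit": 2, "8-bit": 1}


def weights_size_gb(num_params, bytes_per_param):
    """Storage needed for the raw weights alone, in gigabytes."""
    return num_params * bytes_per_param / 1e9


def quantize_int8(values):
    """Symmetric 8-bit quantization: map floats onto integers in [-127, 127]."""
    scale = max(abs(v) for v in values) / 127 or 1.0  # fall back to 1.0 if all zeros
    return [round(v / scale) for v in values], scale


def dequantize(quantized, scale):
    """Map the integers back to approximate floats."""
    return [q * scale for q in quantized]


for name, nbytes in BYTES_PER_PARAM.items():
    print(f"{name}: {weights_size_gb(TOTAL_PARAMS, nbytes):,.0f} GB of weights")

# Hypothetical weight values, chosen only to show the rounding error.
weights = [0.8213, -0.4178, 0.0531, -1.27, 0.3349]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"Quantized values: {quantized}")
print(f"Largest round-trip error: {max_error:.5f} (at most half a step, {scale / 2:.5f})")
```

Going from 32-bit to 8-bit cuts the weight storage to a quarter, and the price is a small rounding error on every number. That trade is the "less precise but balanced" part.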

What Did R1 Do to the Market?

  • NVIDIA’s Shock:
    • On January 27, 2025, a week after R1 was released, NVIDIA's stock plummeted 17%.
    • This erased roughly $600 billion in market capitalization, the largest single-day loss for a U.S. company in history. By the way, that is in the same range as the $500 billion that SoftBank, OpenAI, and others are planning to spend on the "Stargate" project over the next 4 years.
    • Why? R1 shattered the assumption that AI needs premium GPUs in huge quantities to achieve state-of-the-art performance.
  • Broader Market Impact:
    • The tech sector collectively felt the shock, with shares declining by 3% across the board.
    • Investors are now questioning the return on their $1 trillion investment in AI-related ventures.

What Does This Mean for AI’s Future?

Advantages

  • Cheaper AI Development: R1 has shown that cost-efficient AI models are possible.
  • Better Accessibility: Open models like R1 democratize AI innovation.
  • More Competition: This could drive pricing down for foundational AI models, benefiting businesses and consumers.

Concerns

  • Bias Risks:
    • Some have raised concerns about R1’s potential biases due to its China-based training.
    • However, biases are a known issue across all foundational models, including non-China ones.
  • Investor Skepticism:
    • With lower costs demonstrated by R1, will investors reconsider large-scale AI funding?

My Thoughts

Positives

  • More Competition, Better Pricing:

    • Models like R1 can lead to better pricing for applications and platforms like Kafkai.
    • Lower entry barriers will encourage more innovation in the AI ecosystem.
  • Open Models Are the Future:

    • R1’s openness is a step toward a more collaborative AI industry.
    • Fine-tuning capabilities on local hardware allow developers greater control and innovation opportunities.
  • Reinforcement Learning as a Game-Changer:

    • DeepSeek has proven that expensive supervised datasets aren’t the only path to success.
    • This shift could lead to more sustainable and scalable AI development practices.
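For readers who want to see what "learning from trial and error" looks like without any labeled examples, here is a toy sketch. It is a classic epsilon-greedy bandit, not DeepSeek's training algorithm, and every name and number in it (`train_bandit`, the hidden success rates) is a made-up assumption. The learner is never told which option is best; it only sees a reward after each attempt.

```python
import random


def train_bandit(success_rates, steps=5000, epsilon=0.1, seed=0):
    """Learn which action pays off best using only reward feedback."""
    rng = random.Random(seed)
    n = len(success_rates)
    estimates = [0.0] * n  # running average reward per action
    counts = [0] * n       # how often each action was tried

    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n)  # explore: try something at random
        else:
            action = max(range(n), key=lambda a: estimates[a])  # exploit the best guess

        # The environment returns a reward; no "correct answer" label is ever shown.
        reward = 1.0 if rng.random() < success_rates[action] else 0.0

        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts


# Hidden from the learner: how often each action actually succeeds.
HIDDEN_SUCCESS_RATES = [0.2, 0.5, 0.8]
estimates, counts = train_bandit(HIDDEN_SUCCESS_RATES)
best = max(range(len(estimates)), key=lambda a: estimates[a])

print(f"Estimated success rates: {[round(e, 2) for e in estimates]}")
print(f"Times each action was tried: {counts}")
print(f"Action the learner prefers: {best}")
```

Real reinforcement learning on a language model is vastly more involved, but the loop is the same shape: try, get a reward, adjust, repeat.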

Concerns

  • Industry Uncertainty:

    • How will companies respond to the revelation that they can achieve AI excellence without heavy investments in GPUs?
    • Will the $1 trillion poured into AI still seem justified?
  • Bias Debates Will Continue:

    • While biases in models are not new, R1’s China-based origin will likely keep it under scrutiny.
    • It’s crucial for users to address these biases when fine-tuning models.

What’s Next?

  • For Kafkai:
    • We’re eager to integrate R1 into our pipeline and explore its capabilities firsthand.
  • For the Industry:
    • R1 has sparked a necessary conversation about efficiency, accessibility, and the future of AI.
  • For Developers:
    • Open models like R1 are paving the way for more decentralized, innovative AI development.

The launch of DeepSeek’s R1 marks a pivotal moment in AI history. It’s not just about what R1 can do—it’s about what it represents: a challenge to the status quo and a glimpse into a future where AI is more accessible, affordable, and open.

What do you think? Are we ready for this seismic shift in AI? 🤔



Thank you to Cheuk Ting Ho for helping me review this post.
