NII’s Game-Changing Japanese LLM: llm-jp-3-172b-instruct3

Japan's National Institute of Informatics unveils llm-jp-3-172b-instruct3, a pioneering Japanese large language model with 172 billion parameters. Surpassing GPT-3.5, it sets new standards for AI transparency and collaboration.

I’ve always been fascinated by the leaps in AI research, especially when it feels close to home. The National Institute of Informatics (NII) has just released something extraordinary: llm-jp-3-172b-instruct3, a Japanese large language model with 172 billion parameters.

Yes, you read that right—this model not only surpasses GPT-3.5 but also sets a new benchmark for transparency and innovation in AI.


*Image: a futuristic illustration of a glowing digital brain interconnected with intricate circuits, symbolizing the advancement of AI technology, rendered in a sleek, high-tech aesthetic with subtle Japanese cultural elements.*

About the National Institute of Informatics (NII)

The National Institute of Informatics (NII; 国立情報学研究所) has been a significant contributor to the development of Large Language Models (LLMs) in recent years. Here are some key contributions:

  1. In May 2023, NII established the LLM Research Group (LLM-jp), which includes participants from various research institutes and private enterprises.

  2. In October 2023, the LLM-jp group developed and released their first LLM with 13 billion parameters, which was made fully open to researchers along with its corpus data, development processes, and technical documents.

  3. NII has been actively working on more advanced models. As of April 2024, they were developing an LLM with 175 billion parameters, aiming for GPT-3-level performance, with completion targeted around summer 2024.

  4. On April 1, 2024, NII established the Research and Development Center for Large Language Models (LLMC) to accelerate R&D efforts in developing domestic LLMs and ensuring the transparency and reliability of generative AI models.

These initiatives demonstrate NII's commitment to advancing LLM research and development in Japan, with a focus on openness, transparency, and collaboration within the research community.


Why This Release Is So Important

For years, I’ve admired how Japan excels in balancing technological innovation with community-centered principles. With this model, NII isn’t just flexing its technical muscles; it’s making a bold statement about openness in AI. By releasing the data, tools, and detailed documentation behind llm-jp-3-172b, NII is fostering a collaborative spirit that I believe should be a standard in the AI world.

This is especially significant because large language models often feel like they’re locked behind corporate walls. Having an open model that was trained on 2.1 trillion tokens and fine-tuned with a Japanese-first approach isn’t just refreshing—it’s empowering.

A Few Highlights

  1. Unmatched Performance
    Built on the Llama 2 architecture and pre-trained on that 2.1-trillion-token corpus, the model scored 0.023 points higher than GPT-3.5 on the Japanese-focused "llm-jp-eval" benchmark. If you’re working on Japanese NLP tasks, that kind of edge is a game-changer.

  2. Versatility in Applications
    What excites me most are the potential applications. Here are a few scenarios I’ve been imagining:

  - Running sentiment analysis on vast Japanese social media datasets. Imagine understanding cultural trends at an unprecedented scale.
  - Summarizing dense legal or medical documents. If you’ve ever faced a wall of kanji in a contract or medical report, you’ll know why this is a big deal.
  - Building smarter customer support systems. I can already see how Japanese companies could create AI that understands and responds with cultural nuance.

  3. Collaborative Development
    This isn’t a one-institute show. Over 1,900 researchers contributed to this model under the GENIAC project, supported by Japan’s METI and NEDO. It’s a true testament to what can be achieved when academia, industry, and government join forces.
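To make the sentiment-analysis scenario concrete, here is a minimal sketch of how one might prompt the model through Hugging Face `transformers`. Treat the repository ID (`llm-jp/llm-jp-3-172b-instruct3`), the prompt wording, and the generation settings as my assumptions—check the official model card before running, and note that loading 172 billion parameters requires a multi-GPU server.

```python
# Sketch: Japanese sentiment classification with llm-jp-3-172b-instruct3.
# The model ID and prompt format below are assumptions based on the public
# release; verify them against the official Hugging Face model card.

def build_sentiment_prompt(text: str) -> str:
    """Wrap a Japanese text in a simple sentiment-classification instruction.

    English gloss of the instruction: "Classify the sentiment of the
    following text as positive, negative, or neutral."
    """
    return (
        "次の文章の感情を「ポジティブ」「ネガティブ」「ニュートラル」の"
        "いずれかで分類してください。\n\n"
        f"文章: {text}\n回答:"
    )

def classify(text: str) -> str:
    # Illustrative only: the 172B weights need a multi-GPU setup to load.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "llm-jp/llm-jp-3-172b-instruct3"  # assumed repository ID
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    inputs = tokenizer(build_sentiment_prompt(text), return_tensors="pt")
    inputs = inputs.to(model.device)
    output = model.generate(**inputs, max_new_tokens=8)
    # Decode only the tokens generated after the prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()

if __name__ == "__main__":
    # Prints the prompt that would be sent to the model.
    print(build_sentiment_prompt("このラーメンは最高でした！"))
```

The prompt-builder is pure Python, so you can swap in a smaller llm-jp checkpoint for local experiments and keep `classify` unchanged.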

My Thoughts

This model feels like a turning point, especially for those of us who’ve always felt a slight disconnect with mainstream AI developments. So much of what we use today—GPT, BERT, you name it—comes from outside Japan. While they’re fantastic tools, there’s something uniquely empowering about seeing a world-class model designed for Japanese language tasks.

LLMs are the foundation of the AI we now know, and this release answers the call for "sovereign AI" in every country, as articulated by NVIDIA's Jensen Huang in February 2024. I touched on the same theme in my previous blog post, Competing in the Realm of AI: Japan's Imperative, which discusses the need for a home-grown LLM and for leveraging Japan's chip-manufacturing expertise.

But beyond that, I think NII’s focus on transparency and trustworthiness is something we all should pay attention to. In an age where AI often feels like a black box, knowing how a model was trained, on what data, and for what purpose, makes me optimistic about the direction we’re heading.

I’m genuinely curious to see how this model evolves. With NII’s open approach, I wouldn’t be surprised if llm-jp-3-172b sparks innovations we haven’t even imagined yet. It’s a win not just for Japan but for anyone passionate about pushing the boundaries of AI while staying grounded in principles that matter.
