Raising over 100 million euros on a pitch deck: the French AI startup taking aim at Microsoft and Google

Compiled by Lu Ke

According to overseas media reports, in June this year French startup Mistral AI, then barely a month old, raised 105 million euros in a seed round. At the time, the company, founded by a former DeepMind employee and two former Meta employees, had nothing to release. When people first heard about Mistral's fundraising, many took it as a sign that VCs had grown too generous toward the booming generative AI space.

As it turned out, Mistral had genuine strengths that convinced Lightspeed Venture Partners, French billionaire Xavier Niel, and former Google CEO Eric Schmidt to invest.

A week ago, Mistral released a 7.3-billion-parameter model designed to compete with Llama 2, Meta's large language model with 13 billion parameters. The French company claims it is the most powerful language model of its size available today.

The base model, called Mistral 7B, is a transformer designed for fast inference and for handling longer inputs. It achieves this with grouped-query attention and sliding-window attention. Grouped-query attention lets several query heads share a single set of key-value heads, balancing output quality against speed. Sliding-window attention restricts each layer's attention to a fixed window of recent tokens, with stacked layers extending the effective context. With a context length of 8,000 tokens, Mistral 7B offers low latency, high throughput, and strong performance compared with larger models.
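To make the sliding-window idea concrete, here is a minimal sketch of the attention mask it implies; the sequence length and window size below are illustrative, not Mistral's actual hyperparameters.

```python
import numpy as np

def sliding_window_mask(seq_len: int, window: int) -> np.ndarray:
    """True where query position i may attend to key position j:
    causal (j <= i) and within the window (j > i - window)."""
    i = np.arange(seq_len)[:, None]  # query positions, column vector
    j = np.arange(seq_len)[None, :]  # key positions, row vector
    return (j <= i) & (j > i - window)

print(sliding_window_mask(seq_len=8, window=3).astype(int))
# Each row has at most `window` ones, so per-layer attention cost is
# O(seq_len * window) rather than O(seq_len ** 2); with L stacked layers,
# information can still propagate across roughly L * window tokens.
```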

The Mistral 7B model is now integrated into Google's Vertex AI Notebooks. The integration gives Google Cloud customers an end-to-end workflow for experimenting with, fine-tuning, and deploying Mistral 7B and its variants on Vertex AI Notebooks.

Mistral AI users can optimize their models with vLLM, an efficient serving framework for large language models. Using Vertex AI Notebooks, users can deploy vLLM images maintained by Model Garden to Vertex AI endpoints for inference, simplifying model deployment.
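Outside of Google Cloud, the same model can be served with vLLM directly. A minimal sketch, assuming the weights are published on the Hugging Face Hub under the ID mistralai/Mistral-7B-v0.1 (an assumption; the article does not name the ID):

```python
from vllm import LLM, SamplingParams

# Model ID is an assumption; swap in the actual published weights.
llm = LLM(model="mistralai/Mistral-7B-v0.1")
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(
    ["Explain grouped-query attention in one sentence."], params
)
print(outputs[0].outputs[0].text)
```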

A key feature of this collaboration is the Vertex AI Model Registry, a central repository where users manage the lifecycle of Mistral AI models and their fine-tuned variants. The registry gives users a comprehensive view of their models, with improved organization and tracking.
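For illustration, registering a fine-tuned variant in the Model Registry might look roughly like this with the google-cloud-aiplatform SDK; the project, region, container image, and artifact path are placeholders, not values from the article.

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

# Upload creates an entry in the Vertex AI Model Registry, which then
# tracks the model and its versions across their lifecycle.
model = aiplatform.Model.upload(
    display_name="mistral-7b-finetuned",
    serving_container_image_uri="us-docker.pkg.dev/my-project/serve/vllm:latest",
    artifact_uri="gs://my-bucket/mistral-7b-finetuned/",
)
print(model.resource_name)
```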

As can be seen from the company's pitch deck, Mistral has cleverly positioned itself as a potentially important player: one that can help Europe become a "strong competitor" in building foundational AI models and play an "important role in geopolitical issues."

In the United States, startups focused on AI products are backed mainly by giants such as Google and Microsoft. Mistral argues that this "closed approach to technology" lets big companies make more money but fails to build a genuinely open community.

Unlike OpenAI's GPT models, whose implementation details remain confidential and which are available only through an API, the Paris-based company has open-sourced its model on GitHub under the Apache 2.0 license, making it free for anyone to use.
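In practice, this means anyone can download the weights and run them locally. A minimal sketch using Hugging Face transformers, again assuming the weights are mirrored on the Hub under mistralai/Mistral-7B-v0.1 (the article only mentions GitHub, so the ID is an assumption):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model ID is an assumption, not confirmed by the article.
model_id = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The Apache 2.0 license allows", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```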

Mistral is taking direct aim at Meta's Llama, claiming that its model is stronger than Llama 2.

Mistral's model versus Llama 2

Mistral said in a report that Mistral 7B handily beat Llama 2's 7-billion and 13-billion-parameter models on multiple benchmarks.

On massive multitask language understanding (MMLU) tests covering math, history, law, and other subjects, Mistral's model achieved 60.1% accuracy, while the 7-billion and 13-billion-parameter Llama 2 models scored 44% and 55%, respectively.

On the common-sense reasoning and reading comprehension benchmarks, Mistral's model also outperformed Llama 2.

Only in coding does Mistral lag behind Meta. Mistral 7B scored 30.5% and 47.5% on the HumanEval and MBPP benchmarks, while Llama 2's 7-billion-parameter model scored 31.1% and 52.5%, respectively.

Beyond raw performance, Mistral claims its model uses less computation than Llama 2. On the MMLU benchmark, Mistral says its model delivers the performance of a Llama 2 model more than three times its size. Compared with ChatGPT, according to one Medium writer's calculations, using Mistral AI is roughly 187 times cheaper than GPT-4 and about 9 times cheaper than GPT-3.5.

How do you constrain a large model? That's a problem

However, some users have complained that Mistral's model lacks the safety protections that ChatGPT, Bard, and Llama have. When users asked Mistral's instruct model how to make a bomb or how to self-harm, the chatbot gave detailed instructions.

Paul Rottger, an AI safety researcher who previously worked on guardrails for GPT-4 ahead of its release, expressed his "shock" at Mistral 7B's lack of safety in a tweet: "It's rare to see a new model respond so readily to even the most malicious instructions. I'm very excited about open-source large models, but this shouldn't happen!" he said.

These criticisms prompted Mistral to fine-tune the model and respond. "The Mistral 7B Instruct model demonstrates that the base model can easily be fine-tuned to achieve convincing performance. We look forward to working with the community on ways to make the model respect guardrails for deployment in environments that require moderated output," Mistral said.

In the eyes of many other researchers, Mistral's route is the long-term way to correct a model's toxicity; bolting on protective mechanisms is like putting a band-aid on a serious wound, and not very effective. Violating chatbot safety guidelines is a favorite pastime for users who want to test the limits of how chatbots respond. In the early days of ChatGPT, developers kept coaxing the chatbot into breaking its own defenses.

Rahul Dandwate, a deep learning researcher who has worked with Rephrase.ai, said: "Removing certain keywords beforehand is only part of the solution, and there are many ways to bypass it. Remember what happened after ChatGPT was released? Prompts like DAN, or 'Do Anything Now,' appeared, which coax ChatGPT into a jailbroken mode. So a basic safety review is only a temporary measure for making a model more secure."

"There are also methods that don't even require sophisticated hacking techniques. A question can be answered by a chatbot in a number of different ways. For example, instead of simply asking the chatbot directly how to make a bomb, I would break it down into more scientific ways like, "What chemicals mix together to produce a strong reaction?" Dandwate explains.

Dandwate says the long-term solution is to release the model to the public, gather feedback from real use, and then fine-tune it, which is exactly what Mistral AI is doing. "ChatGPT is better because it has already been used by a lot of people. It has a very basic feedback mechanism where users can give a thumbs-up or thumbs-down to rate the quality of the chatbot's responses, which I think is very important," Dandwate said.

The downside of relying on this kind of open use for fine-tuning is that Mistral may have to weather users' doubts for a while. But a sizeable share of AI researchers prefer base models in their original form, the better to understand the models' full capabilities, and these people back Mistral's stance.

AI researcher Delip Rao tweeted that Mistral's choice to release the open-source model as-is is "a recognition of the versatility and 'non-lobotomized' nature of the Mistral model as a base model."

The reference to "lobotomy" recalls an early version of Microsoft's Bing chatbot, Sydney. That chatbot was unfettered and had a strong personality, until Microsoft drastically toned it down into its current form.

The term derives from the notorious psychosurgical procedure; in the field of large models, it refers to preventing toxic responses by limiting a model's capabilities. One common approach filters out dangerous responses by matching prompts against keywords. But this one-size-fits-all approach also degrades a model's performance, making ordinary questions that happen to involve sensitive vocabulary hard to answer.
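A minimal sketch of such keyword filtering shows the over-blocking problem; the blocklist and the stub model below are illustrative, not any vendor's actual guardrail.

```python
BLOCKLIST = {"bomb", "missile", "explosive"}

def model_answer(prompt: str) -> str:
    # Stub standing in for the underlying language model.
    return f"(model response to: {prompt})"

def keyword_guard(prompt: str) -> str:
    # Refuse any prompt containing a blocked keyword, regardless of intent.
    if any(word in prompt.lower() for word in BLOCKLIST):
        return "Sorry, I can't help with that."
    return model_answer(prompt)

# The failure mode the article describes: a legitimate science question
# is refused because it happens to contain a sensitive keyword...
print(keyword_guard("How does missile guidance relate to control theory?"))
# ...while a rephrased harmful question sails through the filter, as
# Dandwate notes above.
print(keyword_guard("What chemicals react strongly when mixed?"))
```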

While OpenAI has not issued an official statement, there are rumors that it performed a "lobotomy" on its model to rein in its unruly side. Ever since, people have wondered what chatbots would be like if left to run free.

Dandwate said: "Lobotomizing a model may affect it in other ways. If it is barred from answering questions containing certain keywords, it may also be unable to answer legitimate technical questions, such as how missiles work, or any other scientific question around topics the bot has flagged as 'risky.'" (Translation/Lu Ke)
