Baichuan vs. Zhipu: Who Is China's OpenAI?
Article source: Light Cone Intelligence
Text: Hao Xin
Editor: Liu Yuqi
In early June, foreign media posed a pointed question: "Who is China's OpenAI?" After the wave of large-model entrepreneurship washed away the sand, only a few contenders remain.
A few intersections from Tsinghua University, the Xaar Building houses star entrepreneur Wang Xiaochuan's Baichuan Intelligence, while the Sohu Internet Building hosts the academia-born Zhipu AI. Having passed the market's test, they have become the two most promising candidates.
A contest between the two buildings seems to have quietly begun.
**In terms of financing, both Zhipu AI and Baichuan Intelligence have completed multiple large funding rounds this year.**
Zhipu AI's cumulative financing this year exceeds 2.5 billion yuan, while Baichuan Intelligence has raised a total of 350 million US dollars (about 2.3 billion yuan). According to public information, **Zhipu AI's latest valuation has surpassed 10 billion yuan, possibly as high as 15 billion, making it one of the fastest domestic companies to cross the 10-billion-yuan mark;** after its latest round, Baichuan Intelligence is valued at more than 1 billion US dollars (about 6.6 billion yuan).
In terms of team composition, Zhipu AI and Baichuan Intelligence come from the same school: Zhipu AI president Wang Shaolan and Sogou founder Wang Xiaochuan both lead Tsinghua-rooted entrepreneurial teams.
**In the speed of technological catch-up, the two are also evenly matched.** Zhipu AI's GLM-130B outperformed GPT-3 upon release, and the newly released Baichuan 2 leads Llama 2 across all dimensions, pioneering the development of China's open-source ecosystem.
All signs show that Zhipu AI and Baichuan Intelligence have become the "dark horses" of China's large-model race. Under fierce competition, who will prevail?
A Believer in OpenAI: Zhipu AI
Zhipu AI's relationship with OpenAI dates back to 2020, the year that Zhipu AI CEO Zhang Peng regards as the true "year one" of large language models.
At Zhipu AI's anniversary that year, beneath the celebratory atmosphere one could sense anxiety brought on by the birth of GPT-3. With 175 billion parameters, GPT-3 was the first large language model in the strict sense.
Zhang Peng was shocked by GPT-3's emergent abilities, but also wrestled with the question of "whether to follow." Then as now, betting everything on ultra-large-scale models was extremely risky. After weighing the options, Zhipu AI decided to benchmark against OpenAI and invest in developing ultra-large-scale pre-trained models.
**In its choice of technical path, Zhipu AI shows the same independent thinking as OpenAI.**
At the time, several large-model pre-training frameworks existed, including BERT, GPT, and T5. The three paths each have advantages and disadvantages in training objective, model structure, training data sources, and model size.
If large-model pre-training is compared to an English exam: BERT answers questions by grasping the relationships between words and sentences, passing through comprehension, with review material drawn mainly from textbooks and Wikipedia; GPT answers by predicting the next word, preparing through extensive writing practice, with review material drawn from a wide variety of web pages; T5 takes a unify-the-format strategy, first restating every question in the same form before solving it, and reviews by not only reading the textbook but also drilling a large question bank.
As is well known, Google chose BERT and OpenAI chose GPT. **Zhipu AI did not blindly follow either: building on both routes, it proposed the GLM (General Language Model) framework, which combines the strengths of BERT and GPT so the model "can understand context while also continuing text and filling in blanks."**
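The difference between the three objectives can be sketched with a toy example. This is illustrative pseudof-training data only, not real training code, and the token choices are the author's own:

```python
# Toy illustration of the three pre-training objectives on one sentence.
tokens = ["the", "cat", "sat", "on", "the", "mat"]

# BERT-style masked LM: hide a token, predict it using context on BOTH sides.
masked = tokens.copy()
masked[2] = "[MASK]"                      # the cat [MASK] on the mat
bert_target = {2: "sat"}

# GPT-style causal LM: at every position, predict the NEXT token from the left.
gpt_pairs = [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

# GLM-style autoregressive blank infilling: blank out a whole SPAN, then
# generate that span left-to-right conditioned on the corrupted sentence.
# This is what lets GLM both understand context (like BERT) and continue
# text (like GPT).
corrupted = tokens[:2] + ["[BLANK]"] + tokens[4:]   # the cat [BLANK] the mat
glm_target = tokens[2:4]                            # the span to regenerate
```

The GLM objective subsumes the other two: a blank of length one resembles masking, and a blank that runs to the end of the sentence resembles next-token continuation.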
GLM thus became Zhipu AI's biggest source of confidence in its pursuit of OpenAI. Under this framework, the GLM series of models grew in succession: GLM-130B, ChatGLM-6B, and ChatGLM2-6B. Experimental data show the GLM series outperforms GPT in language-understanding accuracy, inference speed, memory footprint, and ease of adapting the large model to applications.
OpenAI is currently the most complete provider of foundation-model services abroad, and its commercialization falls into two categories: API fees and ChatGPT subscription fees. In commercialization, Zhipu AI follows the same general playbook, placing it among the domestic large-model companies with relatively mature monetization.
**According to Light Cone Intelligence's analysis, and taking into account how Chinese enterprises deploy models, Zhipu AI's business model divides into API fees and private-deployment fees.**
The models offered span language models, super-anthropomorphic models, vector (embedding) models, and code models; each comes with standard pricing, cloud-private pricing, and on-premises private pricing. Compared with OpenAI, Zhipu AI lacks speech and image model services, but adds super-anthropomorphic models, catering to demand from China's digital-human and intelligent-NPC industries.
Light Cone Intelligence learned from a developer that "at present, Baidu's Wenxin Qianfan platform is the most complete, Tongyi Qianwen is the most flexible, and Zhipu AI is among the cheapest mainstream vendors for API fees."
ChatGLM-Pro is priced at 0.01 yuan per thousand tokens (with 18 yuan of free credit), and ChatGLM-Lite drops to 0.002 yuan per thousand tokens. For reference, OpenAI's GPT-3.5 charges about 0.014 yuan, Alibaba's Tongyi Qianwen-turbo 0.012 yuan, and Baidu's ERNIE-Bot-turbo 0.008 yuan per thousand tokens.
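A back-of-the-envelope comparison makes the gap concrete. The prices below are the per-thousand-token figures quoted above, as reported at publication time; they may have changed since:

```python
# Per-thousand-token API prices (CNY) as quoted in the article.
PRICE_PER_1K_CNY = {
    "ChatGLM-Pro": 0.010,
    "ChatGLM-Lite": 0.002,
    "GPT-3.5": 0.014,
    "Tongyi Qianwen-turbo": 0.012,
    "ERNIE-Bot-turbo": 0.008,
}

def cost_cny(model: str, tokens: int) -> float:
    """Cost in yuan to process `tokens` tokens on `model`."""
    return tokens / 1000 * PRICE_PER_1K_CNY[model]

# Processing 1 million tokens on each service, cheapest first:
for model, _ in sorted(PRICE_PER_1K_CNY.items(), key=lambda kv: kv[1]):
    print(f"{model}: {cost_cny(model, 1_000_000):.2f} yuan")
```

At these rates, a million-token workload on ChatGLM-Lite costs a seventh of the same workload on GPT-3.5.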
As Zhang Peng put it, having used OpenAI as its benchmark, Zhipu AI is now entering a new stage of "no longer following OpenAI."
In product strategy, unlike OpenAI, which focuses solely on upgrading and building out ChatGPT, Zhipu AI has chosen to advance on three fronts.
According to its official website, Zhipu AI's current business divides into three blocks: the large-model MaaS platform, the AMiner technology-intelligence platform, and cognitive digital humans. These have produced three AI product matrices: large-model products, AMiner products, and digital-human products. The large-model products cover not only basic chatbots but also bots for programming, writing, and painting.
Meanwhile, Zhipu AI continues to explore the application side through investment. To date, it has invested in Lingxin Intelligence and Painting Wall Intelligence, and increased its stake in Lingxin Intelligence again in September this year.
Lingxin Intelligence was also incubated in Tsinghua University's Department of Computer Science. Although the two share academic roots, Lingxin leans toward applications: its AiU interest-based interactive community is built on Zhipu AI's super-anthropomorphic large model. Its product approach resembles Character.AI abroad: users create and chat with AI characters of different personalities, a consumer-facing play that emphasizes entertainment.
**From OpenAI to Llama: Baichuan Intelligence**
Light Cone Intelligence found that, compared with OpenAI, Baichuan Intelligence more closely resembles Llama.
**First, building on its existing technology and experience, its release and iteration pace is very fast.**
Within half a year of its founding, Baichuan Intelligence released four open-source commercial models (Baichuan-7B/13B and Baichuan2-7B/13B) and two closed-source large models (Baichuan-53B and Baichuan2-53B). By the time the Baichuan2-53B API opened on September 25, over the preceding 168 days Baichuan had released, on average, one large model every 28 days.
Meta won back its AI standing with Llama 2, and Baichuan Intelligence made its name by beating Llama 2 with the Baichuan 2 series of open-source models.
According to test results, Baichuan2-7B-Base and Baichuan2-13B-Base outperform Llama 2 on several authoritative benchmarks such as MMLU, CMMLU, and GSM8K, and also compare favorably with other models of the same parameter scale.
The Baichuan large models have indeed stood the test. According to official figures, Baichuan models have been downloaded more than 5 million times in the open-source community, with monthly downloads exceeding 3 million.
Light Cone Intelligence found that the most-downloaded Baichuan model on Hugging Face exceeds 110,000 downloads, which remains competitive among Chinese and foreign open-source models.
Its open-source advantage also stems from strong compatibility. Baichuan Intelligence has said publicly that its entire model architecture closely follows Meta's Llama, which makes it very friendly to enterprises and vendors by design.
"After going open source, the ecosystem will be built around Llama. Many overseas open-source projects follow Llama, which is why our architecture is closer to Llama's," Wang Xiaochuan said.
According to Light Cone Intelligence, Baichuan adopted a hot-swappable architecture design that supports freely exchanging modules between Baichuan and Llama models: for example, a model trained with Llama can be dropped directly into Baichuan without modification. This also explains why most Internet companies now use Baichuan models, and why cloud vendors have onboarded the Baichuan series.
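The compatibility idea can be sketched in a framework-free way. The class and parameter names below are illustrative inventions, not Baichuan's real interfaces; the point is only that when two model families expose an identical parameter layout, a checkpoint trained in one loads into the other without a conversion script:

```python
# A stand-in "checkpoint" in a Llama-style parameter layout (toy values).
llama_style_checkpoint = {
    "layers.0.attn.weight": [[0.1, 0.2], [0.3, 0.4]],
    "layers.0.mlp.weight": [[0.5], [0.6]],
}

class CompatibleModel:
    """Stand-in for a model whose layer layout matches the checkpoint's."""

    EXPECTED_KEYS = frozenset({"layers.0.attn.weight", "layers.0.mlp.weight"})

    def __init__(self):
        self.params = {}

    def load_state_dict(self, state: dict) -> None:
        # Refuse checkpoints whose layout does not line up with ours.
        missing = self.EXPECTED_KEYS - state.keys()
        unexpected = state.keys() - self.EXPECTED_KEYS
        if missing or unexpected:
            raise ValueError(f"incompatible checkpoint: {missing or unexpected}")
        self.params = dict(state)

model = CompatibleModel()
model.load_state_dict(llama_style_checkpoint)  # drops in, no conversion needed
```

A model with a different layout would fail the key check, which is exactly the friction Baichuan's Llama-like design avoids for downstream adopters.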
The road history has traveled leads both to the past and to the future, and so it is with Wang Xiaochuan's large-model venture.
With his standing as Sogou's founder and his search-technology background, Wang Xiaochuan heard the same assessment from many people early in the venture: "Xiaochuan is the best fit for large models."
**Building large models on search experience and frameworks has become Baichuan Intelligence's defining trait.**
Chen Weipeng, a Baichuan Intelligence co-founder, once said that search R&D has much in common with large-model development: "Baichuan quickly transferred its search experience to large-model R&D, which resembles a systematic 'rocket-building' project: decomposing a complex system, promoting team collaboration, and improving team effectiveness through process evaluation."
Wang Xiaochuan also said at the launch event: "Because Baichuan Intelligence has search genes, it naturally knows how to select the best pages out of trillions of web pages, and how to deduplicate and filter spam. In data processing, Baichuan also draws on its search experience and can finish cleaning and deduplicating hundreds of billions of data items within an hour."
This search core is on full display in Baichuan-53B. To address large models' "hallucination" problem, Baichuan has combined its accumulated search technology to optimize information acquisition, data quality, and search augmentation.
Although it started with open source, Baichuan Intelligence has begun exploring commercialization. Officially, Baichuan's goal is to "build China's best large-model foundation," with vertical ambitions in search, multimodality, education, healthcare, and other fields.
Commercialization today centers on Baichuan2-53B. The official website shows that the model's API adopts time-of-day pricing: 0.01 yuan per thousand tokens from 00:00 to 08:00, and 0.02 yuan per thousand tokens from 08:00 to 24:00, making daytime calls twice as expensive as night-time ones.
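The time-of-day schedule can be expressed in a few lines. The function name is the author's own; the rates are those quoted above as of publication and may have changed since:

```python
def baichuan2_53b_cost(hour: int, tokens: int) -> float:
    """Cost in yuan for `tokens` tokens in an API call made at `hour` (0-23),
    using the quoted Baichuan2-53B time-of-day rates (CNY per 1k tokens)."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    # Off-peak 00:00-08:00 at 0.01 yuan/1k tokens, peak 08:00-24:00 at 0.02.
    rate = 0.01 if hour < 8 else 0.02
    return tokens / 1000 * rate

# The same 100k-token workload costs half as much if shifted off-peak:
night = baichuan2_53b_cost(3, 100_000)
day = baichuan2_53b_cost(14, 100_000)
```

For batch workloads that are not latency-sensitive, scheduling calls into the off-peak window halves the bill.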
End
Debating who is China's OpenAI means little this early in large-model development. Startups like Zhipu AI and Baichuan Intelligence have realized that blindly following OpenAI's footsteps is unwise; Zhipu AI, for instance, has made explicit its technical stance of "not building a Chinese GPT." Moreover, with open source gaining momentum and closing in from all sides, OpenAI's absolute technical lead no longer looks unbreakable.
Both Zhipu AI and Baichuan Intelligence have said that super applications are the broader market, and also the comfort zone of Chinese large-model companies, so neither is standing still. For example, a person close to Zhipu AI told the media that the team has firmly committed to a 2B route aimed at the xinchuang (IT application innovation) market, and within five months rapidly expanded from 200 to 500 people to staff its future 2B business.
On the commercialization path, Baichuan Intelligence has chosen to reference Llama 2's open-source ecosystem and has likewise begun iterating in small steps.
Visibly, in just half a year, Baichuan Intelligence and Zhipu AI have crossed the technological no-man's land and reached the commercialization stage of industrial deployment. By contrast, in the AI 1.0 entrepreneurial boom, the technology-polishing period lasted a full three years (2016-2019), and it was precisely because commercial deployment stalled that a large batch of AI companies collectively declined in 2022, falling just before dawn.
Having learned from that earlier stage, and because the generality of large-model technology makes deployment easier, startups represented by Baichuan Intelligence and Zhipu AI are mustering their forces, building reserves of technology, products, and talent for the next stage.
But only the starting gun of the marathon has sounded, and it is far too early to call the outcome. At least the first leg of the race is now defined, and with the goal clear, the contest becomes one of patience and perseverance. That is true for Baichuan Intelligence, Zhipu AI, and OpenAI alike.