Behind ChatGPT: Chips - ORIGCHIP

ChatGPT fever has swept the world. ChatGPT (Chat Generative Pre-trained Transformer) is a conversational AI model that OpenAI launched at the end of November 2022, and it drew widespread attention as soon as it was released. Its monthly active users were estimated to have reached 100 million in January 2023, making it the fastest-growing consumer application up to that point.

ChatGPT is built around question answering, but it differs from other question-answering AI products: it draws on the knowledge in its training data, it can generate language, and it can hold human-like conversations rather than being limited to the one-question, one-answer mode of products such as Tmall Genie and Xiaomi's XiaoAI. On top of question answering, ChatGPT can also reason, write code, create text, and more. These advantages and the resulting user experience significantly increase traffic across application scenarios.

1. Volume: AIGC brings new scenarios, and traffic in existing scenarios increases significantly

① Technical principles: ChatGPT is a conversational AI model built on the GPT-3.5 architecture. Following the GPT-1, GPT-2, and GPT-3 iterations, the GPT-3.5 models introduced training on code and instruction fine-tuning, and added RLHF (reinforcement learning from human feedback) to extend their capabilities. GPT, a well-known family of NLP models, is based on the Transformer architecture. With each iteration the number of layers has grown, and the demand for computing power has grown with it.

② Operating conditions: ChatGPT needs three things to run well: training data, model algorithms, and computing power. Of these, training data has a broad market and a low technical barrier, and it can be obtained with enough investment of people, materials, and money. Designing and optimizing the base model demands relatively little computing power, but reaching ChatGPT's capabilities requires large-scale pre-training of that base model. Its ability to store knowledge comes from 175 billion parameters, and training them takes a great deal of computing power. Computing power is therefore the key to running ChatGPT.
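
To give a sense of scale for the 175 billion parameters mentioned above, here is a minimal back-of-the-envelope sketch. Only the parameter count comes from the text. The bytes per parameter, the 80 GB of memory per GPU, the 300 billion training tokens, and the rule of thumb of roughly 6 floating-point operations per parameter per token are assumptions chosen for illustration, so the results are rough estimates rather than figures for ChatGPT itself.

```python
import math

PARAMS = 175e9  # parameter count quoted in the text

# Assumed storage formats (bytes per parameter); not taken from the article.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}

# Assumed memory per GPU in GB; 80 GB is one common A100 configuration.
GPU_MEMORY_GB = 80

# Assumed number of training tokens; an illustrative value, not from the article.
TRAINING_TOKENS = 300e9


def weights_size_gb(params: float, bytes_per_param: int) -> float:
    """Size of the raw weights in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def min_gpus_for_weights(params: float, bytes_per_param: int,
                         gpu_memory_gb: float = GPU_MEMORY_GB) -> int:
    """Lower bound on GPUs needed just to hold the weights.

    Ignores activations, gradients, and optimizer state, which in
    practice need considerably more memory than the weights alone.
    """
    return math.ceil(weights_size_gb(params, bytes_per_param) / gpu_memory_gb)


def training_flops(params: float, tokens: float) -> float:
    """Rule-of-thumb training cost: about 6 FLOPs per parameter per token."""
    return 6 * params * tokens


if __name__ == "__main__":
    for fmt, nbytes in BYTES_PER_PARAM.items():
        size = weights_size_gb(PARAMS, nbytes)
        gpus = min_gpus_for_weights(PARAMS, nbytes)
        print(f"{fmt}: {size:.0f} GB of weights, at least {gpus} GPUs to hold them")
    print(f"Estimated training compute: {training_flops(PARAMS, TRAINING_TOKENS):.2e} FLOPs")
```

Even under these simplified assumptions the weights alone do not fit on a single GPU, which is one reason training and serving a model of this size has to be spread across many accelerators.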

2. Price: demand for high-end chips will push up the average chip price

A top-end Nvidia GPU costs about 80,000 yuan, and a GPU server usually costs more than 400,000 yuan. The computing infrastructure behind ChatGPT requires at least tens of thousands of Nvidia A100 GPUs, and a single training run costs more than 12 million dollars.
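
The scale of that hardware bill can be sketched with simple arithmetic. The per-GPU price below is the figure quoted in the text. The fleet sizes are hypothetical values picked to span "tens of thousands" of GPUs, and the calculation covers the GPUs alone, leaving out servers, networking, and power.

```python
GPU_UNIT_PRICE_YUAN = 80_000  # per-GPU price quoted in the text


def gpu_fleet_cost_yuan(gpu_count: int,
                        unit_price_yuan: int = GPU_UNIT_PRICE_YUAN) -> int:
    """Purchase cost of the GPUs alone, excluding servers, networking, and power."""
    return gpu_count * unit_price_yuan


if __name__ == "__main__":
    # Hypothetical fleet sizes for illustration; not figures from the article.
    for count in (10_000, 20_000, 30_000):
        cost = gpu_fleet_cost_yuan(count)
        print(f"{count:>6,} GPUs -> {cost / 1e9:.1f} billion yuan")
```

At the quoted unit price, every additional 10,000 GPUs adds 800 million yuan to the purchase cost, before any of the surrounding infrastructure is counted.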

From the chip market's perspective, rapidly rising demand will push the average price of chips higher. OpenAI has already launched a $20/month subscription plan, the first step toward a subscription business model, which should considerably strengthen its ability to keep expanding.

The "hero behind the scenes": GPU or CPU+FPGA

1) GPUs can meet the demand for heavy computing power. Building an AI model has two stages: the first builds a pre-trained model using massive computing power and data; the second performs targeted training on top of the pre-trained model. GPUs are widely used for both because of their parallel computing capability and because they suit both training and inference. At least 10,000 Nvidia GPUs were used to train the ChatGPT model (the once-popular AlphaGo needed only 8 GPUs). Inference runs on Microsoft's Azure cloud service and also requires GPUs. ChatGPT's surge in popularity therefore translates directly into demand for GPUs.

2) CPU+FPGA remains to be seen. GPUs are the chips best suited to deep learning, but CPUs and FPGAs cannot be ignored. As programmable chips, FPGAs can be tailored to specific functions, which gives them room to contribute in the second stage of AI model construction. To be used for deep learning, an FPGA has to be paired with a CPU; together they can also meet heavy computing power requirements.

3) Cloud computing relies on optical modules to interconnect devices. As AI models evolve toward large language models, led by ChatGPT, the volume of data transmission and the demand for computing power both rise. As data transmission grows, so does demand for optical modules, which carry the interconnections between equipment in the data center. In addition, as computing power and energy consumption rise, manufacturers are looking to cut energy use, which promotes the development of low-power optical modules.

Conclusion: Whether viewed from its technical principles or its operating conditions, ChatGPT, as a new highly intelligent conversational AI product, needs strong computing power, and it is driving a significant increase in traffic across application scenarios. Its demand for high-end chips will also push up the average chip price, and rising volume combined with rising price will lead to a sharp rise in demand for chips. Facing exponential growth in demand for computing power and data transmission, manufacturers that can supply GPUs or CPU+FPGA chips, along with optical module manufacturers, will soon enter a blue ocean market.
