
ChatGPT: Principles and Architecture
- 1st Edition - April 9, 2025
- Imprint: Elsevier
- Author: Ge Cheng
- Language: English
- Paperback ISBN: 978-0-443-27436-7
- eBook ISBN: 978-0-443-27437-4

ChatGPT: Principles and Architecture bridges the knowledge gap between theoretical AI concepts and their practical applications. It equips industry professionals and researchers with a deeper understanding of large language models, enabling them to effectively leverage these technologies in their respective fields. In addition, it tackles the complexity of understanding large language models and their practical applications by demystifying the underlying technologies and strategies used in developing ChatGPT and similar models. By combining theoretical knowledge with real-world examples, the book enables readers to grasp the nuances of AI technologies, thus paving the way for innovative applications and solutions in their professional domains.
Sections focus on the principles, architecture, pretraining, transfer learning, and middleware programming techniques of ChatGPT, providing a useful resource for the research and academic communities. The book is ideal for industry professionals, researchers, and students in AI and computer science who face daily challenges in understanding and implementing complex large language model technologies.
- Offers comprehensive insights into the principles and architecture of ChatGPT, helping readers understand the intricacies of large language models
- Details large language model technologies, covering key aspects such as pretraining, transfer learning, middleware programming, and addressing technical aspects in an accessible manner
- Includes real-world examples and case studies, illustrating how large language models can be applied in various industries and professional settings
- Explores future developments and potential innovations in the field of large language models, preparing readers for upcoming changes and technological advancements
Researchers in artificial intelligence (AI) and related disciplines who use large language models, engineers dealing with large-scale data processing and analysis, and AI product managers seeking an up-to-date, in-depth yet accessible understanding of the principles, mechanisms, and architecture of large language models such as ChatGPT and their application to everyday life and work. The book also serves IT professionals, software developers, those working to enhance security and privacy in data management, and technical managers in industries such as healthcare, education, and finance, as well as the governmental, administrative, legal, and technology sectors, where AI and machine learning are becoming increasingly relevant and a resource is needed to support understanding of current AI and its implementation in various professional contexts.
- Title of Book
- Cover image
- Title page
- Table of Contents
- Copyright
- Preface
- Main Content of the Book
- Target Audience for This Book
- Contact the Author
- Acknowledgments
- Chapter 1. A new milestone in artificial intelligence—ChatGPT
- Abstract
- 1.1 The development history of ChatGPT
- 1.2 The capability level of ChatGPT
- 1.3 The technical evolution of large language models
- 1.4 The technology stack of large language models
- 1.5 The impact of large language models
- 1.6 The challenges of training or deploying large models
- 1.7 The limitations of large language models
- 1.8 Summary
- Chapter 2. In-depth understanding of the transformer model
- Abstract
- 2.1 Introduction to the transformer model
- 2.2 Self-attention mechanism
- 2.3 Multihead attention mechanism
- 2.4 Feedforward neural network
- 2.5 Residual connection
- 2.6 Layer normalization
- 2.7 Position encoding
- 2.8 Training and optimization
- 2.9 Summary
- Chapter 3. Generative pretraining
- Abstract
- 3.1 Introduction to generative pretraining
- 3.2 Generative pretraining model
- 3.3 The generative pretraining process
- 3.4 Supervised fine-tuning
- 3.5 Summary
- Chapter 4. Unsupervised multitask and zero-shot learning
- Abstract
- 4.1 Encoder and decoder
- 4.2 GPT-2
- 4.3 Unsupervised multitask learning
- 4.4 The relationship between multitask and zero-shot learning
- 4.5 The autoregressive generation process of GPT-2
- 4.6 Summary
- Chapter 5. Sparse attention and content-based learning
- Abstract
- 5.1 GPT-3
- 5.2 The sparse transformer
- 5.3 Meta-learning and in-context learning
- 5.4 Bayesian inference of concept distributions
- 5.5 Thought chains
- 5.6 Summary
- Chapter 6. Pretraining strategies for large language models
- Abstract
- 6.1 Pretraining datasets
- 6.2 Processing of pretraining data
- 6.3 Distributed training patterns
- 6.4 Technical approaches to distributed training
- 6.5 Examples of training strategies
- 6.6 Summary
- Chapter 7. Proximal policy optimization
- Abstract
- 7.1 Traditional policy gradient methods
- 7.2 Actor-Critic
- 7.3 Trust region policy optimization
- 7.4 Principles of the proximal policy optimization algorithm
- 7.5 Summary
- Chapter 8. Human feedback reinforcement learning
- Abstract
- 8.1 Reinforcement learning in ChatGPT
- 8.2 InstructGPT training dataset
- 8.3 Training stages of human feedback reinforcement learning
- 8.4 Reward modeling algorithms
- 8.5 PPO in InstructGPT
- 8.6 Multiturn dialogue capability
- 8.7 The necessity of human feedback reinforcement learning
- 8.8 Summary
- Chapter 9. Low-resource domain transfer of large language models
- Abstract
- 9.1 Self-instruct
- 9.2 Constitutional artificial intelligence
- 9.3 Low-rank adaptation
- 9.4 Quantization
- 9.5 SparseGPT
- 9.6 Case studies
- 9.7 Summary
- Chapter 10. Middleware
- Abstract
- 10.1 LangChain
- 10.2 AutoGPT
- 10.3 Competitors in middleware frameworks
- 10.4 Summary
- Chapter 11. The future path of large language models
- Abstract
- 11.1 The path to strong artificial intelligence
- 11.2 Data resource depletion
- 11.3 Limitations of autoregressive models
- 11.4 Embodied intelligence
- 11.5 Summary
- Index
- Edition: 1
- Published: April 9, 2025
- Imprint: Elsevier
- No. of pages: 300
- Language: English
- Paperback ISBN: 9780443274367
- eBook ISBN: 9780443274374
Ge Cheng
Affiliations and expertise
Dr Cheng is the Deputy Director of the Technology Transfer Center at Xiangtan University and the Vice Dean of the JD Intelligent City and Big Data Research Institute in Xiangtan, China.