Introduction to Statistical Machine Learning
- 2nd Edition - September 1, 2026
- Latest edition
- Authors: Masashi Sugiyama, Takashi Ishida
- Language: English
Machine learning allows computers to learn and discern patterns without being explicitly programmed. When statistical techniques are combined with machine learning, they form a powerful tool for analyzing many kinds of data across computer science and engineering areas, including image processing, speech processing, natural language processing, and robot control, as well as fundamental sciences such as biology, medicine, astronomy, physics, and materials. Introduction to Statistical Machine Learning, Second Edition provides a general introduction to machine learning that covers a wide range of topics concisely and helps readers bridge the gap between theory and practice. Parts 1 and 2 discuss the fundamental concepts of statistics and probability used to describe machine learning algorithms. Parts 3 and 4 explain the two major approaches to machine learning: generative methods and discriminative methods. Parts 5 and 6 provide an in-depth look at advanced topics that play essential roles in making machine learning algorithms more useful in practice, including building full-fledged algorithms for real-world applications drawn from research areas such as image processing, speech processing, natural language processing, and robot control, as well as biology, medicine, astronomy, physics, and materials. The algorithms developed in the book are accompanied by Python program code, giving readers the practical skills needed to accomplish a wide range of data analysis tasks. The Second Edition also includes an all-new Part 6 on Deep Learning, with chapters on Feedforward Neural Networks, Neural Networks with Image Data, Neural Networks with Sequential Data, Learning from Limited Data, Representation Learning, Deep Generative Modeling, and Multimodal Learning.
- Provides the background material needed to understand machine learning, such as statistics, probability, linear algebra, and calculus.
- Provides complete coverage of the generative approach to statistical pattern recognition and the discriminative approach to statistical machine learning.
- Includes Python program code so that readers can test the algorithms numerically and acquire both mathematical and practical skills in a wide range of data analysis tasks.
- Discusses a wide range of applications in machine learning and statistics, with examples drawn from image processing, speech processing, natural language processing, and robot control, as well as biology, medicine, astronomy, physics, and materials.
Computer science researchers, artificial intelligence researchers, and researchers and practitioners working in the fields of data science, machine learning, and optimization. The primary audience also includes data analysts, software engineers, and researchers and professionals across the many fields where machine learning is used.
Part 1. Introduction
1. Statistical Machine Learning
Part 2. Statistics and Probability
2. Random Variables and Probability Distributions
3. Examples of Discrete Probability Distributions
4. Examples of Continuous Probability Distributions
5. Multidimensional Probability Distributions
6. Examples of Multidimensional Probability Distributions
7. Sum of Independent Random Variables
8. Probability Inequalities
9. Statistical Estimation
10. Hypothesis Testing
Part 3. Generative Approach to Statistical Pattern Recognition
11. Pattern Recognition via Generative Model Estimation
12. Maximum Likelihood Estimation
13. Properties of Maximum Likelihood Estimation
14. Model Selection for Maximum Likelihood Estimation
15. Maximum Likelihood Estimation for Gaussian Mixture Models
16. Nonparametric Estimation
17. Bayesian Inference
18. Analytic Approximation of Marginal Likelihood
19. Numerical Approximation of Predictive Distribution
20. Bayesian Mixture Models
Part 4. Discriminative Approach to Statistical Machine Learning
21. Learning Models
22. Least Squares Regression
23. Constrained Least Squares Regression
24. Sparse Regression
25. Robust Regression
26. Least Squares Classification
27. Support Vector Classification
28. Probabilistic Classification
29. Structured Classification
Part 5. Further Topics
30. Ensemble Learning
31. Online Learning
32. Confidence of Prediction
33. Weakly Supervised Learning
34. Transfer Learning
35. Multitask Learning
36. Linear Dimensionality Reduction
37. Nonlinear Dimensionality Reduction
38. Clustering
39. Outlier Detection
40. Change Detection
Part 6. Deep Learning
41. Feedforward Neural Networks
42. Neural Networks with Image Data
43. Neural Networks with Sequential Data
44. Learning from Limited Data
45. Representation Learning
46. Deep Generative Modelling
47. Multimodal Learning
Review of the previous edition:
"The probabilistic and statistical background is well presented, providing the reader with a complete coverage of the generative approach to statistical pattern recognition and the discriminative approach to statistical machine learning."—Zentralblatt MATH
Masashi Sugiyama
Masashi Sugiyama received the degrees of Bachelor of Engineering, Master of Engineering, and Doctor of Engineering in Computer Science from Tokyo Institute of Technology, Japan, in 1997, 1999, and 2001, respectively. In 2001, he was appointed Assistant Professor at the same institute, and he was promoted to Associate Professor in 2003. He moved to the University of Tokyo as Professor in 2014. He received an Alexander von Humboldt Foundation Research Fellowship and conducted research at the Fraunhofer Institute, Berlin, Germany, from 2003 to 2004. In 2006, he received a European Commission Erasmus Mundus Scholarship and conducted research at the University of Edinburgh, Edinburgh, UK. He received the Faculty Award from IBM in 2007 for his contribution to machine learning under non-stationarity, the Nagao Special Researcher Award from the Information Processing Society of Japan in 2011, and the Young Scientists' Prize from the Commendation for Science and Technology by the Minister of Education, Culture, Sports, Science and Technology, Japan, for his contribution to the density-ratio paradigm of machine learning. His research interests include theories and algorithms of machine learning and data mining, and a wide range of applications such as signal processing, image processing, and robot control.
Affiliations and expertise
Professor, The University of Tokyo, Japan
Takashi Ishida
Dr. Takashi Ishida is a Lecturer in the Department of Complexity Science and Engineering, Graduate School of Frontier Sciences, The University of Tokyo. He is also affiliated with the Department of Computer Science, Graduate School of Information Science and Technology, and the Department of Information Science, Faculty of Science. Dr. Ishida received his PhD from the University of Tokyo in 2021, advised by Prof. Masashi Sugiyama. Prior to that, he received his MSc from the University of Tokyo in September 2017 and his Bachelor of Economics from Keio University in March 2013.
Affiliations and expertise
Lecturer, Department of Complexity Science and Engineering, Graduate School of Frontier Sciences, The University of Tokyo, Japan