Machine Learning
A Bayesian and Optimization Perspective
1st Edition - March 27, 2015
Author: Sergios Theodoridis
Hardback ISBN: 9780128015223
eBook ISBN: 9780128017227
This tutorial text gives a unifying perspective on machine learning by covering both probabilistic and deterministic approaches, which are based on optimization techniques, together with the Bayesian inference approach, whose essence lies in the use of a hierarchy of probabilistic models. The book presents the major machine learning methods as they have been developed in different disciplines, such as statistics, statistical and adaptive signal processing, and computer science. Focusing on the physical reasoning behind the mathematics, it explains the various methods and techniques in depth, supported by examples and problems, providing an invaluable resource for students and researchers who want to understand and apply machine learning concepts.
The book builds carefully from the basic classical methods to the most recent trends, with chapters written to be as self-contained as possible, making the text suitable for different courses: pattern recognition, statistical/adaptive signal processing, statistical/Bayesian learning, as well as short courses on sparse modeling, deep learning, and probabilistic graphical models.
All major classical techniques: Mean/Least-Squares regression and filtering, Kalman filtering, stochastic approximation and online learning, Bayesian classification, decision trees, logistic regression and boosting methods.
The latest trends: Sparsity, convex analysis and optimization, online distributed algorithms, learning in RKH spaces, Bayesian inference, graphical and hidden Markov models, particle filtering, deep learning, dictionary learning and latent variables modeling.
Case studies (protein folding prediction, optical character recognition, text authorship identification, fMRI data analysis, change point detection, hyperspectral image unmixing, target localization, channel equalization and echo cancellation) show how the theory can be applied.
MATLAB code for all the main algorithms is available on an accompanying website, enabling the reader to experiment with the code.
University researchers, R&D engineers, and graduate students taking a machine learning course.
Preface
Acknowledgments
Notation
Dedication
Chapter 1: Introduction
Abstract
1.1 What Machine Learning is About
1.2 Structure and a Road Map of the Book
Chapter 2: Probability and Stochastic Processes
Abstract
2.1 Introduction
2.2 Probability and Random Variables
2.3 Examples of Distributions
2.4 Stochastic Processes
2.5 Information Theory
2.6 Stochastic Convergence
Problems
Chapter 3: Learning in Parametric Modeling: Basic Concepts and Directions
Abstract
3.1 Introduction
3.2 Parameter Estimation: The Deterministic Point of View
3.3 Linear Regression
3.4 Classification
3.5 Biased Versus Unbiased Estimation
3.6 The Cramér-Rao Lower Bound
3.7 Sufficient Statistic
3.8 Regularization
3.9 The Bias-Variance Dilemma
3.10 Maximum Likelihood Method
3.11 Bayesian Inference
3.12 Curse of Dimensionality
3.13 Validation
3.14 Expected and Empirical Loss Functions
3.15 Nonparametric Modeling and Estimation
Problems
Chapter 4: Mean-Square Error Linear Estimation
Abstract
4.1 Introduction
4.2 Mean-Square Error Linear Estimation: The Normal Equations
Chapter 5: Stochastic Gradient Descent: The LMS Algorithm and its Family
Abstract
5.1 Introduction
5.2 The Steepest Descent Method
5.3 Application to the Mean-Square Error Cost Function
5.4 Stochastic Approximation
5.5 The Least-Mean-Squares Adaptive Algorithm
5.6 The Affine Projection Algorithm
5.7 The Complex-Valued Case
5.8 Relatives of the LMS
5.9 Simulation Examples
5.10 Adaptive Decision Feedback Equalization
5.11 The Linearly Constrained LMS
5.12 Tracking Performance of the LMS in Nonstationary Environments
5.13 Distributed Learning: The Distributed LMS
5.14 A Case Study: Target Localization
5.15 Some Concluding Remarks: Consensus Matrix
Problems
MATLAB Exercises
Chapter 6: The Least-Squares Family
Abstract
6.1 Introduction
6.2 Least-Squares Linear Regression: A Geometric Perspective
6.3 Statistical Properties of the LS Estimator
6.4 Orthogonalizing the Column Space of X: The SVD Method
6.5 Ridge Regression
6.6 The Recursive Least-Squares Algorithm
6.7 Newton’s Iterative Minimization Method
6.8 Steady-State Performance of the RLS
6.9 Complex-Valued Data: The Widely Linear RLS
6.10 Computational Aspects of the LS Solution
6.11 The Coordinate and Cyclic Coordinate Descent Methods
6.12 Simulation Examples
6.13 Total-Least-Squares
Problems
Chapter 7: Classification: A Tour of the Classics
Abstract
7.1 Introduction
7.2 Bayesian Classification
7.3 Decision (Hyper)Surfaces
7.4 The Naive Bayes Classifier
7.5 The Nearest Neighbor Rule
7.6 Logistic Regression
7.7 Fisher’s Linear Discriminant
7.8 Classification Trees
7.9 Combining Classifiers
7.10 The Boosting Approach
7.11 Boosting Trees
7.12 A Case Study: Protein Folding Prediction
Problems
Chapter 8: Parameter Learning: A Convex Analytic Path
Abstract
8.1 Introduction
8.2 Convex Sets and Functions
8.3 Projections onto Convex Sets
8.4 Fundamental Theorem of Projections onto Convex Sets
8.5 A Parallel Version of POCS
8.6 From Convex Sets to Parameter Estimation and Machine Learning
8.7 Infinite Many Closed Convex Sets: The Online Learning Case
8.8 Constrained Learning
8.9 The Distributed APSM
8.10 Optimizing Nonsmooth Convex Cost Functions
8.11 Regret Analysis
8.12 Online Learning and Big Data Applications: A Discussion
8.13 Proximal Operators
8.14 Proximal Splitting Methods for Optimization
Problems
MATLAB Exercises
8.15 Appendix to Chapter 8
Chapter 9: Sparsity-Aware Learning: Concepts and Theoretical Foundations
Abstract
9.1 Introduction
9.2 Searching for a Norm
9.3 The Least Absolute Shrinkage and Selection Operator (LASSO)
9.4 Sparse Signal Representation
9.5 In Search of the Sparsest Solution
9.6 Uniqueness of the ℓ0 Minimizer
9.7 Equivalence of ℓ0 and ℓ1 Minimizers: Sufficiency Conditions
9.8 Robust Sparse Signal Recovery from Noisy Measurements
9.9 Compressed Sensing: The Glory of Randomness
9.10 A Case Study: Image De-Noising
Problems
Chapter 10: Sparsity-Aware Learning: Algorithms and Applications
Abstract
10.1 Introduction
10.2 Sparsity-Promoting Algorithms
10.3 Variations on the Sparsity-Aware Theme
10.4 Online Sparsity-Promoting Algorithms
10.5 Learning Sparse Analysis Models
10.6 A Case Study: Time-Frequency Analysis
10.7 Appendix to Chapter 10: Some Hints from the Theory of Frames
Problems
Chapter 11: Learning in Reproducing Kernel Hilbert Spaces
Abstract
11.1 Introduction
11.2 Generalized Linear Models
11.3 Volterra, Wiener, and Hammerstein Models
11.4 Cover’s Theorem: Capacity of a Space in Linear Dichotomies
11.5 Reproducing Kernel Hilbert Spaces
11.6 Representer Theorem
11.7 Kernel Ridge Regression
11.8 Support Vector Regression
11.9 Kernel Ridge Regression Revisited
11.10 Optimal Margin Classification: Support Vector Machines
Chapter 16: Probabilistic Graphical Models: Part II
Abstract
16.1 Introduction
16.2 Triangulated Graphs and Junction Trees
16.3 Approximate Inference Methods
16.4 Dynamic Graphical Models
16.5 Hidden Markov Models
16.6 Beyond HMMs: A Discussion
16.7 Learning Graphical Models
Problems
Chapter 17: Particle Filtering
Abstract
17.1 Introduction
17.2 Sequential Importance Sampling
17.3 Kalman and Particle Filtering
17.4 Particle Filtering
Problems
Chapter 18: Neural Networks and Deep Learning
Abstract
18.1 Introduction
18.2 The Perceptron
18.3 Feed-Forward Multilayer Neural Networks
18.4 The Backpropagation Algorithm
18.5 Pruning the Network
18.6 Universal Approximation Property of Feed-Forward Neural Networks
18.7 Neural Networks: A Bayesian Flavor
18.8 Learning Deep Networks
18.9 Deep Belief Networks
18.10 Variations on the Deep Learning Theme
18.11 Case Study: A Deep Network for Optical Character Recognition
18.12 Case Study: A Deep Autoencoder
18.13 Example: Generating Data via a DBN
Problems
MATLAB Exercises
Chapter 19: Dimensionality Reduction and Latent Variables Modeling
Abstract
19.1 Introduction
19.2 Intrinsic Dimensionality
19.3 Principal Component Analysis
19.4 Canonical Correlation Analysis
19.5 Independent Component Analysis
19.6 Dictionary Learning: The k-SVD Algorithm
19.7 Nonnegative Matrix Factorization
19.8 Learning Low-Dimensional Models: A Probabilistic Perspective
19.9 Nonlinear Dimensionality Reduction
19.10 Low-Rank Matrix Factorization: A Sparse Modeling Path
19.11 A Case Study: fMRI Data Analysis
Problems
Appendix A: Linear Algebra
A.1 Properties of Matrices
A.2 Positive Definite and Symmetric Matrices
A.3 Wirtinger Calculus
Appendix B: Probability Theory and Statistics
B.1 Cramér-Rao Bound
B.2 Characteristic Functions
B.3 Moments and Cumulants
B.4 Edgeworth Expansion of a pdf
Appendix C: Hints on Constrained Optimization
C.1 Equality Constraints
C.2 Inequality Constraints
Index
No. of pages: 1062
Language: English
Published: March 27, 2015
Imprint: Academic Press
Hardback ISBN: 9780128015223
eBook ISBN: 9780128017227
Sergios Theodoridis
Sergios Theodoridis is professor of machine learning and signal processing with the National and Kapodistrian University of Athens, Athens, Greece and with the Chinese University of Hong Kong, Shenzhen, China.
He has received a number of prestigious awards, including the 2014 IEEE Signal Processing Magazine Best Paper Award, the 2009 IEEE Computational Intelligence Society Transactions on Neural Networks Outstanding Paper Award, the 2017 European Association for Signal Processing (EURASIP) Athanasios Papoulis Award, the 2014 IEEE Signal Processing Society Education Award, and the 2014 EURASIP Meritorious Service Award. He has served as president of EURASIP, as vice president for the IEEE Signal Processing Society, and as Editor-in-Chief of the IEEE Transactions on Signal Processing. He is a Fellow of EURASIP and a Life Fellow of IEEE.
He is the coauthor of the best-selling book Pattern Recognition, 4th edition, Academic Press, 2009, and of the book Introduction to Pattern Recognition: A MATLAB Approach, Academic Press, 2010.