Contents:


Chapter 1: Introduction

  • Motivation towards statistical learning and belief in data.
  • What's next.

Chapter 2: Overview of Supervised Learning

  • Variable types and terminology
    • Quantitative vs Qualitative output.
    • Regression and Classification
  • Simple approaches : Least Squares and Nearest Neighbors
    • Linear Models and Least Squares
      \(\hat Y = \hat \beta_0 + \sum_{j=1}^pX_j\hat\beta_j\)
      • Least squares by solving normal equations.
    • Nearest Neighbor Methods
      • Voronoi tessellation
    • From Least Squares to Nearest Neighbors
  • Statistical Decision Theory
  • Local Methods in High Dimensions
    • The curse of dimensionality (Bellman)
  • Statistical Models, Supervised Learning and Function Approximation
    • A Statistical Model for the Joint Distribution Pr(X, Y )
    • Supervised Learning
    • Function Approximation
  • Structured Regression Models
    • Difficulty of the Problem
  • Classes of Restricted Estimators
    • Roughness Penalty and Bayesian Methods
      • regularization
    • Kernel Methods and Local Regression
    • Basis Functions and Dictionary Methods
  • Model Selection and the Bias–Variance Tradeoff
    Bias–Variance
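
A short worked derivation for the "Least squares by solving normal equations" item above. This is a sketch under the usual assumptions: \(X\) is the \(N \times (p+1)\) input matrix with a leading column of ones for the intercept, \(y\) is the vector of outputs, and \(X^TX\) is nonsingular.

\[
\mathrm{RSS}(\beta) = (y - X\beta)^T (y - X\beta)
\]
\[
\frac{\partial\,\mathrm{RSS}}{\partial \beta} = -2X^T(y - X\beta) = 0
\quad\Longrightarrow\quad
X^TX\,\hat\beta = X^Ty
\quad\Longrightarrow\quad
\hat\beta = (X^TX)^{-1}X^Ty
\]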

Chapter 3: Linear Methods for Regression

  • Introduction
  • Linear Regression Models and Least Squares
    • Solution from the normal equations
    • F statistic
    • Example : Prostate Cancer
    • The Gauss-Markov Theorem
      • Proof that the least squares estimate of the parameters \(\beta\) has the smallest variance among all linear unbiased estimates.
    • Multiple Regression from Simple Univariate Regression (Algorithm 3.1)
    • Multiple Outputs
  • Subset Selection
    • Best-Subset Selection
    • Forward and Backward-Stepwise Selection
    • Forward-Stagewise Selection
    • Example : Prostate Cancer (Continued)
  • Shrinkage Methods
    • Ridge Regression : L2 regularization
    • The Lasso : L1 regularization
    • Discussion : Subset Selection, Ridge Regression and the Lasso
    • Least Angle Regression
  • Methods Using Derived Input Directions
    • Principal Components Regression
    • Partial Least Squares
  • Discussion : A Comparison of Selection and Shrinkage Methods
  • Multiple Outcome Shrinkage and Selection ☠
  • More on Lasso and Related Path Algorithms ☠
    • Incremental Forward Stagewise Regression
    • Piecewise-Linear Path Algorithms
    • The Dantzig selector
    • The Grouped Lasso
    • Further Properties of Lasso
    • Pathwise Coordinate Optimization
  • Computational Considerations
    • Least squares fitting is usually done via the Cholesky decomposition of the matrix \(X^TX\) or a QR decomposition of \(X\).
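
The Cholesky-based fit mentioned under Computational Considerations can be sketched in a few lines of plain Python. This is an illustrative sketch only: the function names `cholesky` and `least_squares_cholesky` and the toy data are assumptions made for this example, and in practice one would normally call a numerical library routine instead. It assumes \(X^TX\) is positive definite, i.e. that \(X\) has full column rank.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L L^T = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


def least_squares_cholesky(x_rows, y):
    """Solve the normal equations X^T X beta = X^T y using a Cholesky factor."""
    n, p = len(x_rows), len(x_rows[0])
    # Gram matrix X^T X and right-hand side X^T y
    xtx = [[sum(x_rows[r][i] * x_rows[r][j] for r in range(n)) for j in range(p)]
           for i in range(p)]
    xty = [sum(x_rows[r][i] * y[r] for r in range(n)) for i in range(p)]
    low = cholesky(xtx)
    # Forward substitution: L z = X^T y
    z = [0.0] * p
    for i in range(p):
        z[i] = (xty[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    # Back substitution: L^T beta = z
    beta = [0.0] * p
    for i in reversed(range(p)):
        beta[i] = (z[i] - sum(low[k][i] * beta[k] for k in range(i + 1, p))) / low[i][i]
    return beta


# Toy data lying exactly on the line y = 1 + 2x; the first column is the intercept.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
beta_hat = least_squares_cholesky(X, y)
print(beta_hat)
```

Because the toy data are noise-free, the fitted coefficients should recover the intercept and slope of the generating line up to floating-point error.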

Chapter 4: Linear Methods for Classification

  • Introduction
  • Linear Regression of an Indicator Matrix
  • Linear Discriminant Analysis
    • Regularized Discriminant Analysis
    • Computations for LDA
    • Reduced-Rank Linear Discriminant Analysis
  • Logistic Regression
    • Fitting Logistic Regression Models
    • Example : South African Heart Disease
    • Quadratic Approximations and Inference
    • \(L_1\) Regularized Logistic Regression
    • Logistic Regression or LDA?
  • Separating Hyperplanes
    • Rosenblatt’s Perceptron Learning Algorithm
    • Optimal Separating Hyperplanes ☠

Chapter 5: Basis Expansions and Regularization

  • Introduction
  • Piecewise Polynomials and Splines
    • Natural Cubic Splines
    • Example: South African Heart Disease (Continued)
    • Example: Phoneme Recognition
  • Filtering and Feature Extraction
  • Smoothing Splines
    • Degrees of Freedom and Smoother Matrices
  • Automatic Selection of the Smoothing Parameters
    • Fixing the Degrees of Freedom
    • The Bias–Variance Tradeoff
  • Nonparametric Logistic Regression
  • Multidimensional Splines
  • Regularization and Reproducing Kernel Hilbert Spaces ☠
    • Spaces of Functions Generated by Kernels
    • Examples of RKHS
      • Penalized Polynomial Regression
      • Gaussian Radial Basis Functions
      • Support Vector Classifiers
  • Wavelet Smoothing ☠
    • Wavelet Smoothing and the Wavelet Transform
    • Adaptive Wavelet Filtering

Chapter 6: Kernel Smoothing Methods

  • One-Dimensional Kernel Smoothers
    • Local Linear Regression
    • Local Polynomial Regression
  • Selecting the Width of the Kernel
  • Local Regression in \({\mathbb R}^p\)
  • Structured Local Regression Models in \({\mathbb R}^p\)
    • Structured Kernels
    • Structured Regression Functions
  • Kernel Density Estimation and Classification
    • Kernel Density Estimation
    • Kernel Density Classification
    • The Naive Bayes Classifier
  • Radial Basis Functions and Kernels
  • Mixture Models for Density Estimation and Classification
  • Computational Considerations

