Treffer: Linear and Deep Models Basics with Pytorch, Numpy, and Scikit-Learn
Weitere Informationen
This book is an introduction to computational statistics for the generalized linear models (glm) and to machine learning with the python language. Extensions of the glm with nonlinearities come from hidden layer(s) within a neural network for linear and nonlinear regression or classification. This allows to present side by side classical statistics and current deep learning. The loglikelihoods and the corresponding loss functions are explained. The gradient and hessian matrix are discussed and implemented for these linear and nonlinear models. Several methods are implemented from scratch with numpy for prediction (linear, logistic, poisson regressions) and for reduction (principal component analysis, random projection). The gradient descent, newton-raphson, natural gradient and l-fbgs algorithms are implemented. The datasets in stake are with 10 to 10^7 rows, and are tabular such that images or texts are vectorized. The data are stored in a compressed format (memmap or hdf5) and loaded by chunks for several case studies with pytorch or scikit-learn. Pytorch is presented for training with minibatches via a generic implementation for studying with computer programs. Scikit-learn is presented for processing large datasets via the partial fit, after the small examples. Sixty exercises are proposed at the end of the chapters with selected solutions to go beyond the contents. Code available at https://github.com/rpriam/book1