First published 2017 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.
Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:
ISTE Ltd
27-37 St George’s Road
London SW19 4EU
UK
www.iste.co.uk
John Wiley & Sons, Inc.
111 River Street
Hoboken, NJ 07030
USA
www.wiley.com
© ISTE Ltd 2017
The rights of Maurice Charbit to be identified as the author of this work have been asserted by him in accordance with the Copyright, Designs and Patents Act 1988.
Library of Congress Control Number: 2016955620
British Library Cataloguing-in-Publication Data
A CIP record for this book is available from the British Library
ISBN 978-1-78630-126-0
This book addresses the fundamentals of statistical inference. We shall presume throughout that readers have a good working knowledge of the Python® language and of the basic elements of digital signal processing.
The most recent version is Python® 3.x, but many people still work with Python® 2.x. All code provided in this book works with both versions. The official home page of the Python® programming language is https://www.python.org/. Spyder® is a useful open-source integrated development environment (IDE) for programming in Python®. We suggest using the Anaconda Python distribution, which includes both Python® and Spyder® and is available at https://www.continuum.io/downloads/.
Most of the examples given in this book use three modules: NumPy, which provides powerful numerical array objects; SciPy, with high-level data processing routines such as optimization, regression and interpolation; and Matplotlib, for plotting curves, histograms, box-and-whisker plots, etc. A list of useful functions is given at the end of this front matter.
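As a minimal illustration of how these modules fit together, the sketch below builds an array with NumPy, interpolates it with SciPy and indicates where Matplotlib would plot the result (the specific function choices are standard library calls, not examples taken from the book):

```python
import numpy as np
from scipy import interpolate

# NumPy: a numerical array of sample points of sin(x)
x = np.linspace(0.0, 2.0 * np.pi, 9)
y = np.sin(x)

# SciPy: fit a cubic spline through the samples
spline = interpolate.interp1d(x, y, kind="cubic")
x_fine = np.linspace(0.0, 2.0 * np.pi, 100)
y_fine = spline(x_fine)

# Matplotlib would then plot the curve, e.g.:
# import matplotlib.pyplot as plt
# plt.plot(x, y, "o", x_fine, y_fine, "-"); plt.show()
print(float(spline(np.pi / 2)))  # close to sin(pi/2) = 1
```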
A brief outline of the contents of the book is given below.
In the first chapter, a short review of probability theory is presented, focusing on conditional probability, the projection theorem and random variable transformations. A number of statistical elements are also presented, including the law of large numbers and the central limit theorem.
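As a quick numerical illustration of the law of large numbers, a seeded NumPy simulation shows the empirical mean of uniform draws approaching the true expectation (a sketch for intuition, not code from the book):

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for reproducibility
n = 100_000
samples = rng.uniform(0.0, 1.0, size=n)

# Law of large numbers: the empirical mean approaches E[X] = 1/2
empirical_mean = samples.mean()
print(empirical_mean)  # close to 0.5
```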
The second chapter is devoted to statistical inference: deducing features of interest from a set of observations, at a given confidence level. This covers a variety of techniques. In this chapter, we mainly focus on hypothesis testing, regression analysis, parameter estimation and the determination of confidence intervals. Key notions include the Cramer–Rao bound, the Neyman–Pearson theorem, likelihood ratio tests, the least squares method for linear models, the method of moments and the maximum likelihood approach. The least squares method is a standard approach in regression analysis and is discussed in detail.
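For instance, the least squares estimate for a linear model can be computed with NumPy's `lstsq` (a minimal sketch on synthetic data, not the book's own example):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
x = np.linspace(0.0, 1.0, n)
y = 2.0 * x + 1.0 + 0.05 * rng.standard_normal(n)  # y = 2x + 1 + noise

# Design matrix [x, 1] for the linear model y = a*x + b
H = np.column_stack([x, np.ones(n)])
theta, *_ = np.linalg.lstsq(H, y, rcond=None)
a_hat, b_hat = theta
print(a_hat, b_hat)  # close to the true values 2 and 1
```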
In many problems, the variables of interest are only partially observed. Hidden Markov models (HMM) are well suited to this kind of problem. Their applications cover a wide range of fields, such as speech processing, handwriting recognition, DNA analysis, and monitoring and control. HMM inference raises several issues. The key algorithms are the well-known Kalman filter, the Baum–Welch algorithm and the Viterbi algorithm, to list only the most famous ones.
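To give the flavor of the Viterbi algorithm, here is a self-contained sketch on a toy two-state HMM; the states, probabilities and observations below are invented for illustration:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely hidden state sequence and its probability."""
    # V[t][s]: probability of the best path ending in state s at time t
    V = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            prob, prev = max(
                (V[t - 1][r] * trans_p[r][s] * emit_p[s][obs[t]], r)
                for r in states
            )
            V[t][s] = prob
            back[t][s] = prev
    # Backtrack from the best final state
    prob, last = max((V[-1][s], s) for s in states)
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.insert(0, back[t][path[0]])
    return path, prob

states = ("H", "C")  # two hidden states, e.g. "hot" and "cold"
start_p = {"H": 0.6, "C": 0.4}
trans_p = {"H": {"H": 0.7, "C": 0.3}, "C": {"H": 0.4, "C": 0.6}}
emit_p = {"H": {1: 0.1, 2: 0.4, 3: 0.5}, "C": {1: 0.5, 2: 0.4, 3: 0.1}}

path, prob = viterbi([3, 1, 3], states, start_p, trans_p, emit_p)
print(path, prob)
```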
Monte-Carlo methods refer to a broad class of algorithms that serve to compute quantities of interest, typically integrals, i.e. expectations of a given function. The key idea is to use random sequences instead of deterministic sequences to achieve this result. The main issues are, first, the choice of the most appropriate random mechanism and, second, how to generate samples from it. In Chapter 4, the acceptance–rejection method, the Metropolis–Hastings algorithm, the Gibbs sampler, the importance sampling method, etc., are presented.
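The basic idea can be sketched in a few lines: estimate the integral of x² over [0, 1] (whose exact value is 1/3) by averaging f(X) over uniform random draws. This is plain Monte-Carlo integration, a simpler relative of the algorithms listed above:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Draw X ~ U(0,1); then E[X^2] equals the integral of x^2 over [0,1] = 1/3
x = rng.uniform(0.0, 1.0, size=n)
estimate = (x ** 2).mean()
print(estimate)  # close to 1/3
```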
Maurice CHARBIT
October 2016
∅ | empty set |
𝟙_{A}(x) | indicator function: 1 if x ∈ A, 0 otherwise |
(a, b] | {x : a < x ≤ b} |
δ(t) | Dirac delta function |
Re(z) | real part of z |
Im(z) | imaginary part of z |
i or j | √−1 |
I_{N} | identity matrix of size N |
A^{∗} | complex conjugate of A |
A^{T} | transpose of A |
A^{H} | transpose-conjugate of A |
A^{−1} | inverse matrix of A |
A^{#} | pseudo-inverse matrix of A |
r.v./rv | random variable |
ℙ | probability measure |
ℙ_{θ} | probability measure indexed by θ |
𝔼{X} | expectation of X |
𝔼_{θ}{X} | expectation of X under ℙ_{θ} |
X_{c} = X − 𝔼{X} | zero-mean random variable |
var (X) = 𝔼{|X_{c}|²} | variance of X |
cov (X, Y) = 𝔼{X_{c} Y_{c}^{∗}} | covariance of (X, Y) |
cov (X) = cov (X, X) = var (X) | variance of X |
𝔼{X|Y} | conditional expectation of X given Y |
a →^{(d)} b | a converges in distribution to b |
a →^{(P)} b | a converges in probability to b |
a →^{(a.s.)} b | a converges almost surely to b |
d.o.f. | degrees of freedom | |
ARMA | AutoRegressive Moving Average | |
AUC | Area Under the ROC curve | |
c.d.f. | Cumulative Distribution Function | |
CRB | Cramer–Rao Bound | |
EM | Expectation Maximization | |
GLRT | Generalized Likelihood Ratio Test | |
GEM | Generalized Expectation Maximization | |
GMM | Gaussian Mixture Model | |
HMM | Hidden Markov Model | |
i.i.d./iid | independent and identically distributed | |
LDA | Linear Discriminant Analysis | |
MC | Monte-Carlo | |
MLE | Maximum Likelihood Estimator | |
MME | Moment Method Estimator | |
MSE | Mean Square Error | |
OLS | Ordinary Least Squares | |
PCA | Principal Component Analysis | |
p.d.f. | Probability Density Function | |
ROC | Receiver Operating Characteristic | |
SNR | Signal to Noise Ratio | |
WLS | Weighted Least Squares |
To get function documentation, use .__doc__, e.g. print(range.__doc__); or help, e.g. help(zeros) or help('def'); or ?, e.g. range.count?.
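For example (standard Python introspection; the `?` form works only in IPython/Spyder consoles):

```python
# Built-in documentation strings
print(range.__doc__)   # docstring of range
help(len)              # formatted help for len

# In an IPython or Spyder console one can also type:
# range.count?
```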
def: introduces a function definition
if, else, elif: an if statement consists of a Boolean expression followed by one or more statements
for: executes a sequence of statements multiple times
while: repeats a statement or group of statements while a given condition is true
1j or complex: returns a complex value, e.g. a=1.3+1j*0.2 or a=complex(1.3,0.2)
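For instance, the two ways of building a complex value agree (a quick check, not from the book):

```python
# 1j denotes the imaginary unit; complex(re, im) builds the same value
a = 1.3 + 1j * 0.2
b = complex(1.3, 0.2)
print(a == b, a.real, a.imag)
```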
Methods: given A=array([0,4,12,3]), type A. followed by the Tab key to list the available methods, e.g. the argument of the maximum with A.argmax(). For help, type e.g. A.dot?.
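For example (assuming NumPy's array, as in the book's examples):

```python
from numpy import array

A = array([0, 4, 12, 3])
print(A.argmax())   # index of the maximum value: 2
print(A.max())      # maximum value: 12
print(A.cumsum())   # cumulative sums
```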
Functions:
int: converts a number or string to an integer
len: returns the number of items in a container
range: returns an object that produces a sequence of integers
type: returns the object type
From numpy:
abs: returns the absolute value of the argument
arange: returns evenly spaced values within a given interval
argwhere: finds the indices of array elements that are non-zero, grouped by element
array: creates an array
cos, sin, tan: respectively calculate the cosine, the sine and the tangent
cosh: calculates the hyperbolic cosine
cumsum: calculates the cumulative sum of array elements
diff: calculates the n-th discrete difference along a given axis
dot: product of two arrays
exp, log: respectively calculate the exponential, the logarithm
fft: calculates the FFT (in the numpy.fft submodule)
isinf: tests element-wise for positive or negative infinity
isnan: tests element-wise for NaN
linspace: returns evenly spaced numbers over a specified interval
loadtxt: loads data from a text file
matrix: returns a matrix from an array-like object, or from a string of data
max: returns the maximum of an array or maximum along an axis
mean, std: respectively return the arithmetic mean and the standard deviation
min: returns the minimum of an array or minimum along an axis
nanmean, nanstd: respectively return the arithmetic mean and the standard deviation along a given axis, ignoring NaNs
nansum: sum of array elements over a given axis, ignoring NaNs
ones: returns a new array of given shape and type, filled with ones
pi: 3.141592653589793
setdiff1d: returns the sorted, unique values of one array that are not in the other
size: returns the number of elements along a given axis
sort: returns a sorted copy of an array
sqrt: computes the positive square root of an array
sum: sum of array elements over a given axis
zeros: returns a new array of given shape and type, filled with zeros
From numpy.linalg:
eig: computes the eigenvalues and right eigenvectors of a square array
pinv: computes the (Moore–Penrose) pseudo-inverse of a matrix
inv: computes the (multiplicative) inverse of a matrix
svd: computes the singular value decomposition
From numpy.random:
rand: draws random samples from a uniform distribution over (0, 1)
randn: draws random samples from the "standard normal" distribution
randint: draws random integers from 'low' (inclusive) to 'high' (exclusive)
From scipy.stats:
(for the random distributions, use the methods .pdf, .cdf, .isf, .ppf, etc.)
norm: Gaussian random distribution
gamma: gamma random distribution
f: Fisher random distribution
t: Student's random distribution
chi2: chi-squared random distribution
From scipy.linalg:
sqrtm: computes the matrix square root
From matplotlib.pyplot:
box, boxplot, clf, figure, hist, legend, plot, show, subplot,
title, text, xlabel, xlim, xticks, ylabel, ylim, yticks
Datasets:
statsmodels.api.datasets.co2, statsmodels.api.datasets.nile, statsmodels.api.datasets.star98, statsmodels.api.datasets.heart
sklearn.datasets.load_boston, sklearn.datasets.load_diabetes
scipy.misc.ascent
From sympy:
Symbol, Matrix, diff, Inverse, trace, simplify
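A short example with these SymPy objects (illustrative only; the expressions are invented):

```python
from sympy import Symbol, Matrix, diff, simplify

x = Symbol("x")
# Symbolic differentiation: d/dx x^3 = 3*x^2
expr = diff(x ** 3, x)
print(simplify(expr))

# Symbolic matrix inverse and trace
M = Matrix([[1, 2], [3, 4]])
print(M.inv())
print(M.trace())  # 1 + 4 = 5
```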