%
%                           S T A T L I B
%
%                           by John Walker
%                      http://www.fourmilab.ch/
%
%   What's all this, you ask?  Well, this is a "literate program",
%   written in the CWEB language created by Donald E. Knuth and
%   Silvio Levy.  This file includes both the C source code for
%   the program and internal documentation in TeX.  Processing
%   this file with the CTANGLE utility produces the C source file,
%   while the CWEAVE program emits documentation in TeX.  The
%   current version of these programs may be downloaded from:
%
%       http://www-cs-faculty.stanford.edu/~knuth/cweb.html
%
%   where you will find additional information on literate
%   programming and examples of other programs written in this
%   manner.
%
%   If you don't want to wade through all these details, don't
%   worry; this distribution includes a .cc file already
%   extracted and ready to compile.  If "make" complains that it
%   can't find "ctangle" or "cweave", just "touch *.cc"
%   and re-make--apparently the process of extracting the files
%   from the archive messed up the date and time, misleading
%   make into believing it needed to rebuild those files.

@** Introduction.

\vskip 15pt
\centerline{\ttitlefont STATLIB}
\vskip 10pt
\centerline{\titlefont Probability and Statistics Library}
\vskip 15pt
%\centerline{by John Walker}
%\centerline{\.{http://www.fourmilab.ch/}}
\centerline{\pdfURL{John Walker}{http://www.fourmilab.ch/}}

\vskip 15pt
\centerline{This program is in the public domain.}

\vskip 15pt
This program provides facilities for computations involving
probability and statistics.  A variety of probability distributions,
both discrete and continuous, are provided along with a template
class |dataTable| which implements a variety of descriptive
statistical measures on an arbitrary numeric data type.

This program requires none of the other components of the analysis
software suite and may be extracted as a general statistics library.
It uses the \.{DCDFLIB} package for its low-level computations.

\vskip 2ex
\noindent
$\underline{\hbox{\bf References}}$

\vskip 1ex
\noindent
Abramowitz, Milton and Irene A. Stegun.
{\it Handbook of Mathematical Functions}
(Chapter 26).  New York: Dover, 1974.
ISBN 0-486-61272-4.

\vskip 1ex
\noindent
Montgomery, Douglas C. and George C. Runger.
{\it Applied Statistics and Probability for Engineers.}
New York: John Wiley \& Sons, 1994.  ISBN 0-471-54041-2.

\vskip 1ex
\noindent
Wolfram Research.
{\it Mathematica 4.0 Standard Add-On Packages.}
Champaign, IL: Wolfram Media, 1999.  ISBN 1-57955-007-X.

\vskip 30pt

@(statlib_test.cc@>=
#define REVDATE "15th February 2003"

@** Program global context.

@q Define standard C++ namespaces and classes @>
@i "cweb/c++lib.w"

@c
#include "config.h"                   /* System-dependent configuration */

@h

@<Application include files@>@/
@<Class implementations@>@/

@
We export the class definitions for this package in the
external file \.{statlib.h} that programs which use this
library may include.

@(statlib.h@>=

#ifndef STATLIB_HEADER_DEFINES
#define STATLIB_HEADER_DEFINES
#include <math.h>       // Make sure \.{math.h} is available
#include <assert.h>

#include <iostream>
#include <exception>
#include <stdexcept>
#include <string>
#include <vector>
#include <algorithm>
using namespace std;@#
@<Class definitions@>
#endif

@
The following include files provide access to external components
of the program not defined herein.

@<Application include files@>=
#include "dcdflib.h"        // DCDFlib Cumulative Distribution Function library
#include "statlib.h"        // Class definitions for this package

@
The following classes are defined and their
implementations provided.

@<Class implementations@>=
@<Probability distributions@>@/

@** Probability distributions.

We provide the class definitions
for computations involving random variables with the
following distributions:

\vskip 1ex
\hskip 8em{\vbox{
\settabs 3 \columns
\+{\bf normalDistribution}&Normal (Gaussian) distribution\cr
\+{\bf poissonDistribution}&Poisson distribution\cr
\+{\bf chiSquareDistribution}&Chi-Square ($\chi^2$) distribution\cr
\+{\bf gammaDistribution}&Gamma distribution\cr
\+{\bf betaDistribution}&Beta distribution\cr
\+{\bf tDistribution}&Student's $t$ distribution\cr
\+{\bf FDistribution}&$F$ distribution\cr
\+{\bf binomialDistribution}&Binomial distribution\cr
\+{\bf negativeBinomialDistribution}&Negative binomial distribution\cr
}
}
\vskip 1ex

Computations are performed using the \.{DCDFLIB}
Double Precision Cumulative Distribution Function Library
developed by the
Section of Computer Science, Department of Biomathematics, University of Texas M. D.
Anderson Hospital.  Source code for \.{DCDFLIB} may be
downloaded via:\hfill\break
\centerline{\pdfURL{anonymous FTP}{ftp://odin.mda.uth.tmc.edu/pub/unix/dcdflib.c.tar.Z}}

The abstract superclass |probabilityDistribution| is the parent
of all of the specific distributions, discrete and continuous.
It implements properties common to all distributions and
defines pure virtual functions which all specific distributions
must implement.

Quantities named in the the following methods are as
follows:

\vskip 1ex
\settabs 6 \columns
\+&\.{P}&$P$&Cumulative probability distribution function\cr
\+&\.{Q}&$Q$&``Upper tail'' probability distribution ($Q=1-P$)\cr
\+&\.{x}&$x$&Value of random variable\cr
\+&\.{mean}&$\mu$&Mean value of distribution\cr
\+&\.{stdev}&$\sigma$&Standard deviation\cr
\+&\.{variance}&$\sigma^2$&Variance\cr
\+&\.{skewness}&$\gamma_1$&Coefficient of skewness\cr
\+&\.{kurtosisExcess}&$\gamma_2$&Kurtosis excess\cr
\+&\.{kurtosis}&$\beta_2$&Kurtosis\cr
\vskip 1ex

@<Class definitions@>=

class probabilityDistribution {
private:@/
    virtual string distributionName(void) = 0;  // Return distribution name

public:@/
    static double Q_from_P(double x) { // Upper tail probability: $Q=1-P$
        return 1 - x;
    }

    static double P_from_Q(double x) { // CDF from upper tail probability: $P=1-Q$
        return 1 - x;
    }

    virtual double mean(void) = 0;    // Return mean of distribution
    virtual double stdev(void) = 0;   // Return standard deviation $\sigma$ of distribution
    double variance(void) {           // Return variance $\sigma^2$ of deviation
        return stdev() * stdev();
    }
    virtual double skewness(void) = 0; // Skewness $\gamma_1$ of distribution
    virtual double kurtosisExcess(void) = 0; // Kurtosis excess $\gamma_2$ of distribution
    double kurtosis(void) {            // Kurtosis $(\beta_2=\gamma_2+3$) of distribution
        return kurtosisExcess() + 3;
    }

    virtual double CDF_P(double x) = 0; // Compute CDF $P=\int_{-\infty}^{|x|}f(x)\,dx$

    double CDF_Q(double x) {          // $Q=1-P$
        return Q_from_P(CDF_P(x));
    }

    //  Write description of distribution to output stream
    virtual void writeParameters(ostream &of) {
        of << "Distribution: " << distributionName() << "\n";
        of << "    Mean = " << mean() << "  Stdev = " << stdev() << 
              "  Variance = " << variance() << "  Skewness = " <<
              skewness() << "\n";
        of << "    Kurtosis = " << kurtosis() <<
              "  KurtosisExcess = " << kurtosisExcess() << "\n";
    }
};

@*1 Normal (Gaussian) distribution.

A {\it normal distribution} describes a random variable $x$
with probability distribution function
$$f_x(x;\mu,\sigma)={1\over\sqrt{2\pi\sigma}}e^{-\Bigl({{(x-\mu)^2}\over{2\sigma^2}}\Bigr)},
\qquad -\infty<x<\infty$$
where $\mu$ is the {\it mean value} of the random variable and
$\sigma$ is the {\it standard deviation}.  A {\it standard normal
distribution} is one with $\mu=0$ and $\sigma=1$.

A binomial distribution approaches the normal distribution as a
limit as the number of trials becomes large.  For $n$ trials with
probability $0.5$, the binomial distribution is approximated
by a normal distribution with $\mu=n/2$ and $\sigma=\sqrt{n/4}$.

@<Class definitions@>=

class normalDistribution : public probabilityDistribution {

private:@/
    virtual string distributionName(void) {
        return "normal";
    }

    double mu;                        // Mean value
    double sigma;                     // Standard deviation

public:@/
    /*  The default constructor creates a standard
        normal distribution: $\mu=0$, $\sigma=1$.  */
    normalDistribution() {
        set_mu_sigma();
    }

    /*  The following constructor creates a normal distribution
        with a given mean and standard deviation.  */

    normalDistribution(double c_mu, double c_sigma) {
        set_mu_sigma(c_mu, c_sigma);
    }

    //  Accessors

    void set_mu(double c_mu = 0) {
        mu = c_mu;
    }

    void set_sigma(double c_sigma = 1) {
        sigma = c_sigma;
    }

    void set_mu_sigma(double c_mu = 0, double c_sigma = 1) {
        set_mu(c_mu);
        set_sigma(c_sigma);
    }

    double get_mu(void) {
        return mu;
    }

    double get_sigma(void) {
        return sigma;
    }

    //  Implementations of abstract virtual functions

    double mean(void) {
        return get_mu();
    }

    double stdev(void) {
        return get_sigma();
    }

    double skewness(void) {
        return 0;                     // $\gamma_1=0$ for normal distribution
    }

    double kurtosisExcess(void) {
        return 0;                     // $\gamma_2=0$ for normal distribution
    }

    //  Auxiliary accessors

    @<Set standard deviation from variance@>;
    @<Approximate binomial distribution@>;

    //  Static transformation functions

    static double p_from_mu_sigma_x(double mu, double sigma, double x);
    static double x_from_p_mu_sigma(double p, double mu, double sigma);
    static double mu_from_p_x_sigma(double p, double x, double sigma);
    static double sigma_from_p_x_mu(double p, double x, double mu);

    //  Analysis methods

    @<Compute $z$ score@>;

    double CDF_P(double x);           // $P=\int_{-\infty}^{|x|}f(x)\,dx$
};

@
The standard deviation, $\sigma$, is the square root
of the variance.  You can set the standard deviation
of the distribution from the variance with the
|set_variance| method.

@<Set standard deviation from variance@>=
    void set_variance(double var) {
        set_sigma(sqrt(var));
    }

@
A binomial distribution for a large number of trials $n$ each with
probability $0.5$ is approximated by a normal distribution
with $\mu=n/2$ and $\sigma=\sqrt{n/4}$.  The
|approximateBinomial| method allows you to create such a
normal distribution by specifying the number of trials $n$.

@<Approximate binomial distribution@>=
void approximateBinomial(unsigned int n)
{
    set_mu_sigma(n / 2, sqrt(n / 4.0));
}

@
The $z$ score for a given value in a normal distribution is
given by $$z={{m-\mu}\over{\sigma}}.$$
These $z$ values obey the $\chi^2$ distribution as does the summation
of a number of independent $z$ scores.

@<Compute $z$ score@>=
    double z_score(double m) {
        return (m - mean()) / stdev();
    }

@
The following transformation functions are |static| methods of
the |normalDistribution| class.  They permit calculation of
properties of the normal distribution without reference to a
particular |normalDistribution| object.

The functions operate on the following quantities:

\vskip 1ex
\settabs 8 \columns
\+&\.{p}&$P$&Cumulative probability distribution\cr
\+&\.{x}&$x$&Value of random variable\cr
\+&\.{mu}&$\mu$&Mean value\cr
\+&\.{sigma}&$\sigma$&Standard deviation\cr
\vskip 1ex

@<Probability distributions@>=
double normalDistribution::p_from_mu_sigma_x(double mu, double sigma, double x) {
    double p, q, bound;
    int which = 1, status;

    cdfnor(&which, &p, &q, &x, &mu, &sigma, &status, &bound);
    if (status != 0) {
        throw(out_of_range("normalDistribution::p_from_mu_sigma_x: Result out of bounds"));
    }
    return p;
}

double normalDistribution::x_from_p_mu_sigma(double p, double mu, double sigma) {
    double q, x, bound;
    int which = 2, status;

    q = 1 - p;
    cdfnor(&which, &p, &q, &x, &mu, &sigma, &status, &bound);
    if (status != 0) {
        throw(out_of_range("normalDistribution::x_from_p_mu_sigma: Result out of bounds"));
    }
    return x;
}

double normalDistribution::mu_from_p_x_sigma(double p, double x, double sigma) {
    double q, mu, bound;
    int which = 3, status;

    q = 1 - p;
    cdfnor(&which, &p, &q, &x, &mu, &sigma, &status, &bound);
    if (status != 0) {
        throw(out_of_range("normalDistribution::mu_from_p_x_sigma: Result out of bounds"));
    }
    return mu;
}

double normalDistribution::sigma_from_p_x_mu(double p, double x, double mu) {
    double q, sigma, bound;
    int which = 4, status;

    q = 1 - p;
    cdfnor(&which, &p, &q, &x, &mu, &sigma, &status, &bound);
    if (status != 0) {
        throw(out_of_range("normalDistribution::sigma_from_p_x_mu: Result out of bounds"));
    }
    return sigma;
}

@
The {\it cumulative distribution function} (CDF) $P$ value
is the probability a given value $x$ of the random
variable will occur.  It is computed by integrating the
probability density function $f(x)$ from
$-\infty$ to $x$:
$$P=\int_{-\infty}^{|x|}f(x)\,dx.$$

@<Probability distributions@>=
double normalDistribution::CDF_P(double x) {
    return p_from_mu_sigma_x(mu, sigma, x);
}

@*1 Chi-Square ($\chi^2$) distribution.

A random variable has a {\it chi-square} ($\chi^2$) distribution
if it follows the probability distribution function
$$f(x)={1\over{2^{k/2}\Gamma{\Bigl({k\over2}\Bigr)}}}x^{(k/2)-1}e^{-x/2}, \qquad x>0$$
where $k$ is the {\it degrees of freedom} of the distribution and
the $\Gamma$ is the gamma function---the generalisation of
factorial to real arguments
$$\Gamma(n)=\int_0^\infty x^{n-1}e^{-x}\,dx, \qquad n>0.$$

The $\chi^2$ distribution is a special case of the
{\it gamma distribution}
$$f_x(x; \lambda, r)={{\lambda^r x^{r-1} e^{-\lambda x}}\over{\Gamma(r)}},
 \qquad x>0, \lambda>0, r>0$$
when $\lambda=1/2$ and $r$ is the degrees of freedom $k$
divided by 2.

The $\chi^2$ distribution is applicable when combining the results of
a series of independent experiments whose outcomes are normally
distributed.  If $Z_1, Z_2,\ldots Z_k$ are the results
of $k$ independent experiments normalised so that their mean $\mu=0$ and
standard deviation $\sigma=1$, then the sum $X$ of the squares of
these outcomes:
$$X=\sum_{i=1}^k {Z_i}^2$$
will be $\chi^2$ distributed with $k$ degrees
of freedom, $\chi_k^2$.

@<Class definitions@>=

class chiSquareDistribution : public probabilityDistribution {

private:@/
    virtual string distributionName(void) {
        return "chi-square";
    }

    double k;                         // Degrees of freedom

public:@/
    /*  The default constructor creates a $\chi^2$
        distribution with $k=1$.  */
    chiSquareDistribution() {
        set_k();
    }

    chiSquareDistribution(double c_k = 1) {
        set_k(c_k);
    }

    //  Accessors

    void set_k(double c_k = 1) {
        k = c_k;
    }

    double get_k(void) {
        return k;
    }

    //  Implementations of abstract virtual functions

    double mean(void) {
        return get_k();
    }

    double stdev(void) {
        return sqrt(2 * k);           // $\sigma=\sqrt{2 k}$
    }

    double skewness(void) {
        return sqrt(8 / k);           // $\gamma_1=\sqrt{8/k}$
    }

    double kurtosisExcess(void) {
        return 12 / k;                // $\gamma_2=12/k$
    }

    //  Static transformation functions

    static double p_from_k_x(double k, double x);
    static double x_from_p_k(double p, double k);
    static double k_from_p_x(double p, double x);

    //  Analysis methods

    double CDF_P(double x);           // $P=\int_{-\infty}^{|x|}f(x)\,dx$

    //  Write description of distribution to output stream
    void writeParameters(ostream &of) {
        probabilityDistribution::writeParameters(of);
        of << "    Degrees of freedom = " << k << "\n";
    }
};

@
The following transformation functions are |static| methods of
the |chiSquareDistribution| class.  They permit calculation of
properties of the $\chi^2$ distribution without reference to a
particular |chiSquareDistribution| object.

The functions operate on the following quantities:

\vskip 1ex
\settabs 8 \columns
\+&\.{p}&$P$&Cumulative probability distribution\cr
\+&\.{x}&$x$&Value of random variable\cr
\+&\.{k}&$k$&Degrees of freedom\cr
\vskip 1ex

@<Probability distributions@>=
double chiSquareDistribution::p_from_k_x(double k, double x) {
    double p, q, bound;
    int which = 1, status;

    cdfchi(&which, &p, &q, &x, &k, &status, &bound);
    if (status != 0) {
        throw(out_of_range("chiSquareDistribution::p_from_k_x: Result out of bounds"));
    }
    return p;
}

double chiSquareDistribution::x_from_p_k(double p, double k) {
    double q, x, bound;
    int which = 2, status;

    q = 1 - p;
    cdfchi(&which, &p, &q, &x, &k, &status, &bound);
    if (status != 0) {
        throw(out_of_range("chiSquareDistribution::x_from_p_k: Result out of bounds"));
    }
    return x;
}

double chiSquareDistribution::k_from_p_x(double p, double x) {
    double q, k, bound;
    int which = 3, status;

    q = 1 - p;
    cdfchi(&which, &p, &q, &x, &k, &status, &bound);
    if (status != 0) {
        throw(out_of_range("chiSquareDistribution::k_from_p_x: Result out of bounds"));
    }
    return k;
}

@
The {\it cumulative distribution function} (CDF) $P$ value
is the probability a given value $x$ of the random
variable will occur.  It is computed by integrating the
probability density function $f(x)$ from
$-\infty$ to $x$:
$$P=\int_{-\infty}^{|x|}f(x)\,dx.$$

@<Probability distributions@>=
double chiSquareDistribution::CDF_P(double x) {
    return p_from_k_x(k, x);
}

@*1 Gamma distribution.

The
probability density function of the Gamma distribution is
$$f_x(x;\alpha,\lambda)={{\lambda^{\alpha}x^{\alpha-1}e^{-\lambda x}}\over{\Gamma(\alpha)}}
                                                     , \qquad x > 0 .$$
$\alpha$ is referred to as the {\it shape parameter} and
$\lambda$ the {\it scale parameter}.  Some references
use $r$ instead of $\alpha$ for the shape parameter, and
others specify the scale parameter as the reciprocal
of the value used here.
A chi-square distribution with $k$ degrees of freedom
is a special case of a gamma distribution
with $\lambda=1/2$ and $r=k/2$.

@<Class definitions@>=

class gammaDistribution : public probabilityDistribution {

private:@/
    virtual string distributionName(void) {
        return "gamma";
    }

    double alpha;                     // Shape parameter $\alpha$
    double lambda;                    // Scale parameter $\lambda$

public:@/
    /*  The default constructor creates a gamma
        distribution with $\alpha=10$, $\lambda=10$.  */
    gammaDistribution() {
        set_alpha_lambda();
    }

    gammaDistribution(double c_alpha = 10, double c_lambda = 10) {
        set_alpha_lambda(c_alpha, c_lambda);
    }

    //  Accessors

    void set_alpha(double c_alpha = 10) {
        alpha = c_alpha;
    }

    double get_alpha(void) {
        return alpha;
    }

    void set_lambda(double c_lambda = 10) {
        lambda = c_lambda;
    }

    double get_lambda(void) {
        return lambda;
    }

    void set_alpha_lambda(double c_alpha = 10, double c_lambda = 10) {
        set_alpha(c_alpha);
        set_lambda(c_lambda);
    }

    //  Implementations of abstract virtual functions

    double mean(void) {
        return alpha / lambda;        // $\mu=\alpha/\lambda$
    }

    double stdev(void) {
        return sqrt(alpha) / lambda;  // $\sigma=\sqrt{\alpha}/\lambda$
    }

    double skewness(void) {
        return 2 / sqrt(alpha);       // $\gamma_1=2/sqrt{\alpha}$
    }

    double kurtosisExcess(void) {
        return 6 / alpha;             // $\gamma_2=6/\alpha$
    }

    //  Static transformation functions

    static double p_from_alpha_lambda_x(double alpha, double lambda, double x);
    static double x_from_p_alpha_lambda(double p, double alpha, double lambda);
    static double alpha_from_p_lambda_x(double p, double lambda, double x);
    static double lambda_from_p_alpha_x(double p, double alpha, double x);

    double CDF_P(double x);           // $P=\int_{-\infty}^{|x|}f(x)\,dx$

    //  Write description of distribution to output stream
    void writeParameters(ostream &of) {
        probabilityDistribution::writeParameters(of);
        of << "    Shape (alpha) = " << alpha << "  Scale (lambda)  = " << lambda << "\n";
    }
};

@
The following transformation functions are |static| methods of
the |gammaDistribution| class.  They permit calculation of
properties of the Gamma distribution without reference to a
particular |gammaDistribution| object.

The functions operate on the following quantities:

\vskip 1ex
\settabs 8 \columns
\+&\.{p}&$P$&Cumulative probability distribution\cr
\+&\.{x}&$x$&Value of random variable\cr
\+&\.{alpha}&$\alpha$&Shape parameter\cr
\+&\.{lambda}&$\lambda$&Scale parameter\cr
\vskip 1ex

@<Probability distributions@>=
double gammaDistribution::p_from_alpha_lambda_x(double alpha, double lambda, double x) {
    double p, q, bound;
    int which = 1, status;

    cdfgam(&which, &p, &q, &x, &alpha, &lambda, &status, &bound);
    if (status != 0) {
        throw(out_of_range("gammaDistribution::p_from_alpha_lambda_x: Result out of bounds"));
    }
    return p;
}

double gammaDistribution::x_from_p_alpha_lambda(double p, double alpha, double lambda) {
    double q, x, bound;
    int which = 2, status;

    q = 1 - p;
    cdfgam(&which, &p, &q, &x, &alpha, &lambda, &status, &bound);
    if (status != 0) {
        throw(out_of_range("gammaDistribution::x_from_p_alpha_lambda: Result out of bounds"));
    }
    return x;
}

double gammaDistribution::alpha_from_p_lambda_x(double p, double lambda, double x) {
    double q, alpha, bound;
    int which = 3, status;

    q = 1 - p;
    cdfgam(&which, &p, &q, &x, &alpha, &lambda, &status, &bound);
    if (status != 0) {
        throw(out_of_range("gammaDistribution::alpha_from_p_lambda_x: Result out of bounds"));
    }
    return alpha;
}

double gammaDistribution::lambda_from_p_alpha_x(double p, double alpha, double x) {
    double q, lambda, bound;
    int which = 4, status;

    q = 1 - p;
    cdfgam(&which, &p, &q, &x, &alpha, &lambda, &status, &bound);
    if (status != 0) {
        throw(out_of_range("gammaDistribution::lambda_from_p_alpha_x: Result out of bounds"));
    }
    return lambda;
}

@
The {\it cumulative distribution function} (CDF) $P$ value
is the probability of $n$ successes in $x$ trials.

@<Probability distributions@>=
double gammaDistribution::CDF_P(double x) {
    return p_from_alpha_lambda_x(alpha, lambda, x);
}

@*1 Beta distribution.

If $X_a$ and $X_b$ are independent gamma distributions with
shape ($\alpha$) parameters $a$ and $b$ respectively and
identical scale ($\lambda$) parameters, then the random variable
$${X_a}\over{X_a+X_b}$$
will obey the {\it beta distribution} whose probability
density function is
$$f(x;a,b)={{(1-x)^{b-1}x^{a-1}}\over{B(a,b)}}$$
where $B(a,b)$ is the {\it Euler beta function}
$$B(a,b)={{\Gamma(a)\Gamma(b)}\over{\Gamma(a+b)}}.$$

@<Class definitions@>=

class betaDistribution : public probabilityDistribution {

private:@/
    virtual string distributionName(void) {
        return "beta";
    }

    double a;                         // Shape parameter of gamma variable $X_1$
    double b;                         // Scale parameter of gamma variable $X_2$

public:@/
    /*  The default constructor creates a beta
        distribution with $a=5$, $b=10$.  */
    betaDistribution() {
        set_a_b();
    }

    betaDistribution(double c_a = 5, double c_b = 10) {
        set_a_b(c_a, c_b);
    }

    //  Accessors

    void set_a(double c_a = 5) {
        a = c_a;
    }

    double get_a(void) {
        return a;
    }

    void set_b(double c_b = 10) {
        b = c_b;
    }

    double get_b(void) {
        return b;
    }

    void set_a_b(double c_a = 5, double c_b = 10) {
        set_a(c_a);
        set_b(c_b);
    }

    //  Implementations of abstract virtual functions

    double mean(void) {
        return a / (a + b);           // $\mu=a/(a+b)$
    }

    double stdev(void) {
        return sqrt(a * b) / ((a + b) * sqrt(a + b + 1));
               // $\sigma={{\sqrt{ab}}\over{(a+b)\sqrt{a+b+1}}}$
    }

    double skewness(void) {
        return (2 * (b - a) * sqrt(a + b + 1)) / (sqrt(a * b) * (a + b + 2));
              // $\gamma_1={{2(b-a)\sqrt{a+b+1}}\over{\sqrt{ab}(a+b+2)}}$
    }

    double kurtosisExcess(void) {
        return ((3 * (a + b + 1) * ((a * b * (a + b - 6)) +
                     (2 * (a + b) * (a + b))) /
                (a * b * (a + b + 2) * (a + b + 3)))) - 3;
               /* $\gamma_2={{3(a+b+1)((ab(a+b-6))+(2(a+b)^2))}\over
                            {ab(a+b+2)(a+b+3)}}-3$ */
    }

    //  Static transformation functions

    static double p_from_a_b_x(double a, double b, double x);
    static double x_from_p_a_b(double p, double a, double b);
    static double a_from_p_b_x(double p, double b, double x);
    static double b_from_p_a_x(double p, double a, double x);

    double CDF_P(double x);           // $P=\int_{-\infty}^{|x|}f(x)\,dx$

    //  Write description of distribution to output stream
    void writeParameters(ostream &of) {
        probabilityDistribution::writeParameters(of);
        of << "    a = " << a << "  b = " << b << "\n";
    }
};

@
The following transformation functions are |static| methods of
the |betaDistribution| class.  They permit calculation of
properties of the Beta distribution without reference to a
particular |betaDistribution| object.

The functions operate on the following quantities:

\vskip 1ex
\settabs 8 \columns
\+&\.{p}&$P$&Cumulative probability distribution\cr
\+&\.{x}&$x$&Value of random variable\cr
\+&\.{a}&$a$&Shape parameter of first Gamma distribution\cr
\+&\.{b}&$b$&Shape parameter second Gamma distribution\cr
\vskip 1ex

@<Probability distributions@>=
double betaDistribution::p_from_a_b_x(double a, double b, double x) {
    double p, q, y, bound;
    int which = 1, status;

    y = 1 - x;
    cdfbet(&which, &p, &q, &x, &y, &a, &b, &status, &bound);
    if (status != 0) {
        throw(out_of_range("betaDistribution::p_from_a_b_x: Result out of bounds"));
    }
    return p;
}

double betaDistribution::x_from_p_a_b(double p, double a, double b) {
    double q, x, y, bound;
    int which = 2, status;

    q = 1 - p;
    cdfbet(&which, &p, &q, &x, &y, &a, &b, &status, &bound);
    if (status != 0) {
        throw(out_of_range("betaDistribution::x_from_p_a_b: Result out of bounds"));
    }
    return x;
}

double betaDistribution::a_from_p_b_x(double p, double b, double x) {
    double a, q, y, bound;
    int which = 3, status;

    q = 1 - p;
    y = 1 - x;
    cdfbet(&which, &p, &q, &x, &y, &a, &b, &status, &bound);
    if (status != 0) {
        throw(out_of_range("betaDistribution::a_from_p_b_x: Result out of bounds"));
    }
    return a;
}

double betaDistribution::b_from_p_a_x(double p, double a, double x) {
    double b, q, y, bound;
    int which = 4, status;

    q = 1 - p;
    y = 1 - x;
    cdfbet(&which, &p, &q, &x, &y, &a, &b, &status, &bound);
    if (status != 0) {
        throw(out_of_range("betaDistribution::b_from_p_a_x: Result out of bounds"));
    }
    return b;
}

@
The {\it cumulative distribution function} (CDF) $P$ value
is the probability of $n$ successes in $x$ trials.

@<Probability distributions@>=
double betaDistribution::CDF_P(double x) {
    return p_from_a_b_x(a, b, x);
}

@*1 Student's $t$ distribution.

If $Z$ is a random variable with a standard normal
distribution ($\mu=0, \sigma=1$) and $V$ is a
chi-square random variable, the random variable
$$T={Z\over\sqrt{V/k}}$$
will have the probability density function
$$f(x)={{\Gamma((k+1)/2)}\over{\sqrt{\pi k}\Gamma(k/2)}}\cdot
       {1\over{((x^2/k)+1)^{(k+1)/2}}}, \qquad -\infty<x<\infty, k>2$$
where $k$ is the {\it degrees of freedom} of the distribution $V$ and
$\Gamma$ is the gamma function.  This is referred to as a
$t$ distribution with $k$ degrees of freedom.

The $t$ distribution is useful in determining confidence
intervals and testing hypotheses about the mean of samples
taken from a normal distribution.

@<Class definitions@>=

class tDistribution : public probabilityDistribution {

private:@/
    virtual string distributionName(void) {
        return "t";
    }

    double k;                         // Degrees of freedom

public:@/
    /*  The default constructor creates a $t$
        distribution with $k=3$.  */
    tDistribution() {
        set_k();
    }

    tDistribution(double c_k = 3) {
        set_k(c_k);
    }

    //  Accessors

    void set_k(double c_k = 3) {
        k = c_k;
    }

    double get_k(void) {
        return k;
    }

    //  Implementations of abstract virtual functions

    double mean(void) {
        return 0;
    }

    double stdev(void) {
        return sqrt(k /(k - 2));      // $\sigma=\sqrt{k/(k-2)}$
    }

    double skewness(void) {
        return 0;                     // $\gamma_1=0$
    }

    double kurtosisExcess(void) {
        return 6.0 / (k - 4);         // $\gamma_2=6/(k - 4)$
    }

    //  Static transformation functions

    static double p_from_k_x(double k, double x);
    static double x_from_p_k(double p, double k);
    static double k_from_p_x(double p, double x);

    //  Analysis methods

    double CDF_P(double x);           // $P=\int_{-\infty}^{|x|}f(x)\,dx$

    //  Write description of distribution to output stream
    void writeParameters(ostream &of) {
        probabilityDistribution::writeParameters(of);
        of << "    Degrees of freedom = " << k << "\n";
    }
};

@
The following transformation functions are |static| methods of
the |tDistribution| class.  They permit calculation of
properties of the $t$ distribution without reference to a
particular |tDistribution| object.

The functions operate on the following quantities:

\vskip 1ex
\settabs 8 \columns
\+&\.{p}&$P$&Cumulative probability distribution\cr
\+&\.{x}&$x$&Value of random variable\cr
\+&\.{k}&$k$&Degrees of freedom\cr
\vskip 1ex

@<Probability distributions@>=
double tDistribution::p_from_k_x(double k, double x) {
    double p, q, bound;
    int which = 1, status;

    cdft(&which, &p, &q, &x, &k, &status, &bound);
    if (status != 0) {
        throw(out_of_range("tDistribution::p_from_k_x: Result out of bounds"));
    }
    return p;
}

double tDistribution::x_from_p_k(double p, double k) {
    double q, x, bound;
    int which = 2, status;

    q = 1 - p;
    cdft(&which, &p, &q, &x, &k, &status, &bound);
    if (status != 0) {
        throw(out_of_range("tDistribution::x_from_p_k: Result out of bounds"));
    }
    return x;
}

double tDistribution::k_from_p_x(double p, double x) {
    double q, k, bound;
    int which = 3, status;

    q = 1 - p;
    cdft(&which, &p, &q, &x, &k, &status, &bound);
    if (status != 0) {
        throw(out_of_range("tDistribution::k_from_p_x: Result out of bounds"));
    }
    return k;
}

@
The {\it cumulative distribution function} (CDF) $P$ value
is the probability a given value $x$ of the random
variable will occur.  It is computed by integrating the
probability density function $f(x)$ from
$-\infty$ to $x$:
$$P=\int_{-\infty}^{|x|}f(x)\,dx.$$

@<Probability distributions@>=
double tDistribution::CDF_P(double x) {
    return p_from_k_x(k, x);
}

@*1 $F$ distribution.

If $A$ and $B$ are independent random variables with
chi-square distributions with $u$ and $v$ degrees of
freedom respectively, the ratio of these variables
divided by their degrees of freedom
$${{A/u}\over{B/v}}$$ follows the
{\it F} distribution with
probability density function
$$f_x(x;u,v)={{\Gamma\Bigl({{u+v}\over 2}\Bigr)\Bigl({u\over v}\Bigr)^{u/2}x^{u/2-1}}
    \over{\Gamma\Bigl({u\over 2}\Bigr)\Gamma\Bigl({v\over 2}\Bigr)
    \Bigl[\Bigl({u\over v}\Bigr)x+1\Bigr]^{(u+v)/2}}}, \qquad 0 < x < \infty.$$

@<Class definitions@>=

class FDistribution : public probabilityDistribution {

private:@/
    virtual string distributionName(void) {
        return "F";
    }

    double u;                         // Numerator degrees of freedom
    double v;                         // Denominator degrees of freedom

public:@/
    /*  The default constructor creates an F
        distribution with $u=8$, $v=8$.  */
    FDistribution() {
        set_u_v();
    }

    FDistribution(double c_u = 8, double c_v = 8) {
        set_u_v(c_u, c_v);
    }

    //  Accessors

    void set_u(double c_u = 8) {
        u = c_u;
    }

    double get_u(void) {
        return u;
    }

    void set_v(double c_v = 8) {
        v = c_v;
    }

    double get_v(void) {
        return v;
    }

    void set_u_v(double c_u = 8, double c_v = 8) {
        set_u(c_u);
        set_v(c_v);
    }

    //  Implementations of abstract virtual functions

    double mean(void) {
        return v / (v - 2);
    }

    double stdev(void) {
        return sqrt((2 * (v * v) * (u + v - 2)) / (u * ((v - 2) * (v - 2)) * (v - 4)));
        // $\sigma=\sqrt{{2v^2(u+v-2)}\over{u(v-2)^2(v-4)}}$
    }

    double skewness(void) {
        return (2 * sqrt(2.0) * sqrt(v - 4) * (2 * u + v - 2)) /
               (sqrt(u) * (v - 6) * sqrt(u + v - 2));
        // $\gamma_1={{2\sqrt{2}\sqrt{v-4}(2u+v-2)}\over{\sqrt{u}(v-6)\sqrt{u+v-2}}}$
    }

    double kurtosisExcess(void) {
        return (12 * ((v - 4) * (v - 2) * (v - 2) +
               u * (u + v - 2) * (5 * v - 22))) /
               (u * (v - 8) * (v - 6) * (u + v - 2));
               /* $\gamma_2={{12(v-4)(v-2)^2+u(u+v-2)(5v-22)}\over
                  {u(v-8)(v-6)(u+v-2)}}$ */
    }

    //  Static transformation functions

    static double p_from_u_v_x(double u, double v, double x);
    static double x_from_p_u_v(double p, double u, double v);
    static double u_from_p_v_x(double p, double v, double x);
    static double v_from_p_u_x(double p, double u, double x);

    double CDF_P(double x);           // $P=\int_{-\infty}^{|x|}f(x)\,dx$

    //  Write description of distribution to output stream
    void writeParameters(ostream &of) {
        probabilityDistribution::writeParameters(of);
        of << "    Numerator DF = " << u << "  Denominator DF  = " << v << "\n";
    }
};

@
The following transformation functions are |static| methods of
the |FDistribution| class.  They permit calculation of
properties of the $F$ distribution without reference to a
particular |FDistribution| object.

The functions operate on the following quantities:

\vskip 1ex
\settabs 8 \columns
\+&\.{p}&$P$&Cumulative probability distribution\cr
\+&\.{x}&$x$&Value of random variable\cr
\+&\.{u}&$u$&Numerator degrees of freedom\cr
\+&\.{v}&$v$&Denominator degrees of freedom\cr
\vskip 1ex

@<Probability distributions@>=
double FDistribution::p_from_u_v_x(double u, double v, double x) {
    double p, q, bound;
    int which = 1, status;

    cdff(&which, &p, &q, &x, &u, &v, &status, &bound);
    if (status != 0) {
        throw(out_of_range("FDistribution::p_from_u_v_x: Result out of bounds"));
    }
    return p;
}

double FDistribution::x_from_p_u_v(double p, double u, double v) {
    double q, x, bound;
    int which = 2, status;

    q = 1 - p;
    cdff(&which, &p, &q, &x, &u, &v, &status, &bound);
    if (status != 0) {
        throw(out_of_range("FDistribution::x_from_p_u_v: Result out of bounds"));
    }
    return x;
}

double FDistribution::u_from_p_v_x(double p, double v, double x) {
    double q, u, bound;
    int which = 3, status;

    q = 1 - p;
    cdff(&which, &p, &q, &x, &u, &v, &status, &bound);
    if (status != 0) {
        throw(out_of_range("FDistribution::u_from_p_v_x: Result out of bounds"));
    }
    return u;
}

double FDistribution::v_from_p_u_x(double p, double u, double x) {
    double q, v, bound;
    int which = 4, status;

    q = 1 - p;
    cdff(&which, &p, &q, &x, &u, &v, &status, &bound);
    if (status != 0) {
        throw(out_of_range("FDistribution::v_from_p_u_x: Result out of bounds"));
    }
    return v;
}

@
The {\it cumulative distribution function} (CDF) $P$ value
is the probability of $n$ successes in $x$ trials.

@<Probability distributions@>=
double FDistribution::CDF_P(double x) {
    return p_from_u_v_x(u, v, x);
}

@*1 Poisson distribution.

The {\it Poisson distribution} gives the probability of
an event with an arrival rate $\lambda$ for a given interval
of occurring in an period of $x$ intervals.  The
the probability density function of the Poisson distribution is
$$f(x;\lambda)={{e^{-\lambda}\lambda^x}\over{x!}}, \qquad x=0,1,2\ldots.$$

@<Class definitions@>=

class poissonDistribution : public probabilityDistribution {

private:@/
    virtual string distributionName(void) {
        return "poisson";
    }

    double lambda;                    // Arrival rate

public:@/
    /*  The default constructor creates a poisson
        distribution with $\lambda=0.5$.  */
    poissonDistribution() {
        set_lambda();
    }

    poissonDistribution(double c_lambda = 0.5) {
        set_lambda(c_lambda);
    }

    //  Accessors

    void set_lambda(double c_lambda = 0.5) {
        lambda = c_lambda;
    }

    double get_lambda(void) {
        return lambda;
    }

    //  Implementations of abstract virtual functions

    double mean(void) {
        return lambda;
    }

    double stdev(void) {
        return sqrt(lambda);          // $\sigma=\sqrt{\lambda}$
    }

    double skewness(void) {
        return 1 / sqrt(lambda);      // $\gamma_1=1/\sqrt{\lambda}$
    }

    double kurtosisExcess(void) {
        return 1 / lambda;            // $\gamma_2=1/\lambda$
    }

    //  Static transformation functions

    static double p_from_lambda_x(double lambda, double x);
    static double x_from_p_lambda(double p, double lambda);
    static double lambda_from_p_x(double p, double x);

    //  Analysis methods

    double CDF_P(double x);           // $P=\int_{-\infty}^{|x|}f(x)\,dx$

    //  Write description of distribution to output stream
    void writeParameters(ostream &of) {
        probabilityDistribution::writeParameters(of);
        of << "    Arrival rate = " << lambda << "\n";
    }
};

@
The following transformation functions are |static| methods of
the |poissonDistribution| class.  They permit calculation of
properties of the Poisson distribution without reference to a
particular |poissonDistribution| object.

The functions operate on the following quantities:

\vskip 1ex
\settabs 8 \columns
\+&\.{p}&$P$&Cumulative probability distribution\cr
\+&\.{x}&$x$&Value of random variable\cr
\+&\.{lambda}&$\lambda$&Arrival rate\cr
\vskip 1ex

@<Probability distributions@>=
double poissonDistribution::p_from_lambda_x(double lambda, double x) {
    double p, q, bound;
    int which = 1, status;

    cdfpoi(&which, &p, &q, &x, &lambda, &status, &bound);
    if (status != 0) {
        throw(out_of_range("poissonDistribution::p_from_lambda_x: Result out of bounds"));
    }
    return p;
}

double poissonDistribution::x_from_p_lambda(double p, double lambda) {
    double q, x, bound;
    int which = 2, status;

    q = 1 - p;
    cdfpoi(&which, &p, &q, &x, &lambda, &status, &bound);
    if (status != 0) {
        throw(out_of_range("poissonDistribution::x_from_p_lambda: Result out of bounds"));
    }
    return x;
}

double poissonDistribution::lambda_from_p_x(double p, double x) {
    double q, lambda, bound;
    int which = 3, status;

    q = 1 - p;
    cdfpoi(&which, &p, &q, &x, &lambda, &status, &bound);
    if (status != 0) {
        throw(out_of_range("poissonDistribution::lambda_from_p_x: Result out of bounds"));
    }
    return lambda;
}

@
The {\it cumulative distribution function} (CDF) $P$ value
is the probability a given value $x$ of the random
variable will occur.  It is computed by integrating the
probability density function $f(x)$ from
$-\infty$ to $x$:
$$P=\int_{-\infty}^{|x|}f(x)\,dx.$$

@<Probability distributions@>=
double poissonDistribution::CDF_P(double x) {
    return p_from_lambda_x(lambda, x);
}

@*1 Binomial distribution.

An experiment consisting of $n$ independent trials, each
with probability $r$ of success, will follow a
{\it binomial distribution} with probability density function
$$f_x(x;r,n)={n\choose x}r^x(1-r)^{n-x}, \qquad n=1,2,\ldots \; x=0,1,\ldots,n$$
where the notation
$${n\choose x}={{n!}\over{x!(n-x)!}}$$
represents the number of ways of choosing $n$ items from a set of $x$.
(We use $r$ to represent the probability in the above equations
instead of the customary $p$ to avoid confusion with $P$, the
value of the cumulative distribution function.)  For an experiment
with equal probability of success or failure such as flipping
a coin, the probability $r$ is 0.5.

As the number of trials becomes large, the binomial distribution
approaches the normal distribution; in such circumstances you may
wish to use the |approximateBinomial| method of the
|normalDistribution| class.

@<Class definitions@>=

class binomialDistribution : public probabilityDistribution {

private:@/
    virtual string distributionName(void) {
        return "binomial";
    }

    double n;                         // Number of trials
    double r;                         // Probability of success in each trial

public:@/
    /*  The default constructor creates a binomial
        distribution with $n=2$, $r=0.5$.  */
    binomialDistribution() {
        set_n_r();
    }

    binomialDistribution(double c_n = 2, double c_r = 0.5) {
        set_n_r(c_n, c_r);
    }

    //  Accessors

    void set_n(double c_n = 2) {
        n = c_n;
    }

    double get_n(void) {
        return n;
    }

    void set_r(double c_r = 0.5) {
        r = c_r;
    }

    double get_r(void) {
        return r;
    }

    void set_n_r(double c_n = 2, double c_r = 0.5) {
        set_n(c_n);
        set_r(c_r);
    }

    //  Implementations of abstract virtual functions

    double mean(void) {
        return n * r;
    }

    double stdev(void) {
        return sqrt(n * r * (1 - r)); // $\sigma=\sqrt{n r (1-r)}$
    }

    double skewness(void) {
        return (1 - (2 * r)) / stdev(); // $\gamma_1={{1-2r}\over\sigma}$
    }

    double kurtosisExcess(void) {
        return (1 - (6 * (1 - r) * r)) / (n * (1 - r) * r);
               // $\gamma_2={{1-6(1-r)r}\over{n(1-r)r}}$
    }

    //  Static transformation functions

    static double p_from_n_r_s(double n, double r, double s);
    static double s_from_p_r_n(double p, double r, double n);
    static double n_from_p_r_s(double p, double r, double s);
    static double r_from_p_n_s(double p, double n, double s);

    //  Analysis methods

    @<Compute $z$ score@>;

    double CDF_P(double x);           // $P=\int_{-\infty}^{|x|}f(x)\,dx$

    //  Write description of distribution to output stream
    void writeParameters(ostream &of) {
        probabilityDistribution::writeParameters(of);
        of << "    Trials = " << n << "  Probability  = " << r << "\n";
    }
};

@
The following transformation functions are |static| methods of
the |binomialDistribution| class.  They permit calculation of
properties of the binomial distribution without reference to a
particular |binomialDistribution| object.

The functions operate on the following quantities:

\vskip 1ex
\settabs 8 \columns
\+&\.{p}&$P$&Cumulative probability distribution\cr
\+&\.{n}&$n$&Number of trials\cr
\+&\.{r}&$r$&Probability of success in each trial\cr
\+&\.{s}&$s$&Number of successes\cr
\vskip 1ex

@<Probability distributions@>=
double binomialDistribution::p_from_n_r_s(double n, double r, double s) {
    double p, q, ri, bound;
    int which = 1, status;

    ri = 1 - r;
    cdfbin(&which, &p, &q, &s, &n, &r, &ri, &status, &bound);
    if (status != 0) {
        throw(out_of_range("binomialDistribution::p_from_n_r_s: Result out of bounds"));
    }
    return p;
}

double binomialDistribution::s_from_p_r_n(double p, double r, double n) {
    double q, ri, s, bound;
    int which = 2, status;

    q = 1 - p;
    ri = 1 - r;
    cdfbin(&which, &p, &q, &s, &n, &r, &ri, &status, &bound);
    if (status != 0) {
        throw(out_of_range("binomialDistribution::s_from_p_r_n: Result out of bounds"));
    }
    return s;
}

double binomialDistribution::n_from_p_r_s(double p, double r, double s) {
    double q, n, ri, bound;
    int which = 3, status;

    q = 1 - p;
    ri = 1 - r;
    cdfbin(&which, &p, &q, &s, &n, &r, &ri, &status, &bound);
    if (status != 0) {
        throw(out_of_range("binomialDistribution::n_from_p_r_s: Result out of bounds"));
    }
    return n;
}

double binomialDistribution::r_from_p_n_s(double p, double n, double s) {
    double q, r, ri, bound;
    int which = 4, status;

    q = 1 - p;
    cdfbin(&which, &p, &q, &s, &n, &r, &ri, &status, &bound);
    if (status != 0) {
        throw(out_of_range("binomialDistribution::r_from_p_n_s: Result out of bounds"));
    }
    return r;
}

@
The {\it cumulative distribution function} (CDF) $P$ value
is the probability a given value $x$ of the random
variable will occur.  It is computed by integrating the
probability density function $f(x)$ from
$-\infty$ to $x$:
$$P=\int_{-\infty}^{|x|}f(x)\,dx.$$

@<Probability distributions@>=
double binomialDistribution::CDF_P(double x) {
    return p_from_n_r_s(n, r, x);
}

@*1 Negative binomial distribution.

The {\it negative binomial distribution} for a series of
independent trials each with a probability of success $r$
gives the probability of $n$ successes in a sequence of $x$
trials.  The probability density function is
$$f_x(x;n,r)={x-1\choose n-1}(1-r)^{x-n}r^n, \qquad n=1,2,\ldots \; x=n, n+1,\ldots.$$
(We use $r$ to represent the probability in the above equations
instead of the customary $p$ to avoid confusion with $P$, the
value of the cumulative distribution function.)  For an experiment
with equal probability of success or failure such as flipping
a coin, the probability $r$ is 0.5.  Note that the probability
is necessarily zero when $x<n$ (you can't have more successes
than trials)!

@<Class definitions@>=

class negativeBinomialDistribution : public probabilityDistribution {

private:@/
    virtual string distributionName(void) {
        return "negative binomial";
    }

    double n;                         // Number of successes
    double r;                         // Probability of success in each trial

public:@/
    /*  The default constructor creates a negative binomial
        distribution with $n=2$, $r=0.5$.  */
    negativeBinomialDistribution() {
        set_n_r();
    }

    negativeBinomialDistribution(double c_n = 2, double c_r = 0.5) {
        set_n_r(c_n, c_r);
    }

    //  Accessors

    void set_n(double c_n = 2) {
        n = c_n;
    }

    double get_n(void) {
        return n;
    }

    void set_r(double c_r = 0.5) {
        r = c_r;
    }

    double get_r(void) {
        return r;
    }

    void set_n_r(double c_n = 2, double c_r = 0.5) {
        set_n(c_n);
        set_r(c_r);
    }

    //  Implementations of abstract virtual functions

    double mean(void) {
        return (n * (1 - r)) / r;
    }

    double stdev(void) {
        return sqrt(n * (1 - r)) / r; // $\sigma={{\sqrt{n (1-r)}}\over{r}}$
    }

    double skewness(void) {
        return (2 - r) / sqrt(n * (1 - r)); // $\gamma_1={{2-r}\over\sqrt{n (1-r)}}$
    }

    double kurtosisExcess(void) {
        return (6 * (1 - r) + (r * r)) / (n * (1 - r));
               // $\gamma_2={{6(1-r)+r^2}\over{n(1-r)}}$
    }

    //  Static transformation functions

    static double p_from_n_r_s(double n, double r, double s);
    static double s_from_p_r_n(double p, double r, double n);
    static double n_from_p_r_s(double p, double r, double s);
    static double r_from_p_n_s(double p, double n, double s);

    double CDF_P(double x);           // $P=\int_{-\infty}^{|x|}f(x)\,dx$

    //  Write description of distribution to output stream
    void writeParameters(ostream &of) {
        probabilityDistribution::writeParameters(of);
        of << "    Successes = " << n << "  Probability  = " << r << "\n";
    }
};

@
The following transformation functions are |static| methods of
the |negativeBinomialDistribution| class.  They permit calculation of
properties of the negative binomial distribution without reference to a
particular |negativeBinomialDistribution| object.

The functions operate on the following quantities:

\vskip 1ex
\settabs 8 \columns
\+&\.{p}&$P$&Cumulative probability distribution\cr
\+&\.{s}&$s$&Number of successes\cr
\+&\.{r}&$r$&Probability of success in each trial\cr
\+&\.{n}&$n$&Number of trials\cr
\vskip 1ex

@<Probability distributions@>=
double negativeBinomialDistribution::p_from_n_r_s(double n, double r, double s) {
    double p, q, ri, bound;
    int which = 1, status;

    ri = 1 - r;
    cdfnbn(&which, &p, &q, &s, &n, &r, &ri, &status, &bound);
    if (status != 0) {
        throw(out_of_range("negativeBinomialDistribution::p_from_n_r_s: Result out of bounds"));
    }
    return p;
}

double negativeBinomialDistribution::s_from_p_r_n(double p, double r, double n) {
    double q, ri, s, bound;
    int which = 2, status;

    q = 1 - p;
    ri = 1 - r;
    cdfnbn(&which, &p, &q, &s, &n, &r, &ri, &status, &bound);
    if (status != 0) {
        throw(out_of_range("negativeBinomialDistribution::s_from_p_r_n: Result out of bounds"));
    }
    return s;
}

double negativeBinomialDistribution::n_from_p_r_s(double p, double r, double s) {
    double q, n, ri, bound;
    int which = 3, status;

    q = 1 - p;
    ri = 1 - r;
    cdfnbn(&which, &p, &q, &s, &n, &r, &ri, &status, &bound);
    if (status != 0) {
        throw(out_of_range("negativeBinomialDistribution::n_from_p_r_s: Result out of bounds"));
    }
    return n;
}

double negativeBinomialDistribution::r_from_p_n_s(double p, double n, double s) {
    double q, r, ri, bound;
    int which = 4, status;

    q = 1 - p;
    cdfnbn(&which, &p, &q, &s, &n, &r, &ri, &status, &bound);
    if (status != 0) {
        throw(out_of_range("negativeBinomialDistribution::r_from_p_n_s: Result out of bounds"));
    }
    return r;
}

@
The {\it cumulative distribution function} (CDF) $P$ value
is the probability of $n$ successes in $x$ trials.

@<Probability distributions@>=
double negativeBinomialDistribution::CDF_P(double x) {
    return p_from_n_r_s(n, r, x);
}

@** Descriptive statistics.

Descriptive statistics refers to empirical tests applied
to data sets, measuring quantities such as their mean
value, variance, and shape.  We implement this via the
template class |dataTable|, which is a generalisation
of the STL |vector| template with additional methods
which compute statistical metrics.  The elements in
the |dataTable| may be any type which can be converted
to a |double|.  The elements of a |dataTable| with $n$
items are referred to as $x_1,\ldots x_n$.

@<Class definitions@>=
template <class T> class dataTable : public vector<T>
{
public:@/
    double mean(void);                // Arithmetic mean value ($\mu$)
    double geometricMean(void);       // Geometric mean
    double harmonicMean(void);        // Harmonic mean
    double median(void);              // Median (central value)
    double RMS(void);                 // Root mean square
    T mode(void);                     // Mode (most frequent value)
    double percentile(double k);      // Percentile ($0\leq k\leq 1$)
    double quartile(int q)            // Quartiles (implemented with |percentile|)
    {
        assert(q >= 1 && q <= 3);
        return percentile(0.25 * q);
    }

    double variance(void);            // Variance ($\sigma^2$)
    double varianceMLE(void)          // Variance Maximum Likelihood Estimate
    {
        return (variance() * (size() - 1)) / size();
    }
    double stdev(void)                // Standard deviation ($\sigma$)
    {
        return sqrt(variance());
    }
    double stdevMLE(void)             // Standard deviation Maximum Likelihood Estimate
    {
        return sqrt(varianceMLE());
    }

    double centralMoment(int k);      // $k$th central moment
    double skewness(void);            // Skewness
    double kurtosis(void);            // Kurtosis
};

@*2 Arithmetic mean.

The arithmetic {\bf mean} is the average value of the
elements in the |dataTable|
$${1\over n}\sum_{i=1}^n x_i.$$
The result is always a |double| regardless of the data
type of the |dataTable|.

@<Class definitions@>=
template <class T> double dataTable<T>::mean(void)
{
    typename dataTable<T>::iterator p;
    double m = 0;

    for (p = begin(); p != end(); p++) {
        m += *p;
    }
    return m / size();
}

@*2 Geometric mean.

The {\bf geometric mean} of a set of $n$ values $x_i$ is
$$\prod_{i=1}^n x_i^{1\over n}.$$
The result is always a |double| regardless of the data
type of the |dataTable|.

@<Class definitions@>=
template <class T> double dataTable<T>::geometricMean(void)
{
    typename dataTable<T>::iterator p;
    double g = 1, ni = 1.0 / size();

    for (p = begin(); p != end(); p++) {
        g *= pow(*p, ni);
    }
    return g;
}

@*2 Harmonic mean.

The {\bf harmonic mean} of a set of $n$ values $x_i$ is
$${n\over{\sum_{i=1}^n {1\over x_i}}}.$$
The result is always a |double| regardless of the data
type of the |dataTable|.

@<Class definitions@>=
template <class T> double dataTable<T>::harmonicMean(void)
{
    typename dataTable<T>::iterator p;
    double d = 0;

    for (p = begin(); p != end(); p++) {
        d += 1.0 / (*p);
    }
    return size() / d;
}

@*2 Root mean square.

The {\bf root mean square} of a set of $n$ values $x_i$ is
$$\sqrt{{1\over n}\sum_{i=1}^n x_i^2}$$
The result is always a |double| regardless of the data
type of the |dataTable|.

@<Class definitions@>=
template <class T> double dataTable<T>::RMS(void)
{
    typename dataTable<T>::iterator p;
    double sum = 0;

    for (p = begin(); p != end(); p++) {
        sum += (*p) * (*p);
    }
    return sqrt(sum / size());
}

@*2 Median.

The {\bf median} or central value of a distribution is the value
for which half the elements are smaller and half are greater.
For a distribution with an even number of elements, the median
is defined as the arithmetic mean of the two elements in
the middle of the sorted distribution.  We implement the median
by using the |sort| algorithm to create an ordered copy of the
distribution, then extract the central value.
The result is always a |double| regardless of the data
type of the |dataTable|.

@<Class definitions@>=
template <class T> double dataTable<T>::median(void)
{
    dataTable<T> v(*this);

    sort(v.begin(), v.end());
    if (v.size() & 1) {
        return (double) v[(v.size() + 1) / 2];
    } else {
        return (double) ((v[(v.size() / 2) - 1] + v[v.size() / 2]) / 2.0);
    }
}

@*2 Mode.

The {\bf mode} is the value which occurs most frequently in the
collection of data.  This really only makes sense when the
|dataTable| is instantiated with an integral data type, but
we blither away regardless of the data type; the user may be
employing |double|s to represent integers too large for the
widest integral type.  The type returned is the same as the
the elements of the |dataTable|.

We compute the mode by sorting the data table in ascending
order, then scanning it linearly for repeated values, keeping
track of the item with the most repeats.  If two or more values
share the largest number of occurrences, the largest value is
returned as the mode.

@<Class definitions@>=
template <class T> T dataTable<T>::mode(void)
{
    dataTable<T> v(*this);
    T cmode, bmode = 0;
    int n = 0, most = 0;
    typename dataTable<T>::iterator p;

    sort(v.begin(), v.end());
    p = v.begin();
    cmode = *p++;
    while (p != v.end()) {
        T cval = *p++;
        if (cval == cmode) {
            n++;
        } else {
            if (n > most) {
                most = n;
                bmode = cmode;
            }
            n = 1;
            cmode = cval;
        }
    }
    if (n > most) {                   // Mode may be last item in table
        most = n;
        bmode = cmode;
    }
    return bmode;
}

@*2 Percentile.

The $k$th {\bf percentile} is the value such that $100k$\% of
the data are less than or equal to that value.  The result returned
is always a |double|.  If the result of multiplying the
size of the |dataTable| by $k$ is an integer, that item is
returned.  Otherwise, the mean of the two adjacent items is
returned.  Note that |percentile(0.5)| is a synonym for |median()|.

@<Class definitions@>=
template <class T> double dataTable<T>::percentile(double k)
{
    dataTable<T> v(*this);
    double index, result;
    int i;

    assert(k >= 0 && k <= 1);

    sort(v.begin(), v.end());

    index = v.size() * k;
    if (index != floor(index)) {
        i = ((int) index);
        result = v[i];
    } else {
        i = ((int) index) - 1;
        result = (v[i] + v[i + 1]) / 2.0;
    }
    return result;
}

@*2 Variance.

The {\bf variance} ($\sigma^2$) of a set of $n$ values $x_i$ is
$${1\over {n-1}}\sum_{i=1}^n {(x_i-\overline{x})^2}$$
where $\overline{x}$ is the |mean()| value of the
|dataTable|.
The result is always a |double| regardless of the data
type of the |dataTable|.

@<Class definitions@>=
template <class T> double dataTable<T>::variance(void)
{
    typename dataTable<T>::iterator p;
    double mu = mean(), sum = 0;

    for (p = begin(); p != end(); p++) {
        double t = ((*p) - mu);
        sum += t * t;
    }
    return sum / (size() - 1);
}

@*2 Central moments.

The $k$th {\bf central moment} of a set of $n$ values $x_i$ is
$${1\over {n}}\sum_{i=1}^n {(x_i-\overline{x})^k}$$
where $\overline{x}$ is the |mean()| value of the
|dataTable|.
The result is always a |double| regardless of the data
type of the |dataTable|.

@<Class definitions@>=
template <class T> double dataTable<T>::centralMoment(int k)
{
    typename dataTable<T>::iterator p;
    double mu = mean(), sum = 0;

    for (p = begin(); p != end(); p++) {
        sum += pow((*p) - mu, (double) k);
    }
    return sum / size();
}

@*2 Skewness.

The {\bf skewness} of a distribution is a measure of
its asymmetry with respect to the mean.  It is defined
as the third central moment of the distribution divided
by the cube of the maximum likelihood estimate of the standard deviation.
The result is always a |double| regardless of the data
type of the |dataTable|.

@<Class definitions@>=
template <class T> double dataTable<T>::skewness(void)
{
    double sigma = stdevMLE();

    return centralMoment(3) / (sigma * sigma * sigma);
}

@*2 Kurtosis.

{\bf Kurtosis} is a measure of how sharply peaked the
distribution is.
It is defined
as the fourth central moment of the distribution divided
by the square of the maximum likelihood estimate of the variance.
The result is always a |double| regardless of the data
type of the |dataTable|.

@<Class definitions@>=
template <class T> double dataTable<T>::kurtosis(void)
{
    double v = varianceMLE();

    return centralMoment(4) / (v * v);
}

@** Test program.

@(statlib_test.cc@>=

@<Test program include files@>;

@<Show how to call test program@>;

int main(int argc, char *argv[])
{
    extern char *optarg;            /* Imported from |getopt| */
//  extern int optind;

    int opt;

@<Process command-line options@>;@/

    @<Test descriptive statistics template@>;

#if 1

    //  Statistical library tests

    {
        normalDistribution nd(100, 7.07);

        cout << "Normal dist: mu = " << nd.get_mu() << " sigma = " <<
                nd.get_sigma() << " variance = " << nd.variance() << "\n";
        cout << "  P = " << nd.CDF_P(110) << " Q = " << nd.CDF_Q(110) << "\n";
        nd.writeParameters(cout);

        nd.approximateBinomial(200);
        cout << "Normal dist: mu = " << nd.get_mu() << " sigma = " <<
                nd.get_sigma() << " variance = " << nd.variance() << "\n";
        cout << "  P = " << nd.CDF_P(110) << " Q = " << nd.CDF_Q(110) <<
                "  z = " << nd.z_score(110) << "\n";
    }

    {
        poissonDistribution pd(10);

        pd.writeParameters(cout);
        cout << "Poisson dist: lambda = " << pd.get_lambda() << "\n";
        cout << "  P = " << pd.CDF_P(5) << " Q = " << pd.CDF_Q(5) << "\n";
    }

    {
        chiSquareDistribution xd(32);

        xd.writeParameters(cout);
        cout << "Chi-square dist: k = " << xd.get_k() << "\n";
        cout << "  P = " << xd.CDF_P(36) << " Q = " << xd.CDF_Q(36) << "\n";
    }

    {
        gammaDistribution gd(7.5, 3.75);

        gd.writeParameters(cout);
        cout << "Gamma dist: alpha = " << gd.get_alpha() <<
                "  lambda = " << gd.get_lambda() << "\n";
        cout << "  P = " << gd.CDF_P(2.2) << " Q = " << gd.CDF_Q(2.2) << "\n";
    }

    {
        betaDistribution bd(3, 4);

        bd.writeParameters(cout);
        cout << "Beta dist: a = " << bd.get_a() <<
                "  b = " << bd.get_b() << "\n";
        cout << "  P = " << bd.CDF_P(0.4) << " Q = " << bd.CDF_Q(0.4) << "\n";
    }

    {
        tDistribution xd(32);

        xd.writeParameters(cout);
        cout << "t dist: k = " << xd.get_k() << "\n";
        cout << "  P = " << xd.CDF_P(0.8) << " Q = " << xd.CDF_Q(0.8) << "\n";
    }

    {
        FDistribution fd(12, 16);

        fd.writeParameters(cout);
        cout << "F dist: u = " << fd.get_u() << "  v = " << fd.get_v() << "\n";
        cout << "  P = " << fd.CDF_P(0.8) << " Q = " << fd.CDF_Q(0.8) << "\n";
    }

    {
        binomialDistribution bd(200, 0.5);

        cout << "Binomial dist: n = " << bd.get_n() << " r = " <<
                bd.get_r() << " variance = " << bd.variance() << "\n";
        cout << "  P = " << bd.CDF_P(110) << " Q = " << bd.CDF_Q(110) <<
                "  z = " << bd.z_score(110) << "\n";
        bd.writeParameters(cout);
    }

    {
        negativeBinomialDistribution bd(10, 0.5);

        cout << "Negative binomial dist: n = " << bd.get_n() << " r = " <<
                bd.get_r() << " variance = " << bd.variance() << "\n";
        cout << "  P = " << bd.CDF_P(15) << " Q = " << bd.CDF_Q(15) << "\n";
        bd.writeParameters(cout);
    }
#endif

    return 0;
}

@
We use |getopt| to process command line options.  This
permits aggregation of options without arguments and
both \.{-d}{\it arg} and \.{-d} {\it arg} syntax.
@<Process command-line options@>=
    while ((opt = getopt(argc, argv, "nu-:")) != -1) {
        switch (opt) {

            case 'u':             /* \.{-u}  Print how-to-call information */
            case '?':
                usage();
                return 0;

            case '-':             /* \.{--}  Extended options */
                switch (optarg[0]) {
                    case 'c':     /* \.{--copyright} */
                        cout << "This program is in the public domain.\n";
                        return 0;

                    case 'h':     /* \.{--help} */
                        usage();
                        return 0;

                    case 'v':     /* \.{--version} */
                        cout << PRODUCT << " " << VERSION << "\n";
                        cout << "Last revised: " << REVDATE << "\n";
                        cout << "The latest version is always available\n";
                        cout << "at http://www.fourmilab.ch/eggtools/eggshell\n";
                        return 0;
                }
        }
    }

@ Procedure |usage|
prints how-to-call information.

@<Show how to call test program@>=
static void usage(void)
{
    cout << PRODUCT << "  --  Analyse eggsummary files.  Call:\n";
    cout << "              " << PRODUCT << " [options] [infile] [outfile]\n";
    cout << "\n";
    cout << "Options:\n";
    cout << "           --copyright       Print copyright information\n";
    cout << "           -u, --help        Print this message\n";
    cout << "           --version         Print version number\n";
    cout << "\n";
    cout << "by John Walker\n";
    cout << "http://www.fourmilab.ch/\n";
}

@
The following test the descriptive statistics facilities
implemented by the |dataTable| template class.

@<Test descriptive statistics template@>=
    {
        int i;
        dataTable<int> zot;
        dataTable<int>::iterator z;

#define PR for (z = zot.begin(); z != zot.end(); z++) { cout << *z << " "; } cout << "\n";

        for (i = 0; i < 10; i++) {
            zot.push_back(rand() & 0x3F);
        }
        PR;

        sort(zot.begin(), zot.end());
        PR;
        reverse(zot.begin(), zot.end());
        PR;

        cout << "Mean = " << zot.mean() << "\n";
        cout << "Geometric mean = " << zot.geometricMean() << "\n";
        cout << "Harmonic mean = " << zot.harmonicMean() << "\n";
        cout << "RMS = " << zot.RMS() << "\n";
        cout << "Median = " << zot.median() << "\n";
        cout << "Mode = " << zot.mode() << "\n";
        cout << "Percentile(0.5) = " << zot.percentile(0.5) << "\n";
        cout << "Quartile(1) = " << zot.quartile(1) << "\n";
        cout << "Quartile(3) = " << zot.quartile(3) << "\n";

        cout << "Variance = " << zot.variance() << "\n";
        cout << "Standard deviation = " << zot.stdev() << "\n";
        cout << "CentralMoment(3) = " << zot.centralMoment(3) << "\n";
        cout << "Skewness = " << zot.skewness() << "\n";
        cout << "Kurtosis = " << zot.kurtosis() << "\n";
    }

@
We need the following definitions to compile the test program.

@<Test program include files@>=

#include "config.h"         // Our configuration

/* C++ include files  */
#include <iostream>
#include <exception>
#include <stdexcept>
#include <string>
using namespace std;

#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#ifdef HAVE_GETOPT
#ifdef HAVE_UNISTD_H
#include <unistd.h>
#endif
#else
#include "getopt.h"     /* No system \.{getopt}---use our own */
#endif
#include "statlib.h"        // Class definitions for this package


@** Index.
The following is a cross-reference table for \.{statlib}.
Single-character identifiers are not indexed, nor are
reserved words.  Underlined entries indicate where
an identifier was declared.
