

A Distribution is a value from which you can draw pseudo-random numbers, or for which you can compute the logarithm of the probability density function with log_pdf. The Distribution discriminated union supports the most common probability distributions, such as Uniform and Normal. More details are on a separate documentation page.

For example, here is how to define a normal distribution once you have referenced Angara.Statistics.dll:

open Angara.Statistics

let distribution = Normal(37.0, 9.0)

Random number generator

To draw random numbers we use the Mersenne Twister, one of the most commonly used random number generators.

let generator = MT19937()
let random_values =
    generator.uniform_uint32(),  // low-level output
    generator.uniform_float64(), // a floating point number from [0,1)
    generator.normal()           // standard normal distribution with mean=0 and stdev=1
(3499211612u, 0.1354770041, -0.5456602766)

To reproduce the generator state, use either the exact seed or the copy constructor:

let seed = generator.get_seed()
let g' = MT19937(seed)
let g'' = MT19937(generator)
// the three instances should now produce
// exactly the same sequence of pseudo-random numbers
let test_3 = generator.normal(), g'.normal(), g''.normal()
(-2.165115627, -2.165115627, -2.165115627)

To draw a number from a non-standard distribution, use draw : MT19937 -> Distribution -> float. For example, here we generate an array of 1000 random numbers from the distribution defined above.

let samples = [| for _ in 1..1000 -> draw generator distribution |]
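
To illustrate what drawing from a non-standard normal distribution involves, a standard normal variate can be shifted and scaled to the target mean and standard deviation. The sketch below is independent of Angara.Statistics: it uses System.Random and the Box-Muller transform instead of MT19937, and the helper name standardNormal is hypothetical. It assumes that Normal(37.0, 9.0) is parameterized by mean and standard deviation, and it does not claim to show how draw works internally.

```fsharp
open System

// Hypothetical helper (not part of Angara.Statistics):
// one standard normal variate via the Box-Muller transform.
let standardNormal (rng: Random) : float =
    let u1 = 1.0 - rng.NextDouble()   // in (0, 1], so log u1 is finite
    let u2 = rng.NextDouble()
    sqrt (-2.0 * log u1) * cos (2.0 * Math.PI * u2)

// Shift and scale to mean 37 and standard deviation 9 (assumed parameterization)
let rng = Random(42)
let scaled = [| for _ in 1..10000 -> 37.0 + 9.0 * standardNormal rng |]
let sampleMean = Array.average scaled   // expected to be close to 37
```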

Probability density function

A single function, log_pdf : Distribution -> float -> float, computes the logarithm of the normalized probability density of any distribution at any valid point.
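
For a concrete sense of the quantity involved, the log-density of a normal distribution can be written out by hand. The helper normalLogPdf below is hypothetical and not part of the library. It assumes that the two arguments of Normal are the mean and the standard deviation, which fits the sample variance of about 81 reported further down but is not stated explicitly on this page.

```fsharp
// Hypothetical helper (not part of Angara.Statistics):
// log-density of a normal distribution with the given mean and standard deviation.
let normalLogPdf (mean: float) (stdev: float) (x: float) : float =
    let z = (x - mean) / stdev
    -0.5 * z * z - log stdev - 0.5 * log (2.0 * System.Math.PI)

// Density at the mean of a normal with mean 37 and standard deviation 9
let logDensityAtMean = normalLogPdf 37.0 9.0 37.0
let densityAtMean = exp logDensityAtMean   // 1 / (9 * sqrt(2 * pi)), roughly 0.0443
```

Working in log space keeps very small densities representable, and exp recovers the ordinary density when it is needed.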

If you do not have a Distribution but have a sample drawn from it, use the kernel density estimator kde : int -> float seq -> (float[] * float[]). The estimator takes the target number of points at which the density is to be evaluated. If this number is not a power of two, the number of points will be the next larger power of two. The estimator returns two arrays, of x values and of y values.
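
The rounding of the point count can be sketched as follows. The helper nextPowerOfTwo is hypothetical and only mirrors the rule described above; it is not taken from the library.

```fsharp
// Hypothetical helper (not part of Angara.Statistics):
// smallest power of two that is greater than or equal to n (for n >= 1).
let nextPowerOfTwo (n: int) : int =
    let rec grow p = if p >= n then p else grow (p * 2)
    grow 1

let requested = [ 16; 100; 128; 1000 ]
let actual = requested |> List.map nextPowerOfTwo   // [16; 128; 128; 1024]
```

Under this rule, asking for 100 points would yield arrays of length 128, so code that consumes the result should rely on the returned array lengths rather than on the requested count.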

The following sample code compares the non-parametric density of the sample with the exact density from the log_pdf function.

// estimate density curve at 128 points
let sampling_density_x, sampling_density_y = kde 128 samples
// compute exact probability density function
let analytic_density_y = [|for x in sampling_density_x -> exp(log_pdf distribution x)|]

open Angara.Charting
let chart = 
    Chart.ofList [Plot.line(sampling_density_x, analytic_density_y, thickness=7., stroke="lightgray")
                  Plot.line(sampling_density_x, sampling_density_y)]

Sample statistics

Use the summary and qsummary functions to quickly compute the count, range, mean and variance of a sample, and the bounds of its 95% and 68% intervals:

printfn "%A" (summary samples)
{count = 1000;
 min = 9.02008448;
 max = 67.19459218;
 mean = 37.14566509;
 variance = 80.56673735;}
printfn "%A" (qsummary samples)
{min = 9.02008448;
 lb95 = 19.81095462;
 lb68 = 28.21379981;
 median = 37.17099114;
 ub68 = 45.98669919;
 ub95 = 55.18024092;
 max = 67.19459218;}

The lb95 field in the qsummary record stands for "lower bound of the 95% interval", which is actually the 2.5% quantile. Similarly, ub95 stands for the 97.5% quantile.
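
The relation between interval bounds and quantiles can be checked with a small empirical quantile function. The helper quantile below is hypothetical and uses linear interpolation between order statistics; the interpolation rule that qsummary actually applies is not documented here and may differ. By the same arithmetic as for the 95% interval, the 68% bounds are taken to be the 16% and 84% quantiles.

```fsharp
// Hypothetical helper (not part of Angara.Statistics):
// empirical quantile of a non-empty sample, with linear interpolation
// between neighbouring order statistics.
let quantile (p: float) (data: float[]) : float =
    let sorted = Array.sort data
    let pos = p * float (sorted.Length - 1)
    let lo = int (floor pos)
    let hi = min (lo + 1) (sorted.Length - 1)
    sorted.[lo] + (pos - float lo) * (sorted.[hi] - sorted.[lo])

let data = [| for i in 0..100 -> float i |]   // 0.0, 1.0, ..., 100.0
let lb95 = quantile 0.025 data   // about 2.5
let ub95 = quantile 0.975 data   // about 97.5
let lb68 = quantile 0.16 data    // about 16.0
let ub68 = quantile 0.84 data    // about 84.0
```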

