Angara.Statistics


Supported probability distributions

Continuous random variables

The following code plots the probability density function (PDF) of a distribution supported by the package together with a histogram of samples drawn from it. This compares the output of the draw and log_pdf functions for the same distribution.

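open Angara.Statistics
open Angara.Charting

// Draws k samples from d and plots their normalized histogram over the PDF.
let chart d xmin xmax =
    let n, k, mt = 128, 131072, MT19937()   // bins, samples, generator
    let h = Seq.init k (fun _ -> draw mt d) |> histogram_ n xmin xmax
    // Each bin edge appears twice so that the histogram is drawn as a step line.
    let xh = [|for i in 0..2*n+1 ->
                xmin + float (i/2) * (xmax - xmin) / float n|]
    // Bin counts scaled to a density: count / (k * bin width); zero at both ends.
    let yh = [|for i in 0..2*n+1 ->
                if i=0 || i=2*n+1 then 0. else float n / (xmax - xmin) * float(h.[(i-1)/2]) / float k |]
    // PDF evaluated at the bin edges.
    let xpdf = [|for i in 0..n -> xmin + float i * (xmax - xmin) / float n|]
    let ypdf = Array.map (fun x -> exp(log_pdf d x)) xpdf
    Chart.ofList [Plot.line(xpdf, ypdf, thickness=7., stroke="lightgray")
                  Plot.line(xh, yh)]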
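
The scaling applied to the bin counts in yh can be written out explicitly. The symbols below are notation introduced here, not names from the package: with n bins of width Δ and k samples, the count h_j in bin j becomes a density estimate that is directly comparable to the PDF.

```latex
\Delta = \frac{x_{\max} - x_{\min}}{n}, \qquad
\hat{p}_j = \frac{h_j}{k\,\Delta} = \frac{n}{x_{\max} - x_{\min}} \cdot \frac{h_j}{k}
```

If every sample falls inside [xmin, xmax), the estimates sum to one when multiplied by the bin width, which is why the histogram and the PDF share one vertical scale.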

Uniform distribution

Uniform(lower_bound, upper_bound) is a uniform probability distribution on the interval [lower_bound, upper_bound). The upper bound must be greater than the lower bound.

let chart_uniform =
    let d = Uniform(-1.,2.)
    chart d -1.5 2.5

Log-uniform distribution

LogUniform(lower_bound, upper_bound). A uniform distribution of log(x), which is non-uniform in x. Both bounds must be greater than zero.

let chart_loguniform =
    let d = LogUniform(0.5,1.5)
    chart d 0. 2.
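
As a worked equation, assume the standard definition in which log(x) is uniform on [log a, log b), where a and b are shorthand introduced here for lower_bound and upper_bound. A change of variables then gives the density of x:

```latex
p(x) = \frac{1}{x\,(\ln b - \ln a)}, \qquad a \le x < b
```

For the example above, a = 0.5 and b = 1.5 give ln b - ln a = ln 3 ≈ 1.0986, so the density falls from about 1.82 at x = 0.5 to about 0.61 at x = 1.5.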

Linear distribution

Linear(lower_bound, upper_bound, density_at_lower_bound). The density function of this distribution is linear on the range [lower_bound, upper_bound) and is 'improbable' outside of it. The normalization condition (the area under the density must equal one) determines the density at the upper bound: density_at_upper_bound = 2/(upper_bound - lower_bound) - density_at_lower_bound. The density must be positive at both bounds, which restricts the permissible values of density_at_lower_bound. If density_at_lower_bound is outside the permissible range, it is brought to the nearest permissible value.

This distribution is useful as a component of Mixture (see below).

let chart_linear =
    let d = Linear(-1., 2., 0.2)
    chart d -1.5 2.5
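
The normalization argument can be checked numerically. The sketch below is a stand-alone illustration, not the package's implementation: linearDensity and integrate are hypothetical helpers, the permissible range is assumed to be [0, 2/(upper_bound - lower_bound)], and the density is assumed to be zero outside the range.

```fsharp
// Hypothetical helper, not part of Angara.Statistics: builds a linear
// density on [lower, upper) from the density at the lower bound.
let linearDensity (lower: float) (upper: float) (densityAtLower: float) =
    let width = upper - lower
    // Assumption: permissible values keep both end densities non-negative.
    let maxDensity = 2.0 / width
    let pLower = densityAtLower |> max 0.0 |> min maxDensity
    // Trapezoid area (pLower + pUpper) / 2 * width must equal 1.
    let pUpper = maxDensity - pLower
    fun (x: float) ->
        if x < lower || x >= upper then 0.0   // assumption: zero outside the range
        else pLower + (pUpper - pLower) * (x - lower) / width

// Midpoint-rule integration, used only to check the normalization.
let integrate (f: float -> float) (a: float) (b: float) (steps: int) =
    let h = (b - a) / float steps
    Seq.init steps (fun i -> f (a + (float i + 0.5) * h) * h) |> Seq.sum

let pdf = linearDensity (-1.0) 2.0 0.2
printfn "density near the upper bound: %f" (pdf 1.999999)
printfn "total probability: %f" (integrate pdf (-1.0) 2.0 1000)
```

With the parameters of the example above, the density rises from 0.2 at the lower bound towards 2/3 - 0.2 at the upper bound, and the integral over the range comes out as one.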

Normal distribution

Normal(mean, standard_deviation). This is often used for real-valued random variables whose exact distributions are not known.

let chart_normal =
    let d = Normal(37., 9.)
    chart d 10. 64.

Log-normal distribution

LogNormal(mean, standard_deviation_log). A normal distribution of log(x). The second parameter is the standard deviation of log(x), not the standard deviation of x.

let chart_lognormal =
    let d = LogNormal(37., 0.3)
    chart d 10. 90.

Exponential distribution

Exponential(mean) describes the time between events in a process in which events occur continuously and independently at a constant average rate. The only parameter must be greater than zero. Values of an exponentially distributed random variable are always greater than zero.

let chart_exponential =
    let d = Exponential(5.7)
    chart d 0. 20.

Gamma distribution

Gamma(alpha, beta) is a family of distributions of positive values. The parameters alpha and beta are sometimes called shape and rate. Both must be greater than zero. Gamma(1, 1/lambda) === Exponential(lambda), where lambda is the mean of the exponential distribution. A special case, Gamma(k/2, 1/2), is the chi-squared distribution χ²(k).

let chart_gamma =
    let d = Gamma(3.0, 0.5)
    chart d 0. 20.
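
The relation to the exponential distribution can be checked directly. The equations assume the usual shape-rate form of the gamma density; whether the package uses exactly this form internally is not stated here.

```latex
p(x \mid \alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha - 1} e^{-\beta x}, \qquad x > 0

p(x \mid 1, 1/\lambda) = \frac{1}{\lambda}\, e^{-x/\lambda}
```

The second line is the density of an exponential distribution with mean λ. Under the same assumption the mean of Gamma(alpha, beta) is alpha/beta, which is 6 for the example above.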

Discrete distributions

For these distributions the draw function always returns non-negative random values with a zero fractional part. The log_pdf function truncates the fractional part of its argument. The only exception is the Bernoulli distribution, for which log_pdf treats all argument values x > 0.5 as 1 and all other values as 0.

The following code plots the probability mass function and a histogram of samples for the discrete probability distributions supported by the package. This compares the output of the draw and log_pdf functions for the same distribution.

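// Draws k samples from d and plots their relative frequencies over the probability mass function.
let discrete_chart d =
    let n, k, mt = 20, 20480, MT19937()   // largest value shown, samples, generator
    let h = Array.create (n+1) 0
    // Count how often each value 0..n is drawn; larger values are ignored.
    seq {1..k} |> Seq.iter (fun _ ->
        let i = int(draw mt d) in if i<=n then h.[i] <- h.[i]+1)
    // Each value contributes three points: bar bottom, bar top, and a nan that breaks the line.
    let xh = [|for i in 0..3*n+2 ->
                if i%3=2 then nan 
                else float(i/3)|]
    // Bar heights are the observed relative frequencies.
    let yh = [|for i in 0..3*n+2 ->
                if i%3=0 then 0. 
                elif i%3=1 then float(h.[i/3])/float k 
                else nan |]
    // Bars of the same layout with heights taken from log_pdf.
    let ypdf = Array.mapi (fun i x ->
        if i%3=0 then 0.0 else if i%3=2 then nan else exp(log_pdf d x)) xh
    Chart.ofList [Plot.line(xh, ypdf, thickness=7., stroke="lightgray")
                  Plot.line(xh, yh)]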

Bernoulli

Bernoulli(mean) is the distribution of a yes/no experiment (1 or 0) that yields success with probability mean. Note that the log_pdf function returns the same value for all x > 0.5.

let chart_bernoulli =
    let d = Bernoulli(0.7)
    discrete_chart d
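
The argument handling described for log_pdf can be illustrated with a small stand-alone sketch. The functions bernoulliLogPmf and truncateArgument are hypothetical names written for this illustration; they mimic the documented behaviour and are not the package's source code.

```fsharp
// Hypothetical illustration of the documented Bernoulli rule:
// any argument above 0.5 counts as success (1), everything else as failure (0).
let bernoulliLogPmf (mean: float) (x: float) =
    if x > 0.5 then log mean else log (1.0 - mean)

// The other discrete distributions are described as dropping the fractional part.
let truncateArgument (x: float) = int x

printfn "%b" (bernoulliLogPmf 0.7 1.0 = bernoulliLogPmf 0.7 3.2)   // true
printfn "%d" (truncateArgument 4.9)                                 // 4
```

Under these assumptions an argument of 3.2 is indistinguishable from 1 for a Bernoulli distribution, while the same argument passed to another discrete distribution would be read as 3.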

Binomial

Binomial(n, p) is the number of successes in a sequence of n independent yes/no experiments, each of which yields success with probability p.

let chart_binomial =
    let d = Binomial(20, 0.7)
    discrete_chart d

Negative binomial distribution

NegativeBinomial(mean, r) is the number of successes before a given number of failures r in a sequence of yes/no experiments, each of which yields success with probability p = mean/(mean+r).

let chart_negative_binomial =
    let d = NegativeBinomial(5.7, 7.5)
    discrete_chart d
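
The role of the mean parameter can be made explicit with the usual probability mass function for k successes before r failures. This is a textbook form, assumed here to match the package's parameterization.

```latex
P(k) = \frac{\Gamma(k + r)}{k!\,\Gamma(r)}\, p^{k} (1 - p)^{r}, \qquad k = 0, 1, 2, \dots

\mathbb{E}[k] = \frac{r\,p}{1 - p}
             = r \cdot \frac{\text{mean}/(\text{mean} + r)}{r/(\text{mean} + r)}
             = \text{mean}
```

For the example above, mean = 5.7 and r = 7.5 give p = 5.7/13.2 ≈ 0.432.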

Poisson

Poisson(mean) is the number of events occurring in a fixed interval of time if these events occur with a known average rate equal to mean.

let chart_poisson =
    let d = Poisson(5.7)
    discrete_chart d

Mixture

Mixture([w1,d1; w2,d2; ...]), where w1, w2, ... are weights (real numbers) and d1, d2, ... are distributions. The list must not be empty and the sum of the weights must be equal to one.

let chart_mixture =
    let d = Mixture([0.9,Normal(37.,9.); 0.1,Uniform(15.,20.)])
    chart d 10. 64.
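
The requirement that the weights sum to one follows from the standard definition of a mixture density as the weighted sum of its component densities. The sketch below assumes that definition and is self-contained: normalPdf, uniformPdf and mixturePdf are hypothetical helpers written for this illustration and do not call Angara.Statistics.

```fsharp
// Hypothetical stand-alone densities for illustration only.
let normalPdf (mean: float) (sd: float) (x: float) =
    let z = (x - mean) / sd
    exp (-0.5 * z * z) / (sd * sqrt (2.0 * System.Math.PI))

let uniformPdf (lower: float) (upper: float) (x: float) =
    if x >= lower && x < upper then 1.0 / (upper - lower) else 0.0

// Density of a mixture: the weighted sum of the component densities.
let mixturePdf (components: (float * (float -> float)) list) (x: float) =
    components |> List.sumBy (fun (w, pdf) -> w * pdf x)

// Same weights and components as in the example above.
let components = [ 0.9, normalPdf 37.0 9.0; 0.1, uniformPdf 15.0 20.0 ]

// The weights must sum to one for the mixture to integrate to one.
let totalWeight = components |> List.sumBy fst
printfn "total weight: %f" totalWeight
printfn "density at 17.5: %f" (mixturePdf components 17.5)
```

Inside [15, 20) the uniform component adds 0.1 * 0.2 = 0.02 to the density, which appears as a small step on the left flank of the normal curve.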