 Statistics - Maple Programming Help

Statistics

 Variance
 compute the variance

Calling Sequence

 Variance(A, ds_options)
 Variance(X, rv_options)

Parameters

 A - data set or Matrix data set
 X - algebraic; random variable or distribution
 ds_options - (optional) equation(s) of the form option=value where option is one of ignore or weights; specify options for computing the variance of a data set
 rv_options - (optional) equation of the form numeric=value; specifies options for computing the variance of a random variable

Description

 • The Variance function computes the variance of the specified random variable or the sample variance of the specified data set. In the data set case, the following (unbiased) estimate for the variance is used:

$\frac{\sum _{i=1}^{N}{\left({A}_{i}-\mathrm{Mean}\left(A\right)\right)}^{2}}{N-1}$

 where N is the number of elements in the data set A.
 • The first parameter can be a data set, a distribution (see Statistics[Distribution]), a random variable, or an algebraic expression involving random variables (see Statistics[RandomVariable]).
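
As a cross-check outside Maple, the unbiased estimate given above for the data set case can be written out directly. The following is a minimal Python sketch, not Maple's implementation; the function name sample_variance and the sample data are assumptions made for illustration.

```python
def sample_variance(data):
    """Unbiased sample variance: sum((a - mean)^2) / (N - 1)."""
    n = len(data)
    if n < 2:
        raise ValueError("at least two data points are required")
    mean = sum(data) / n
    return sum((a - mean) ** 2 for a in data) / (n - 1)


# Mean is 3.5; squared deviations sum to 21; 21 / (4 - 1) = 7.
print(sample_variance([1.0, 2.0, 4.0, 7.0]))  # 7.0
```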

Computation

 • By default, all computations involving random variables are performed symbolically (see option numeric below).
 • All computations involving data are performed in floating-point; therefore, all data provided must have type/realcons and all returned solutions are floating-point, even if the problem is specified with exact values.

Data Set Options

 The ds_options argument can contain one or more of the options shown below. More information for some options is available in the Statistics[DescriptiveStatistics] help page.
 • ignore=truefalse -- This option controls how missing data is handled by the Variance command. Missing items are represented by undefined or Float(undefined). So, if ignore=false and A contains missing data, the Variance command will return undefined. If ignore=true all missing items in A will be ignored. The default value is false.
 • weights=Vector -- Data weights. The number of elements in the weights array must be equal to the number of elements in the original data sample. By default all elements in A are assigned weight $1$.
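
The effect of the two options can be sketched outside Maple. The Python function below is hypothetical and is not Maple's implementation. In particular, this page does not state the weighted estimator: the normalization by V1 - V2/V1 (V1 the sum of the weights, V2 the sum of their squares) is an assumption, chosen because it reduces to N - 1 for unit weights and reproduces the value shown for the weighted data set in the Examples section. Dropping the weight of an ignored item along with the item is likewise an assumption.

```python
import math


def variance(data, weights=None, ignore=False):
    """Sketch of a weighted sample variance with missing-value handling.

    Missing items are represented by NaN (standing in for Maple's undefined).
    """
    if weights is None:
        weights = [1.0] * len(data)  # default: every element has weight 1
    if len(weights) != len(data):
        raise ValueError("weights and data must have the same length")

    pairs = list(zip(data, weights))
    if any(math.isnan(x) for x, _ in pairs):
        if not ignore:
            return math.nan  # ignore=false: missing data propagates
        # ignore=true: drop missing items together with their weights
        pairs = [(x, w) for x, w in pairs if not math.isnan(x)]

    v1 = sum(w for _, w in pairs)
    v2 = sum(w * w for _, w in pairs)
    mean = sum(w * x for x, w in pairs) / v1
    ss = sum(w * (x - mean) ** 2 for x, w in pairs)
    # Assumed normalization; equals N - 1 when all weights are 1.
    return ss / (v1 - v2 / v1)


# Data and weights from the weighted example in the Examples section.
V = [float(i) for i in range(57, 78)] + [math.nan]
W = [2, 4, 14, 41, 83, 169, 394, 669, 990, 1223, 1329,
     1230, 1063, 646, 392, 202, 79, 32, 16, 5, 2, 5]

print(variance(V, W))               # nan
print(variance(V, W, ignore=True))  # approximately 7.4388
```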

Random Variable Options

 The rv_options argument can contain one or more of the options shown below. More information for some options is available in the Statistics[RandomVariables] help page.
 • numeric=truefalse -- By default, the variance is computed using exact arithmetic. To compute the variance numerically, specify the numeric or numeric = true option.

Examples

 > $\mathrm{with}\left(\mathrm{Statistics}\right):$

Compute the variance of the beta distribution with parameters $p$ and $q$.

 > $\mathrm{Variance}\left('\mathrm{Beta}'\left(p,q\right)\right)$
 $\frac{{p}{}{q}}{{\left({p}{+}{q}\right)}^{{2}}{}\left({p}{+}{q}{+}{1}\right)}$ (1)

Use numeric parameters.

 > $\mathrm{Variance}\left('\mathrm{Beta}'\left(3,5\right)\right)$
 $\frac{{5}}{{192}}$ (2)
 > $\mathrm{Variance}\left('\mathrm{Beta}'\left(3,5\right),\mathrm{numeric}\right)$
 ${0.02604166667}$ (3)
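
The exact result (2) and its numeric counterpart (3) can be checked against the closed form (1) with $p=3$ and $q=5$. This is a minimal Python sketch using exact rational arithmetic; the helper name beta_variance is an assumption made for illustration.

```python
from fractions import Fraction


def beta_variance(p, q):
    """Closed-form variance of Beta(p, q): p*q / ((p + q)^2 * (p + q + 1))."""
    p, q = Fraction(p), Fraction(q)
    return p * q / ((p + q) ** 2 * (p + q + 1))


exact = beta_variance(3, 5)
print(exact)         # 5/192
print(float(exact))  # approximately 0.0260417
```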

Generate a random sample of size 100000 drawn from the above distribution and compute the sample variance.

 > $A≔\mathrm{Sample}\left('\mathrm{Beta}'\left(3,5\right),{10}^{5}\right):$
 > $\mathrm{Variance}\left(A\right)$
 ${0.0262326253685661}$ (4)

Compute the standard error of the sample variance for the normal distribution with parameters 5 and 2.

 > $X≔\mathrm{RandomVariable}\left(\mathrm{Normal}\left(5,2\right)\right):$
 > $B≔\mathrm{Sample}\left(X,{10}^{6}\right):$
 > $\left[\mathrm{Variance}\left(X\right),\mathrm{StandardError}\left[{10}^{6}\right]\left(\mathrm{Variance},X\right)\right]$
 $\left[{4}{,}\frac{\sqrt{{2}}}{{250}}\right]$ (5)
 > $\mathrm{Variance}\left(B\right)$
 ${3.99901171779336}$ (6)
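
The second entry of (5) agrees with the usual large-sample approximation for the standard error of the sample variance. This is an observation about the numbers, not a statement of how Maple derives the result. Writing $\mu_4$ for the fourth central moment, and using $\mu_4 = 3\sigma^4$ for a normal distribution with $\sigma = 2$ and $N = 10^6$:

$\sqrt{\frac{\mu_4-\sigma^4}{N}}=\sqrt{\frac{2\sigma^4}{N}}=\sqrt{\frac{32}{10^6}}=\frac{4\sqrt{2}}{1000}=\frac{\sqrt{2}}{250}\approx 0.00566$

The sample value (6) differs from the exact variance 4 by about 0.00099, which is well within one standard error.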

Create a beta-distributed random variable $Y$ and compute the variance of $\frac{1}{Y+2}$.

 > $Y≔\mathrm{RandomVariable}\left('\mathrm{Beta}'\left(5,2\right)\right):$
 > $\mathrm{Variance}\left(\frac{1}{Y+2}\right)$
 ${-}\frac{{1356439}}{{4}}{+}{4147200}{}{\mathrm{ln}}{}\left({3}\right){}{\mathrm{ln}}{}\left({2}\right){-}{2073600}{}{{\mathrm{ln}}{}\left({2}\right)}^{{2}}{-}{1677120}{}{\mathrm{ln}}{}\left({2}\right){-}{2073600}{}{{\mathrm{ln}}{}\left({3}\right)}^{{2}}{+}{1677120}{}{\mathrm{ln}}{}\left({3}\right)$ (7)
 > $\mathrm{Variance}\left(\frac{1}{Y+2},\mathrm{numeric}\right)$
 ${0.0005174968134}$ (8)

Verify this using simulation.

 > $C≔\mathrm{Sample}\left(\frac{1}{Y+2},{10}^{5}\right):$
 > $\mathrm{Variance}\left(C\right)$
 ${0.000518013288062160}$ (9)
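
The same check can be repeated outside Maple. The Python sketch below evaluates expression (7) in floating point and runs an independent simulation; the function names, the seed, and the use of random.betavariate as the Beta(5, 2) sampler are assumptions made for illustration, so the simulated value will not match (9) digit for digit.

```python
import math
import random


def exact_variance():
    """Evaluate expression (7) in floating point."""
    ln2, ln3 = math.log(2), math.log(3)
    return (-1356439 / 4 + 4147200 * ln3 * ln2 - 2073600 * ln2 ** 2
            - 1677120 * ln2 - 2073600 * ln3 ** 2 + 1677120 * ln3)


def simulated_variance(n=10 ** 5, seed=0):
    """Unbiased sample variance of 1/(Y + 2) for Y ~ Beta(5, 2)."""
    rng = random.Random(seed)
    xs = [1.0 / (rng.betavariate(5, 2) + 2.0) for _ in range(n)]
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


print(exact_variance())      # approximately 0.0005175
print(simulated_variance())  # close to the exact value; depends on the seed
```

Expression (7) subtracts terms of order $10^6$ to leave a result of order $10^{-4}$, so a double-precision evaluation loses several significant digits to cancellation.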

Compute the variance of a weighted data set that contains a missing value (undefined).

 > $V≔⟨\mathrm{seq}\left(i,i=57..77\right),\mathrm{undefined}⟩:$
 > $W≔⟨2,4,14,41,83,169,394,669,990,1223,1329,1230,1063,646,392,202,79,32,16,5,2,5⟩:$
 > $\mathrm{Variance}\left(V,\mathrm{weights}=W\right)$
 ${Float}{}\left({\mathrm{undefined}}\right)$ (10)
 > $\mathrm{Variance}\left(V,\mathrm{weights}=W,\mathrm{ignore}=\mathrm{true}\right)$
 ${7.43882748489699}$ (11)

Consider the following Matrix data set.

 > $M≔\mathrm{Matrix}\left(\left[\left[3,1130,114694\right],\left[4,1527,127368\right],\left[3,907,88464\right],\left[2,878,96484\right],\left[4,995,128007\right]\right]\right)$
 ${M}{≔}\left[\begin{array}{ccc}{3}& {1130}& {114694}\\ {4}& {1527}& {127368}\\ {3}& {907}& {88464}\\ {2}& {878}& {96484}\\ {4}& {995}& {128007}\end{array}\right]$ (12)

We compute the variance of each of the columns.

 > $\mathrm{Variance}\left(M\right)$
 $\left[\begin{array}{ccc}{0.700000000000000}& {69998.3000000000}& {3.22345150800000}{}{{10}}^{{8}}\end{array}\right]$ (13)
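
The column-wise result (13) can be reproduced outside Maple by applying an unbiased sample variance to each column. This is a minimal Python sketch using the standard library's statistics.variance, which also divides by N - 1; the variable name column_variances is an assumption made for illustration.

```python
from statistics import variance

M = [
    [3, 1130, 114694],
    [4, 1527, 127368],
    [3, 907, 88464],
    [2, 878, 96484],
    [4, 995, 128007],
]

# zip(*M) transposes the rows into columns.
column_variances = [variance(column) for column in zip(*M)]
print(column_variances)  # approximately [0.7, 69998.3, 322345150.8]
```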

Compatibility

 • The A parameter was updated in Maple 16.