octave-doc-3.2.3-3mdv2010.0.i586.rpm

<html lang="en">
<head>
<title>Descriptive Statistics - Untitled</title>
<meta http-equiv="Content-Type" content="text/html">
<meta name="description" content="Untitled">
<meta name="generator" content="makeinfo 4.13">
<link title="Top" rel="start" href="index.html#Top">
<link rel="up" href="Statistics.html#Statistics" title="Statistics">
<link rel="next" href="Basic-Statistical-Functions.html#Basic-Statistical-Functions" title="Basic Statistical Functions">
<link href="http://www.gnu.org/software/texinfo/" rel="generator-home" title="Texinfo Homepage">
<meta http-equiv="Content-Style-Type" content="text/css">
<style type="text/css"><!--
  pre.display { font-family:inherit }
  pre.format  { font-family:inherit }
  pre.smalldisplay { font-family:inherit; font-size:smaller }
  pre.smallformat  { font-family:inherit; font-size:smaller }
  pre.smallexample { font-size:smaller }
  pre.smalllisp    { font-size:smaller }
  span.sc    { font-variant:small-caps }
  span.roman { font-family:serif; font-weight:normal; } 
  span.sansserif { font-family:sans-serif; font-weight:normal; } 
--></style>
</head>
<body>
<div class="node">
<a name="Descriptive-Statistics"></a>
<p>
Next:&nbsp;<a rel="next" accesskey="n" href="Basic-Statistical-Functions.html#Basic-Statistical-Functions">Basic Statistical Functions</a>,
Up:&nbsp;<a rel="up" accesskey="u" href="Statistics.html#Statistics">Statistics</a>
<hr>
</div>

<h3 class="section">25.1 Descriptive Statistics</h3>

<p>Octave can compute various statistics such as the moments of a data set.

<!-- ./statistics/base/mean.m -->
   <p><a name="doc_002dmean"></a>

<div class="defun">
&mdash; Function File:  <b>mean</b> (<var>x, dim, opt</var>)<var><a name="index-mean-1819"></a></var><br>
<blockquote><p>If <var>x</var> is a vector, compute the mean of the elements of <var>x</var>

     <pre class="example">          mean (x) = SUM_i x(i) / N
</pre>
        <p>If <var>x</var> is a matrix, compute the mean for each column and return them
in a row vector.

        <p>With the optional argument <var>opt</var>, the kind of mean computed can be
selected.  The following options are recognized:

          <dl>
<dt><code>"a"</code><dd>Compute the (ordinary) arithmetic mean.  This is the default.

          <br><dt><code>"g"</code><dd>Compute the geometric mean.

          <br><dt><code>"h"</code><dd>Compute the harmonic mean. 
</dl>

        <p>If the optional argument <var>dim</var> is supplied, work along dimension
<var>dim</var>.

        <p>Both <var>dim</var> and <var>opt</var> are optional.  If both are supplied,
either may appear first. 
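
        <p>As a small illustration of the calling forms described above, the
following sketch uses made-up sample data.  The variable names are arbitrary,
and the <code>"g"</code> option is used as documented in this section.

     <pre class="example">          x = [1 2 4; 3 6 8];
          m_col = mean (x)              # column means: 2 4 6
          m_row = mean (x, 2)           # row means: 7/3 and 17/3
          m_geo = mean ([1 2 4], "g")   # geometric mean: (1*2*4)^(1/3) = 2
</pre>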
</p></blockquote></div>

<!-- ./statistics/base/median.m -->
   <p><a name="doc_002dmedian"></a>

<div class="defun">
&mdash; Function File:  <b>median</b> (<var>x, dim</var>)<var><a name="index-median-1820"></a></var><br>
<blockquote><p>If <var>x</var> is a vector, compute the median value of the elements of
<var>x</var>.  If the elements of <var>x</var> are sorted, the median is defined
as

     <pre class="example">                      x(ceil(N/2)),             N odd
          median(x) =
                      (x(N/2) + x((N/2)+1))/2,  N even
</pre>
        <p>If <var>x</var> is a matrix, compute the median value for each
column and return them in a row vector.  If the optional <var>dim</var>
argument is given, operate along this dimension. 
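
        <p>Both cases of the definition can be checked on small, made-up inputs;
the following is only a sketch with arbitrary variable names.

     <pre class="example">          m_odd  = median ([3 1 2])             # sorted: 1 2 3, middle element is 2
          m_even = median ([4 1 3 2])           # sorted: 1 2 3 4, (2 + 3) / 2 = 2.5
          m_col  = median ([1 2; 3 4; 5 9])     # column medians: 3 4
          m_row  = median ([1 2; 3 4; 5 9], 2)  # row medians: 1.5, 3.5, 7
</pre>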
<!-- Texinfo @sp should work but in practice produces ugly results for HTML. -->
<!-- A simple blank line produces the correct behavior. -->
<!-- @sp 1 -->

     <p class="noindent"><strong>See also:</strong> <a href="doc_002dstd.html#doc_002dstd">std</a>, <a href="doc_002dmean.html#doc_002dmean">mean</a>. 
</p></blockquote></div>

<!-- ./statistics/base/quantile.m -->
   <p><a name="doc_002dquantile"></a>

<div class="defun">
&mdash; Function File: <var>q</var> = <b>quantile</b> (<var>x, p</var>)<var><a name="index-quantile-1821"></a></var><br>
&mdash; Function File: <var>q</var> = <b>quantile</b> (<var>x, p, dim</var>)<var><a name="index-quantile-1822"></a></var><br>
&mdash; Function File: <var>q</var> = <b>quantile</b> (<var>x, p, dim, method</var>)<var><a name="index-quantile-1823"></a></var><br>
<blockquote><p>For a sample, <var>x</var>, calculate the quantiles, <var>q</var>, corresponding to
the cumulative probability values in <var>p</var>.  All non-numeric values (NaNs) of
<var>x</var> are ignored.

        <p>If <var>x</var> is a matrix, compute the quantiles for each column and
return them in a matrix, such that the i-th row of <var>q</var> contains
the <var>p</var>(i)th quantiles of each column of <var>x</var>.

        <p>The optional argument <var>dim</var> determines the dimension along which
the quantiles are calculated.  If <var>dim</var> is omitted, and <var>x</var> is
a vector or matrix, it defaults to 1 (column-wise quantiles).  If
<var>x</var> is an N-d array, <var>dim</var> defaults to the first
dimension whose size is greater than one.

        <p>The methods available to calculate sample quantiles are the nine methods
used by R (http://www.r-project.org/).  The default is <var>method</var> = 5.

        <p>Discontinuous sample quantile methods 1, 2, and 3:

          <ol type=1 start=1>
<li>Method 1: Inverse of empirical distribution function. 
<li>Method 2: Similar to method 1 but with averaging at discontinuities. 
<li>Method 3: SAS definition: nearest even order statistic.
             </ol>

        <p>Continuous sample quantile methods 4 through 9, where p(k) is the linear
interpolation function respecting each method's representative cdf:

          <ol type=1 start=4>
<li>Method 4: p(k) = k / n. That is, linear interpolation of the empirical cdf. 
<li>Method 5: p(k) = (k - 0.5) / n. That is, a piecewise linear function where
the knots are the values midway through the steps of the empirical cdf. 
<li>Method 6: p(k) = k / (n + 1). 
<li>Method 7: p(k) = (k - 1) / (n - 1). 
<li>Method 8: p(k) = (k - 1/3) / (n + 1/3).  The resulting quantile estimates
are approximately median-unbiased regardless of the distribution of <var>x</var>. 
<li>Method 9: p(k) = (k - 3/8) / (n + 1/4).  The resulting quantile estimates
are approximately unbiased for the expected order statistics if <var>x</var> is
normally distributed.
             </ol>
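
        <p>The continuous methods can be reproduced by hand with linear
interpolation between the knots p(k).  The following sketch does this for
methods 5 and 7 on a made-up sorted sample; the variable names are arbitrary,
and the comparison only applies to probabilities that lie inside the range of
the knots.

     <pre class="example">          x = [1; 2; 3; 4];          # sorted sample, n = 4
          n = numel (x);
          k = (1:n)';
          p = [0.25; 0.5; 0.75];

          # Method 5 knots: p(k) = (k - 0.5) / n
          q5 = interp1 ((k - 0.5) / n, x, p)        # 1.5  2.5  3.5
          # Method 7 knots: p(k) = (k - 1) / (n - 1)
          q7 = interp1 ((k - 1) / (n - 1), x, p)    # 1.75  2.5  3.25

          # quantile is expected to agree with the hand computation
          quantile (x, p, 1, 5)
          quantile (x, p, 1, 7)
</pre>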

        <p>Hyndman and Fan (1996) recommend method 8.  Maxima, S, and R
(versions prior to 2.0.0) use 7 as their default.  Minitab and SPSS
use method 6.  <span class="sc">matlab</span> uses method 5.

        <p>References:

          <ul>
<li>Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New
S Language.  Wadsworth &amp; Brooks/Cole.

          <li>Hyndman, R. J. and Fan, Y. (1996) Sample quantiles in
statistical packages, American Statistician, 50, 361&ndash;365.

          <li>R: A Language and Environment for Statistical Computing;
<a href="http://cran.r-project.org/doc/manuals/fullrefman.pdf">http://cran.r-project.org/doc/manuals/fullrefman.pdf</a>. 
</ul>
        </p></blockquote></div>

<!-- ./statistics/base/prctile.m -->
   <p><a name="doc_002dprctile"></a>

<div class="defun">
&mdash; Function File: <var>y</var> = <b>prctile</b> (<var>x, p</var>)<var><a name="index-prctile-1824"></a></var><br>
&mdash; Function File: <var>y</var> = <b>prctile</b> (<var>x, p, dim</var>)<var><a name="index-prctile-1825"></a></var><br>
<blockquote><p>For a sample <var>x</var>, compute the quantiles, <var>y</var>, corresponding
to the cumulative probability values, <var>p</var>, in percent.  All non-numeric
values (NaNs) of <var>x</var> are ignored.

        <p>If <var>x</var> is a matrix, compute the percentiles for each column and
return them in a matrix, such that the i-th row of <var>y</var> contains the
<var>p</var>(i)th percentiles of each column of <var>x</var>.

        <p>The optional argument <var>dim</var> determines the dimension along which
the percentiles are calculated.  If <var>dim</var> is omitted, and <var>x</var> is
a vector or matrix, it defaults to 1 (column-wise percentiles).  If
<var>x</var> is an N-d array, <var>dim</var> defaults to the first
dimension whose size is greater than one.
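
        <p>A brief sketch of the relation to <code>quantile</code>, using a made-up
sample: the same probabilities are passed once in percent and once as
fractions.  The values in the comments assume the default quantile method 5.

     <pre class="example">          x = [1; 2; 3; 4];
          y = prctile (x, [25; 50; 75])         # 1.5  2.5  3.5
          q = quantile (x, [0.25; 0.5; 0.75])   # same values as y
</pre>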

        </blockquote></div>

<!-- ./statistics/base/meansq.m -->
   <p><a name="doc_002dmeansq"></a>

<div class="defun">
&mdash; Function File:  <b>meansq</b> (<var>x</var>)<var><a name="index-meansq-1826"></a></var><br>
&mdash; Function File:  <b>meansq</b> (<var>x, dim</var>)<var><a name="index-meansq-1827"></a></var><br>
<blockquote><p>For vector arguments, return the mean square of the values. 
For matrix arguments, return a row vector containing the mean square
of each column.  With the optional <var>dim</var> argument, return the
mean square of the values along this dimension. 
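
        <p>For example, with a made-up matrix (a sketch; variable names are
arbitrary):

     <pre class="example">          x = [1 2; 3 4];
          ms_col = meansq (x)            # columns: (1 + 9) / 2 = 5, (4 + 16) / 2 = 10
          ms_row = meansq (x, 2)         # rows: (1 + 4) / 2 = 2.5, (9 + 16) / 2 = 12.5
          ms_chk = sumsq (x) / rows (x)  # same as meansq (x)
</pre>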
</p></blockquote></div>

<!-- ./statistics/base/std.m -->
   <p><a name="doc_002dstd"></a>

<div class="defun">
&mdash; Function File:  <b>std</b> (<var>x</var>)<var><a name="index-std-1828"></a></var><br>
&mdash; Function File:  <b>std</b> (<var>x, opt</var>)<var><a name="index-std-1829"></a></var><br>
&mdash; Function File:  <b>std</b> (<var>x, opt, dim</var>)<var><a name="index-std-1830"></a></var><br>
<blockquote><p>If <var>x</var> is a vector, compute the standard deviation of the elements
of <var>x</var>, where N is the number of elements.

     <pre class="example">          std (x) = sqrt (sumsq (x - mean (x)) / (N - 1))
</pre>
        <p>If <var>x</var> is a matrix, compute the standard deviation for
each column and return them in a row vector.

        <p>The argument <var>opt</var> determines the type of normalization to use.  Valid values
are

          <dl>
<dt>0:<dd>  normalizes with N-1; this provides the square root of the best unbiased
  estimator of the variance [default]
<br><dt>1:<dd>  normalizes with N; this provides the square root of the second moment around
  the mean
</dl>

        <p>The third argument <var>dim</var> determines the dimension along which the standard
deviation is calculated. 
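
        <p>The effect of <var>opt</var> can be seen on a small, made-up sample whose
squared deviations from the mean sum to 32 (a sketch; variable names are
arbitrary):

     <pre class="example">          x = [2 4 4 4 5 5 7 9];   # mean is 5, sum of squared deviations is 32
          s0 = std (x)             # normalizes with N-1: sqrt (32 / 7), about 2.1381
          s1 = std (x, 1)          # normalizes with N:   sqrt (32 / 8) = 2
</pre>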
<!-- Texinfo @sp should work but in practice produces ugly results for HTML. -->
<!-- A simple blank line produces the correct behavior. -->
<!-- @sp 1 -->

     <p class="noindent"><strong>See also:</strong> <a href="doc_002dmean.html#doc_002dmean">mean</a>, <a href="doc_002dmedian.html#doc_002dmedian">median</a>. 
</p></blockquote></div>

<!-- ./statistics/base/var.m -->
   <p><a name="doc_002dvar"></a>

<div class="defun">
&mdash; Function File:  <b>var</b> (<var>x</var>)<var><a name="index-var-1831"></a></var><br>
<blockquote><p>For vector arguments, return the (real) variance of the values. 
For matrix arguments, return a row vector containing the variance for
each column.

        <p>The optional second argument <var>opt</var> determines the type of normalization to use. 
Valid values are

          <dl>
<dt>0:<dd>Normalizes with N-1; this provides the best unbiased estimator of the
variance [default]. 
<br><dt>1:<dd>Normalizes with N; this provides the second moment around the mean. 
</dl>

        <p>The third argument <var>dim</var> determines the dimension along which the
variance is calculated. 
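
        <p>A short sketch with made-up data, showing both normalizations and the
<var>dim</var> argument (variable names are arbitrary):

     <pre class="example">          x = [2 4 4 4 5 5 7 9];          # sum of squared deviations is 32
          v0 = var (x)                    # 32 / 7, about 4.5714
          v1 = var (x, 1)                 # 32 / 8 = 4
          vc = var ([1 2; 3 6], 0, 1)     # column variances: 2 8
          vr = var ([1 2; 3 6], 0, 2)     # row variances: 0.5, 4.5
</pre>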
</p></blockquote></div>

<!-- ./statistics/base/mode.m -->
   <p><a name="doc_002dmode"></a>

<div class="defun">
&mdash; Function File: [<var>m</var>, <var>f</var>, <var>c</var>] = <b>mode</b> (<var>x, dim</var>)<var><a name="index-mode-1832"></a></var><br>
<blockquote><p>Compute the most frequently occurring value.  <code>mode</code> counts the
frequency along the first non-singleton dimension and, if two or more
values share the highest frequency, returns the smallest of them in
<var>m</var>.  The dimension along which to count can be specified by the
<var>dim</var> parameter.

        <p>The variable <var>f</var> holds the number of occurrences of the most frequently
occurring elements.  The cell array <var>c</var> contains all of the elements
with the maximum frequency. 
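
        <p>For instance, in the following made-up sample the values 2 and 3 are
tied for the highest frequency (a sketch; variable names are arbitrary):

     <pre class="example">          x = [1 2 2 3 3 5];
          [m, f, c] = mode (x)
          # m is 2, the smallest of the tied values
          # f is 2, since each tied value occurs twice
          # c{1} holds both tied values, 2 and 3
</pre>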
</p></blockquote></div>

<!-- ./statistics/base/cov.m -->
   <p><a name="doc_002dcov"></a>

<div class="defun">
&mdash; Function File:  <b>cov</b> (<var>x, y</var>)<var><a name="index-cov-1833"></a></var><br>
<blockquote><p>Compute covariance.

        <p>If each row of <var>x</var> and <var>y</var> is an observation and each column is
a variable, the (<var>i</var>, <var>j</var>)-th entry of
<code>cov (</code><var>x</var><code>, </code><var>y</var><code>)</code> is the covariance between the <var>i</var>-th
variable in <var>x</var> and the <var>j</var>-th variable in <var>y</var>. 
If called with one argument, compute <code>cov (</code><var>x</var><code>, </code><var>x</var><code>)</code>. 
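
        <p>The one-argument form can be compared with a hand computation.  The
sketch below uses made-up data and assumes the usual normalization by N-1,
which this section does not state; the variable names are arbitrary.

     <pre class="example">          x = [1 2; 2 4; 3 9];                      # 3 observations of 2 variables
          xc = x - ones (rows (x), 1) * mean (x);   # subtract each column mean
          c_manual = xc' * xc / (rows (x) - 1)      # [1 3.5; 3.5 13]
          c = cov (x)                               # expected to agree with c_manual
</pre>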
</p></blockquote></div>

<!-- ./statistics/base/cor.m -->
   <p><a name="doc_002dcor"></a>

<div class="defun">
&mdash; Function File:  <b>cor</b> (<var>x, y</var>)<var><a name="index-cor-1834"></a></var><br>
<blockquote><p>Compute correlation.

        <p>The (<var>i</var>, <var>j</var>)-th entry of <code>cor (</code><var>x</var><code>, </code><var>y</var><code>)</code> is
the correlation between the <var>i</var>-th variable in <var>x</var> and the
<var>j</var>-th variable in <var>y</var>.

     <pre class="example">          cor (x, y) = cov (x, y) / (std (x) * std (y))
</pre>
        <p>For matrices, each row is an observation and each column a variable;
vectors are always observations and may be row or column vectors.

        <p><code>cor (</code><var>x</var><code>)</code> is equivalent to <code>cor (</code><var>x</var><code>, </code><var>x</var><code>)</code>.

        <p>Note that the <code>corrcoef</code> function does the same as <code>cor</code>. 
</p></blockquote></div>

<!-- ./statistics/base/corrcoef.m -->
   <p><a name="doc_002dcorrcoef"></a>

<div class="defun">
&mdash; Function File:  <b>corrcoef</b> (<var>x, y</var>)<var><a name="index-corrcoef-1835"></a></var><br>
<blockquote><p>Compute correlation.

        <p>If each row of <var>x</var> and <var>y</var> is an observation and each column is
a variable, the (<var>i</var>, <var>j</var>)-th entry of
<code>corrcoef (</code><var>x</var><code>, </code><var>y</var><code>)</code> is the correlation between the
<var>i</var>-th variable in <var>x</var> and the <var>j</var>-th variable in <var>y</var>.

     <pre class="example">          corrcoef(x,y) = cov(x,y)/(std(x)*std(y))
</pre>
        <p>If called with one argument, compute <code>corrcoef (</code><var>x</var><code>, </code><var>x</var><code>)</code>. 
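
        <p>For two single variables, the definition above can be evaluated
directly.  The sketch below uses made-up data and assumes that the covariance
is normalized by N-1, matching the default of <code>std</code>; the variable
names are arbitrary.

     <pre class="example">          x = [1; 2; 3; 4];
          y = [2; 4; 5; 9];
          n = numel (x);
          cxy = sum ((x - mean (x)) .* (y - mean (y))) / (n - 1);   # 11 / 3
          r = cxy / (std (x) * std (y))       # 11 / sqrt (130), about 0.9648
</pre>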
</p></blockquote></div>

<!-- ./statistics/base/kurtosis.m -->
   <p><a name="doc_002dkurtosis"></a>

<div class="defun">
&mdash; Function File:  <b>kurtosis</b> (<var>x, dim</var>)<var><a name="index-kurtosis-1836"></a></var><br>
<blockquote><p>If <var>x</var> is a vector of length N, return the kurtosis

     <pre class="example">          kurtosis (x) = N^(-1) std(x)^(-4) sum ((x - mean(x)).^4) - 3
</pre>
        <p class="noindent">of <var>x</var>.  If <var>x</var> is a matrix, return the kurtosis over the
first non-singleton dimension.  The optional argument <var>dim</var>
can be given to force the kurtosis to be computed along that
dimension. 
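
        <p>The formula above can be evaluated step by step.  The following sketch
uses made-up data and arbitrary variable names; <code>std</code> is called with
its default normalization.

     <pre class="example">          x = [1 2 3 4 10];          # mean is 4
          N = numel (x);
          m4 = sum ((x - mean (x)) .^ 4);       # 81 + 16 + 1 + 0 + 1296 = 1394
          k = m4 / (N * std (x) ^ 4) - 3        # 1394 / 781.25 - 3, about -1.2157
</pre>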
</p></blockquote></div>

<!-- ./statistics/base/skewness.m -->
   <p><a name="doc_002dskewness"></a>

<div class="defun">
&mdash; Function File:  <b>skewness</b> (<var>x, dim</var>)<var><a name="index-skewness-1837"></a></var><br>
<blockquote><p>If <var>x</var> is a vector of length N, return the skewness

     <pre class="example">          skewness (x) = N^(-1) std(x)^(-3) sum ((x - mean(x)).^3)
</pre>
        <p class="noindent">of <var>x</var>.  If <var>x</var> is a matrix, return the skewness along the
first non-singleton dimension of the matrix.  If the optional
<var>dim</var> argument is given, operate along this dimension. 
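
        <p>As with the kurtosis, the formula can be evaluated directly.  A sketch
with made-up data and arbitrary variable names, using the default
normalization of <code>std</code>:

     <pre class="example">          x = [1 2 3 4 10];          # mean is 4
          N = numel (x);
          m3 = sum ((x - mean (x)) .^ 3);       # -27 - 8 - 1 + 0 + 216 = 180
          s = m3 / (N * std (x) ^ 3)            # about 0.8146
</pre>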
</p></blockquote></div>

<!-- ./statistics/base/statistics.m -->
   <p><a name="doc_002dstatistics"></a>

<div class="defun">
&mdash; Function File:  <b>statistics</b> (<var>x</var>)<var><a name="index-statistics-1838"></a></var><br>
<blockquote><p>If <var>x</var> is a matrix, return a matrix with the minimum, first
quartile, median, third quartile, maximum, mean, standard deviation,
skewness and kurtosis of the columns of <var>x</var> as its columns.

        <p>If <var>x</var> is a vector, calculate the statistics along the
non-singleton dimension. 
</p></blockquote></div>

<!-- ./statistics/base/moment.m -->
   <p><a name="doc_002dmoment"></a>

<div class="defun">
&mdash; Function File:  <b>moment</b> (<var>x, p, opt, dim</var>)<var><a name="index-moment-1839"></a></var><br>
<blockquote><p>If <var>x</var> is a vector, compute the <var>p</var>-th moment of <var>x</var>.

        <p>If <var>x</var> is a matrix, return the row vector containing the
<var>p</var>-th moment of each column.

        <p>With the optional string <var>opt</var>, the kind of moment to be computed can
be specified.  If <var>opt</var> contains <code>"c"</code> or <code>"a"</code>, central
and/or absolute moments are returned.  For example,

     <pre class="example">          moment (x, 3, "ac")
</pre>
        <p class="noindent">computes the third central absolute moment of <var>x</var>.

        <p>If the optional argument <var>dim</var> is supplied, work along dimension
<var>dim</var>. 
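
        <p>The different kinds of moment can be written out by hand.  The sketch
below uses made-up data and assumes that moments are normalized by the number
of elements, which this section does not state; the variable names are
arbitrary.

     <pre class="example">          x = [1 2 3 4 10];                       # mean is 4
          m2   = mean (x .^ 2)                    # raw second moment: 130 / 5 = 26
          m2c  = mean ((x - mean (x)) .^ 2)       # central second moment: 50 / 5 = 10
          m3ac = mean (abs (x - mean (x)) .^ 3)   # central absolute third moment: 252 / 5 = 50.4

          # Under the assumption above, these calls are expected to agree:
          moment (x, 2)
          moment (x, 2, "c")
          moment (x, 3, "ac")
</pre>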
</p></blockquote></div>

   </body></html>