Data Smoothing - Maple Help

Data Smoothing

The Statistics package provides several functions for data smoothing, the process of extracting identifiable patterns from data while suppressing noise. These include algorithms that produce smoothed data (MovingAverage, MovingStatistic, ExponentialSmoothing) and algorithms that produce an estimated curve approximating the distribution of the population (i.e., kernel density estimation).

1 Data Filters

The Statistics package includes several data filters for smoothing noisy data: moving average, moving median, moving statistic, weighted moving average, a general linear filter, and exponential smoothing.

1.1 Stock Prices

This example demonstrates the use of data filters in analyzing stock prices.

Consider the following function that generates a sample stock path over N time periods. The stock has initial price S0, trend parameter r, and fluctuation parameter sigma.

StockPath := proc (N::posint, S0::realcons, r::realcons, sigma::realcons)
    local h, C, R, S;
    h := 1./(N-1);                                            # time step
    C := evalf(exp(r*h-(1/2)*sigma^2*h));                     # deterministic growth per step
    R := C*exp(sigma*sqrt(h)*RandomVariable(Normal(0, 1)));   # random factor per step
    S := Sample(R, N+1);
    S[1] := S0;                                               # start the path at S0
    return CumulativeProduct(S)
end proc
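Read directly from the procedure, each step multiplies the previous price by an independent lognormal factor. Writing $P_k$ for the k-th entry of the returned path and $Z_k$ for independent standard normal draws (both symbols are notation introduced here, not names from the code), with $h = 1/(N-1)$:

```latex
P_1 = S_0, \qquad
P_{k+1} = P_k \, \exp\!\left( \left(r - \tfrac{1}{2}\sigma^2\right) h + \sigma \sqrt{h}\, Z_k \right),
\qquad k = 1, \dots, N .
```

This has the same form as a geometric Brownian motion sampled at spacing $h$, which is why r acts as a trend and sigma as a fluctuation parameter.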

Generate a sample stock path over 500 time periods and plot.

The data smoothing functions provided in the Statistics package give us a means to analyze the overall trend of the data while disregarding small fluctuations. Consider the moving average function, which calculates the average value of a window around each data point.
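To make the windowed average concrete, here is a minimal hand-written sketch in Maple. The procedure name SimpleMovingAverage is hypothetical, and the choice to average only full windows of m consecutive values is an assumption made for illustration; the package's own MovingAverage command may place the window and treat the endpoints differently.

```maple
# Hypothetical helper, not part of the Statistics package:
# average every full window of m consecutive values of X.
SimpleMovingAverage := proc(X::list(numeric), m::posint)
    local j, k;
    return [seq(add(X[j], j = k .. k + m - 1)/m, k = 1 .. nops(X) - m + 1)];
end proc:

SimpleMovingAverage([1, 2, 3, 4, 5], 3);   # returns [2, 3, 4]
```

For a generated stock path S, the corresponding library call would be of the form MovingAverage(S, m) after with(Statistics); a larger window m gives a smoother but less responsive curve.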

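Exponential smoothing can also be applied. Each smoothed value is a weighted average of the current observation and the previous smoothed value, so the influence of older observations decays geometrically. This damps the rough edges generally caused by cyclic or irregular patterns in the data.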
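The standard single exponential smoothing recursion can be sketched by hand as follows. The procedure name ExpSmooth and the initialization with the first observation are assumptions made for this illustration; consult the ExponentialSmoothing help page for the exact convention the package uses.

```maple
# Hypothetical helper illustrating the recursion
#   s[1] = X[1],  s[k] = lambda*X[k] + (1 - lambda)*s[k-1]
ExpSmooth := proc(X::list(numeric), lambda::numeric)
    local s, k;
    s := Array(1 .. nops(X));
    s[1] := X[1];
    for k from 2 to nops(X) do
        s[k] := lambda*X[k] + (1 - lambda)*s[k-1];
    end do;
    return convert(s, list);
end proc:

ExpSmooth([1, 2, 3], 1/2);   # returns [1, 3/2, 9/4]
```

A smoothing weight lambda close to 1 tracks the data closely, while a value close to 0 gives a heavily smoothed curve. With the Statistics package loaded, the library call takes the form ExponentialSmoothing(S, lambda).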

1.2 Department Store Sales

This example demonstrates the use of data filters in analyzing sales at a department store.

Consider the following function that randomly generates the times of N sales at a department store. The rate of sales is represented by the parameter r and the deviation in this rate by the parameter theta.

SaleTimes := proc (N::posint, r::realcons, theta::realcons)
    local R, S;
    R := r*RandomVariable(Exponential(theta));    # scaled exponential gap between sales
    S := Sample(R, N);                            # N gaps
    return CumulativeSum(S)                       # running total gives the sale times
end proc
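In formula form, the procedure returns running sums of scaled exponential waiting times. Writing $T_k$ for the time of the k-th sale and $E_j$ for independent draws from Exponential(theta) as Maple defines it (both symbols are notation introduced here):

```latex
T_k = r \sum_{j=1}^{k} E_j , \qquad k = 1, \dots, N .
```

The sale times are therefore increasing, and the gaps between consecutive sales are independent and identically distributed.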

Consider the first 100 sales with rate parameter 0.5 and deviation parameter 0.2.

The overall trend is readily apparent with the application of the moving average filter.

2 Kernel Density Estimation

The Statistics package provides algorithms for computing, plotting and sampling from kernel density estimates. A kernel density estimate is a continuous probability distribution used to approximate the distribution of the population from which a sample was drawn. It is constructed as a normalized sum of kernel functions, one centered at each data point.
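In the usual textbook notation (the symbols here are introduced for illustration and do not come from the package), for data points $x_1, \dots, x_n$, a kernel function $K$ that integrates to 1, and a bandwidth $h > 0$, the estimate is:

```latex
\hat{f}_h(x) = \frac{1}{n h} \sum_{i=1}^{n} K\!\left( \frac{x - x_i}{h} \right)
```

The factor $1/(nh)$ is the normalization: each scaled kernel contributes area $1/n$, so the estimate integrates to 1.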

The following is an example of Maple's kernel density estimation routines in action.

Consider the following bimodal data sample (hypothesized as bimodal since there appear to be two distinct clusterings of data - those in the range -1.2 to -0.8 and those in the range 0.7 to 0.9).

By applying kernel density estimation, we can construct a smooth density function that approximates the distribution of the data. Since our data sample is relatively small, we can perform exact kernel density estimation, which returns a probability density function that can then be evaluated at specific points.
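As an illustration of what an exact estimate computes, the following hand-written Maple sketch builds a Gaussian kernel density estimate and returns it as a function that can be evaluated at any point. The name GaussKDE, the two-point sample, and the bandwidth 0.5 are all hypothetical choices made for this sketch; it is not the package's KernelDensity implementation.

```maple
# Hypothetical helper: exact Gaussian kernel density estimate.
# Returns a procedure x -> (1/(n*h)) * sum of Gaussian kernels at (x - a)/h.
GaussKDE := proc(A::list(numeric), h::numeric)
    local n;
    n := nops(A);
    return proc(x)
        local a;
        add(exp(-((x - a)/h)^2/2), a in A)/(n*h*sqrt(2*Pi));
    end proc;
end proc:

A := [-1.0, 1.0]:          # hypothetical two-point sample
f := GaussKDE(A, 0.5):     # bandwidth 0.5 chosen arbitrarily
evalf(f(0.0));             # approximately 0.108
```

Because every data point contributes a term to every evaluation, the cost of this exact form grows with the sample size, which is why it suits small samples.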


We can convert the kernel density estimate to a distribution using one of the standard RandomVariable constructors.


This probability density function can also be plotted, in this case together with the cumulative distribution function.

P2 := plot(CDF(R, x), x = -2.5 .. 2.5, thickness = 3, color = blue)

With the KernelDensitySample function, new data points resembling an existing data sample can be quickly drawn from its kernel density estimate.

A kernel density estimate can be plotted directly using the KernelDensityPlot function. The following example demonstrates the effect of different choices of bandwidth.

P3 := KernelDensityPlot(A, bandwidth = .6, kernel = biweight, method = exact, color = navy, thickness = 2, range = -2 .. 2)

In most cases, only a few hundred samples are needed to roughly approximate the original probability distribution with a kernel density estimate.

Available Kernels

Kernel density estimation requires the use of a kernel function, a normalized function that is centered at each data point. Five standard kernel functions are available with kernel density estimation.
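For reference, the five kernels are commonly written as follows, where $u = (x - x_i)/h$ is the scaled distance from a data point. These are the usual textbook normalizations, stated here as an assumption; the package may scale its kernels differently.

```latex
\begin{aligned}
\text{Gaussian:}     \quad & K(u) = \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2}, && u \in \mathbb{R} \\
\text{Triangular:}   \quad & K(u) = 1 - |u|, && |u| \le 1 \\
\text{Rectangular:}  \quad & K(u) = \tfrac{1}{2}, && |u| \le 1 \\
\text{Biweight:}     \quad & K(u) = \tfrac{15}{16}\,(1 - u^2)^2, && |u| \le 1 \\
\text{Epanechnikov:} \quad & K(u) = \tfrac{3}{4}\,(1 - u^2), && |u| \le 1
\end{aligned}
```

Each of the four bounded kernels is zero for $|u| > 1$, and every kernel integrates to 1.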

2.1 Gaussian Kernel

The Gaussian kernel should be used with continuous data that is defined on the whole real line. It possesses the familiar bell shape and is based on the Gaussian probability density function.

2.2 Triangular Kernel

The triangular kernel is a piecewise function related to the triangular distribution. This kernel generally creates a kernel density estimate that has sharp edges but remains relatively smooth.

2.3 Rectangular Kernel

The rectangular kernel is a piecewise function related to the uniform distribution. This kernel creates a kernel density estimate that resembles a staircase function.

2.4 Biweight Kernel

The biweight kernel is a smooth kernel that, unlike the Gaussian kernel, is defined on a finite interval. It should be used for bounded data that is smooth along the interval on which it is defined.

2.5 Epanechnikov Kernel

The Epanechnikov kernel is the standard kernel for kernel density estimation. It generally provides the closest match to a probability density function under most circumstances. The kernel itself is a rounded function similar to the biweight, except it is not differentiable at its boundaries.