The Statistics package provides two chi-square tests of goodness of fit. The ChiSquareGoodnessOfFitTest function determines whether an observed (empirical) data set is consistent with the expected values for that data set. Similarly, the ChiSquareSuitableModelTest function tests how well a given probability distribution approximates a data sample.
Consider the following number of sales made on each day of the week at a jewelry store, tallied over one sales week (Monday to Saturday).
We wish to test the hypothesis that sales are uniformly distributed throughout the week. Under this hypothesis, the expected number of sales on each day is the total number of sales for the week divided by the number of days (six).
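As a minimal sketch of this step, the expected counts under the uniform hypothesis can be computed as follows. Python is used here purely for illustration, and the daily counts are hypothetical placeholder values, not the store's actual figures.

```python
# Hypothetical daily sales counts, Monday to Saturday (illustrative values only).
observed = [25, 17, 15, 23, 24, 16]

# Under the uniform hypothesis every day has the same expected count:
# the weekly total divided by the number of days.
total_sales = sum(observed)
expected = [total_sales / len(observed)] * len(observed)

print(expected)
```

Note that the expected counts sum to the same total as the observed counts, which is what the chi-square goodness-of-fit comparison assumes.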
We now test the hypothesis (using ChiSquareGoodnessOfFitTest) that the observed number of sales per day is consistent with a uniformly distributed number of sales each day.
Chi-Square Test for Goodness-of-Fit
-----------------------------------
Null Hypothesis:
Observed sample does not differ from expected sample
Alt. Hypothesis:
Observed sample differs from expected sample
Categories: 6
Distribution: ChiSquare(5)
Computed statistic: 5
Computed pvalue: 0.41588
Critical value: 11.07049741
Result: [Accepted]
There is no statistical evidence against the null hypothesis
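Since the computed p-value (0.41588) is well above 0.05, and equivalently the statistic (5) is below the critical value (11.07049741), the null hypothesis is not rejected. Hence we conclude that a uniformly distributed number of sales is a reasonable claim.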
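The quantities in the report above can be reproduced by hand. The following is a sketch in Python rather than a description of how ChiSquareGoodnessOfFitTest works internally. The function names are our own, the observed counts are hypothetical values chosen so that the statistic equals 5, and the tail probability uses the standard closed-form expressions for a chi-square distribution with integer degrees of freedom.

```python
import math

def chi_square_statistic(observed, expected):
    """Pearson statistic: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_square_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction series.
    term, total = 1.0, 0.0
    for k in range((df - 1) // 2):
        if k > 0:
            term *= x / (2 * k + 1)
        total += term
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * total)

# Hypothetical counts (illustrative only), with the uniform expectation.
observed = [25, 17, 15, 23, 24, 16]
expected = [sum(observed) / len(observed)] * len(observed)

statistic = chi_square_statistic(observed, expected)
df = len(observed) - 1          # categories - 1, matching ChiSquare(5) in the report
p_value = chi_square_sf(statistic, df)

print(f"statistic = {statistic:.4f}, df = {df}, p-value = {p_value:.5f}")
print("reject null at 0.05" if p_value < 0.05 else "do not reject null at 0.05")
```

With a statistic of 5 on 5 degrees of freedom, this tail probability agrees with the p-value shown in the report.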
Now consider a data set of the times during a day at which sales are made, and determine whether sales are uniformly distributed over the day. We assume an 8-hour working day, with each sale time recorded as the number of hours into the day, between 0.0 and 8.0. The data in this case are continuous, and we are testing against a uniform probability distribution.
Apply the chi-square suitable model test (using ChiSquareSuitableModelTest) to determine whether a uniform distribution closely matches the provided data.
Chi-Square Test for Suitable Probability Model
----------------------------------------------
Null Hypothesis:
Sample was drawn from specified probability distribution
Alt. Hypothesis:
Sample was not drawn from specified probability distribution
Bins: 4
Distribution: ChiSquare(3)
Computed statistic: 9.5191
Computed pvalue: 0.023129
Critical value: 7.814728288
Result: [Rejected]
There exists statistical evidence against the null hypothesis
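Since the computed p-value (0.023129) is below 0.05, and equivalently the statistic (9.5191) exceeds the critical value (7.814728288), the null hypothesis is rejected. Hence we conclude that the sale times are not uniformly distributed throughout the day. Closer examination of the data reveals that most of the sales were made roughly halfway through the day.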
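The idea behind testing continuous data against a model can be sketched by binning the sample and comparing bin counts with the counts the model predicts. The Python example below is only an illustration under stated assumptions: the sale times are hypothetical, the four equal-width bins on [0, 8] are our choice (for a uniform model they are also equal-probability bins), and we do not claim that ChiSquareSuitableModelTest bins the data in exactly this way.

```python
import math

def chi_square_sf_df3(x):
    """Upper-tail probability for a chi-square variable with 3 degrees of freedom."""
    if x <= 0:
        return 1.0
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))

# Hypothetical sale times in hours into an 8-hour day (illustrative only).
times = [0.7, 2.1, 2.4, 2.9, 3.1, 3.3, 3.5, 3.6, 3.8, 4.0,
         4.2, 4.4, 4.5, 4.7, 4.9, 5.2, 5.6, 5.9, 6.5, 7.4]

num_bins = 4
day_length = 8.0
width = day_length / num_bins

# Count the observations in each bin; the right endpoint 8.0 goes in the last bin.
counts = [0] * num_bins
for t in times:
    index = min(int(t / width), num_bins - 1)
    counts[index] += 1

# A Uniform(0, 8) model puts the same probability mass in every equal-width bin.
expected_per_bin = len(times) / num_bins

statistic = sum((c - expected_per_bin) ** 2 / expected_per_bin for c in counts)
p_value = chi_square_sf_df3(statistic)   # bins - 1 = 3 degrees of freedom

print("bin counts:", counts)
print(f"statistic = {statistic:.4f}, p-value = {p_value:.5f}")
print("reject null at 0.05" if p_value < 0.05 else "do not reject null at 0.05")

# The same tail formula applied to the statistic reported above (9.5191).
reported_p = chi_square_sf_df3(9.5191)
print(f"p-value for the reported statistic: {reported_p:.6f}")
```

As in the report, sale times clustered around the middle of the day inflate the counts in the central bins, which drives the statistic above the critical value for 3 degrees of freedom.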