Histograms of statistical data
stats[statplots, histogram](data, arg=value, ...)
statplots[histogram](data, arg=value, ...)
histogram(data, arg=value, ...)
area=x or area=count, where x is a number or a descriptive statistics item
numbars=y, where y is an integer greater than zero
Important: The stats package has been deprecated. Use the superseding package Statistics instead.
The function histogram of the subpackage stats[statplots] plots a bar graph for one or more statistical lists.
The command histogram(data1) will tally the data in data1 into bars of equal area. Classes (ranges) contained in data1 are plotted as boxes proportional to their weight.
When more than one statistical list is specified, as in histogram(data1, data2, ...), a 3-D plot is created. One histogram is created for each data set, and they are plotted on the same graph. The data1 plot is closest to the viewer, and each subsequent data set is behind it. The i-th data set is plotted along the plane z=i, with the z-axis pointing away from the viewer.
By default, a probability histogram will be produced: the total area of the bars will be 1. This can be changed by using the area parameter; area=10 will give a histogram with the bars having a total area of 10. To make the total area of the bars equal to the total weight of the data, use area=count.
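The effect of the area parameter can be sketched outside Maple. The following Python snippet is an illustration only, not the stats implementation: the helper name bar_heights and the (lo, hi, weight) tuple representation are assumptions made for this example. It scales bar heights so that the total bar area equals a requested value, using the class data from data1 in the examples.

```python
def bar_heights(classes, area=1.0):
    """Return one bar height per class so that the total bar area equals `area`.

    `classes` is a list of (lo, hi, weight) tuples (a hypothetical
    representation chosen for this sketch).
    """
    total_weight = sum(w for _, _, w in classes)
    # Each bar's area is its share of the total weight times the target area;
    # its height is that area divided by the class width.
    return [area * w / total_weight / (hi - lo) for lo, hi, w in classes]


classes = [(1, 3, 5), (3, 5, 10), (5, 7, 8)]
prob = bar_heights(classes)             # probability histogram: total area 1
count = bar_heights(classes, area=23)   # analogue of area=count: total weight is 23
```

With a target area equal to the total weight, each bar's area is simply the weight of its class.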
The parameter numbars=y specifies how many divisions the data should be broken into. If the data are spread uniformly, numbars=y should produce a histogram with y columns; less uniform data may result in empty columns. When this parameter is not specified, a representative default value is chosen.
By default, bars will be of equal area and not necessarily equal width. When using the area=... parameter, the default changes so that the bars are always of equal width.
The function stats[transform,split] can be used to generate histograms in which each bar has the same area.
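As a rough illustration of what equal-area bars mean, the Python sketch below splits sorted data into groups of equal size and gives each group a bar of area 1/nbars. It is not how stats[transform,split] is implemented; the function name equal_area_bars, the placement of boundaries halfway between neighbouring groups, and the requirement that the data length be a multiple of nbars are all simplifying assumptions.

```python
def equal_area_bars(data, nbars):
    """Split sorted data into `nbars` groups of equal size and return
    (lo, hi, height) per bar, each bar having area 1/nbars.

    Assumes len(data) is a multiple of nbars and that the values on
    either side of each group boundary are distinct.
    """
    xs = sorted(data)
    size = len(xs) // nbars
    # Interior boundaries sit halfway between the last point of one group
    # and the first point of the next; outer boundaries are the extremes.
    edges = [xs[0]]
    for k in range(1, nbars):
        edges.append((xs[k * size - 1] + xs[k * size]) / 2)
    edges.append(xs[-1])
    return [(lo, hi, (1 / nbars) / (hi - lo)) for lo, hi in zip(edges, edges[1:])]


bars = equal_area_bars([1, 2, 3, 4, 6, 8, 12, 20], nbars=4)
```

Where the data are sparse the bars become wide and low, and where the data are dense they become narrow and tall, so the area of every bar stays the same.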
Another choice is to use classes that have the same width. The function stats[transform, tallyinto] can be used to collect point data into class data. It is usually a good idea to group data into between 5 and 20 classes. The boundaries of the classes should not include actual data points.
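The idea of collecting point data into equal-width classes can be sketched in Python as follows. This is a conceptual stand-in, not the behaviour of stats[transform, tallyinto]: the name tally_into, the half-open convention lo <= x < hi, and the use of None for missing values are assumptions for the example.

```python
def tally_into(data, lo, hi, nclasses):
    """Count points into `nclasses` equal-width classes covering [lo, hi).

    Returns a list of (class_lo, class_hi, count). Points outside [lo, hi)
    are skipped, and missing values (None) are ignored.
    """
    width = (hi - lo) / nclasses
    counts = [0] * nclasses
    for x in data:
        if x is None or not (lo <= x < hi):
            continue
        # Guard against floating-point rounding pushing the index to nclasses.
        index = min(int((x - lo) // width), nclasses - 1)
        counts[index] += 1
    return [(lo + i * width, lo + (i + 1) * width, c) for i, c in enumerate(counts)]


points = [0.5, 1.2, 1.7, 2.4, 2.6, 3.1, None, 4.8]
tallied = tally_into(points, lo=0, hi=5, nclasses=5)
```

The boundaries 0, 1, ..., 5 were chosen so that no data point falls on one, in line with the advice above.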
One disadvantage of using histograms for the display of data is that their appearance can change drastically depending on the positions of the boundaries of the classes. This is somewhat lessened if the class marks (mid-points of the classes) are chosen to be actual data items. For example, the class 1..3 would be acceptable if the data contains the item 2, but would be rejected if the data does not contain the item 2.
Missing data are ignored.
The command with(stats[statplots]) allows the use of the abbreviated form of this command.
data1 := [Weight(1..3, 5), Weight(3..5, 10), Weight(5..7, 8)]:
Some randomly generated numbers:
Here is a more advanced example. The list data2 was generated by a standard normal random number generator. We want to plot its histogram together with the theoretical density.
data2 := [-1.96, -0.814, 1.86, 1.96, 0.519, 0.739, -0.0540, 0.702, 0.663, 0.591, 0.580, 0.475, 0.589, -1.33, 0.0420, -0.460, -0.482, 1.58, 0.778, 0.530, -0.507, -0.233, -0.195, 0.193, -0.136]:
default: probability histogram, equal area regions
p := histogram(data2, color=yellow):
q := plot(stats[statevalf, pdf, normald], -3..3, color=red):
probability histogram with equal width regions, with the theoretical density overlaid
p := histogram(data2, color=gray, area=1):
data2classified := stats[transform, scaleweight[1/nops(data2)]](stats[transform, tallyinto](data2, [-3..-1, -1..1, 1..3])):
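To see what the tally-then-scale step produces for this data, the following Python sketch repeats the arithmetic by hand. It is an illustration under stated assumptions, not Maple output: classes are treated as half-open (lo <= x < hi), which does not matter here because no value of data2 lies on a boundary, and normal_cdf is a helper defined only for this example.

```python
import math

data2 = [-1.96, -0.814, 1.86, 1.96, 0.519, 0.739, -0.0540, 0.702, 0.663,
         0.591, 0.580, 0.475, 0.589, -1.33, 0.0420, -0.460, -0.482, 1.58,
         0.778, 0.530, -0.507, -0.233, -0.195, 0.193, -0.136]
classes = [(-3, -1), (-1, 1), (1, 3)]

# Tally the points into the three classes, then scale each count by 1/n
# so that the class weights sum to 1.
n = len(data2)
scaled = [sum(lo <= x < hi for x in data2) / n for lo, hi in classes]


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


# Probability that a standard normal variate falls in each class.
expected = [normal_cdf(hi) - normal_cdf(lo) for lo, hi in classes]
```

The scaled weights are the observed class proportions, which can be compared directly with the theoretical class probabilities in expected.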