create violin plots from data
ViolinPlot(X, Y, options, plotoptions)
X - data sample or list of data samples
Y - (optional) data to be plotted in conjunction with X. When Y is specified, the command draws half a violin plot for each individual sample in X and in Y. This can make it easier to compare the distributions directly.
options - (optional) equation(s) of the form option=value, where option is one of datasetlabels, offset, distance, width, mean, symbol, quantiles, interval, divider, orientation, color, filled, scale, method, range, kernel, bins, left, right, or bandwidth; specify options for generating the violin plot
plotoptions - options to be passed to the plots[display] command
The ViolinPlot command generates a violin plot for the specified data. A violin plot is a visualization of the distribution of data consisting of a rotated kernel density plot and markers for the quartiles and the mean.
The parameter X is either a single data sample (given, for example, as a Vector) or a list of data samples. The individual samples may differ in size.
The optional parameter Y is either a single data sample (given, for example, as a Vector) or a list of data samples. The individual samples may differ in size, but the number of samples in Y must equal the number of samples specified in X.
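As a minimal sketch of the two-argument form described above, the following builds three X samples and three Y samples of differing sizes and plots them as split violins. The distributions, sample sizes, and the names X, Y, and P are illustrative assumptions, not taken from this help page.

```maple
with(Statistics):
# Three X samples of sizes 50, 60, 70 and three Y samples of size 30
# (all values chosen arbitrarily for illustration).
X := [seq(Sample(Normal(i, 1), 40 + 10*i), i = 1 .. 3)]:
Y := [seq(Sample(Normal(i + 1/2, 2), 30), i = 1 .. 3)]:
# X and Y must contain the same number of samples; each violin shows
# one X sample on one half and the matching Y sample on the other.
P := ViolinPlot(X, Y):
P;
```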
The options argument can contain one or more of the options shown below. All unrecognized options will be passed to the plots[display] command. See plot/options for details.
datasetlabels=default or list
Data set labels for the individual violin plots. The labels appear along the axes. By default, the labels are set to 1, 2, 3, etc.
The offset option sets the initial offset along the x-axis. The default value is 0.
Note: By default, the view wraps tightly around all visible plot objects and the horizontal axis is marked by data set labels, not regular coordinates, so this option will have no (visual) effect. It is meant for the case where this plot is combined with other plot elements.
The distance option controls the distance between adjacent violin plots. The default value is 0.25.
The width option controls the width of the violin plots. The default value is 0.75.
The following plot illustrates how the options offset, distance, and width are interpreted.
Note that the lengths of the arrows labeled "offset", "width", and "distance" correspond to the values of the offset, width, and distance options respectively.
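A short sketch of how the three layout options might be set together. The data and the particular numeric values are assumptions chosen only to make the spacing visibly different from the defaults.

```maple
with(Statistics):
# Three samples with increasing spread (illustrative data).
data := [seq(Sample(Normal(0, i), 50), i = 1 .. 3)]:
# Start the first violin 1 unit from the origin, make each violin
# 0.5 wide, and leave a gap of 0.5 between neighbours.
P := ViolinPlot(data, offset = 1, distance = 0.5, width = 0.5):
P;
```

As noted above, a nonzero offset is only visible once the result is combined with other plot elements, for example through plots[display].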
mean=true or false
If this option is set to true then the mean is included in the plot. The default value is true.
symbol=name or list
This option specifies the symbol type for the points representing the mean. By default, the symbol type is diamond. When two data sets, X and Y, are given, you can specify one symbol type or a list of two symbol types. A list of two names specifies the symbol types for the points corresponding to the means of X and Y respectively.
quantiles=list of lists
This option can be used to mark specific quantiles. Quantiles are represented by horizontal lines of specified length. Each sublist contains two elements: the first specifies the quantile to be marked and the second specifies the length of the horizontal line. The default value is [[3/4, 0.5], [1/2, 0.75], [1/4, 0.5]].
The interval option controls the amount of space inserted between the two halves of a violin plot. The default value is 0.
divider=true or false
If divider is set to true then a vertical line is drawn to separate the halves of a violin plot. By default divider=false in which case no such line is drawn.
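The quantiles, interval, and divider options can be combined on a split violin plot. The sketch below marks the 10%, 50%, and 90% quantiles instead of the default quartiles; the data, the chosen quantiles, and the line lengths are illustrative assumptions.

```maple
with(Statistics):
X := [seq(Sample(Normal(i, 1), 80), i = 1 .. 2)]:
Y := [seq(Sample(Normal(i, 2), 80), i = 1 .. 2)]:
# Each sublist is [quantile, length of the marker line].
q := [[9/10, 0.5], [1/2, 0.75], [1/10, 0.5]]:
# Pull the two halves slightly apart and draw a separating line.
P := ViolinPlot(X, Y, quantiles = q, interval = 0.1, divider = true):
P;
```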
orientation=horizontal or vertical
Indicate the orientation of the violin plots. The default is vertical. The option descriptions in this help page assume the orientation is set to vertical as well.
color=name, list, or range
This option specifies colors for the individual data sets. If a range of colors is given, the colors are generated by selecting an appropriate number of equally spaced points in the corresponding hue range. For a list of colors, the behavior depends on whether the optional data Y is specified. In the case where Y is specified, the list can have at most two colors where the halves corresponding to the first data are colored using the first value in the list and the halves corresponding to the optional data are colored using the second value. Otherwise each of the violin plots is colored with the corresponding color in the list.
filled=true or false
If the filled option is set to true, the area inside each of the violin plots is filled with a solid color. The color value of a filled area is set to the color of the adjacent curve. The default value is set to true.
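A possible combination of the appearance options described above, assuming two data sets so that the two-element color and symbol lists apply to the X halves and the Y halves respectively. The color names and data are illustrative choices.

```maple
with(Statistics):
X := [seq(Sample(Normal(i, 1), 50), i = 1 .. 3)]:
Y := [seq(Sample(Normal(i, 2), 50), i = 1 .. 3)]:
# First color/symbol applies to the X halves, second to the Y halves.
P := ViolinPlot(X, Y,
        orientation = horizontal,
        color = ["Blue", "Orange"],
        symbol = [diamond, circle],
        filled = false):
P;
```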
The scale option specifies the method used to scale the width of the violin plots. It can be specified in the following ways:
scale=width(scalelistX, scalelistY) or scale=width
In the case where the scaling method is specified as width or width(scalelistX, scalelistY), the violin plots are scaled so that their widths are proportional to the values specified in scalelistX and optionally scalelistY, where the value 1 represents the width that makes the violin plot fill the bounding box defined by the width option exactly.
In all cases, scalelistX and scalelistY are specified as lists of numeric values.
The default value for a scalelist is a list of all ones.
Note that scale=width is the default.
scale=area(scaletype, scalelistX, scalelistY) or scale=area
When the scaling method is specified as area, by default the violin plots are scaled such that their areas are proportional to the values specified in scalelistX and optionally scalelistY, and the maximal width among all of them is given by the width option. If scaletype is setwise, the proportional comparison is made only within the X data set and within the Y data set. If scaletype is pairwise, the proportional comparison is made within each pair of (X, Y) data sets plotted next to each other. See the Examples section for a specific example using this option.
scale=count(scaletype) or scale=count
When the scaling method is specified as count, by default the violin plots are scaled such that the area of each violin plot is proportional to the number of observations in the associated data sample, and the maximal width among all of them is given by the width option. If scaletype=setwise, the proportional comparison is made only within the X data set and within the Y data set. If scaletype=pairwise, the proportional comparison is made within each pair of (X, Y) data sets plotted next to each other.
If Y data is specified but only one list of scales is provided, then that list is used to scale both sets of violin plots. However, this generates an error if scaletype is set to pairwise.
Note that the number of elements in scalelistX and scalelistY, when specified, must be at least the number of elements in X and Y respectively.
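The scaling forms above can be sketched as follows. The sample sizes are deliberately unequal so that scale=count has a visible effect, and the weights passed to scale=area are arbitrary illustrative values; the names P1 and P2 are likewise assumptions.

```maple
with(Statistics):
# Samples with deliberately different sizes (illustrative data).
X := [Sample(Normal(0, 1), 20), Sample(Normal(1, 1), 200)]:
Y := [Sample(Normal(0, 2), 100), Sample(Normal(1, 2), 50)]:
# Area of each half proportional to its number of observations,
# compared within each (X, Y) pair.
P1 := ViolinPlot(X, Y, scale = count(pairwise)):
# Areas proportional to the supplied weights, compared within the
# X data set and within the Y data set separately.
P2 := ViolinPlot(X, Y, scale = area(setwise, [1, 2], [2, 1])):
P1;
P2;
```

Two scale lists are supplied in the area call; each has as many entries as there are samples in the corresponding data set.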
method=exact or piecewise
This parameter specifies the method of plotting the kernel density estimate (by default this is piecewise). For more information, see Statistics[KernelDensity].
range=deduce or realrange
This option specifies the vertical range of the violin plot. The default value is deduce.
kernel=gaussian, biweight, epanechnikov, triangular, or rectangular
The default value is gaussian. This option allows a non-Gaussian kernel to be used in developing the estimate. For more information, see Statistics[KernelDensity].
The bins option specifies the number of bins in which to categorize data points (128 by default). This value must be a power of 2 and is equal to the size of the array returned by KernelDensity when the option method=piecewise is specified. This option is ignored if method=exact.
The left option specifies the lower boundary on valid data values. Any data values that are smaller than this value are discarded. By default, this procedure imposes boundary conditions consistent with the range option.
The right option specifies the upper boundary on valid data values. Any data values that are larger than this value are discarded. By default, this procedure imposes boundary conditions consistent with the range option.
The bandwidth option is a positive quantity that specifies the width of the kernel (the amount each data point affects distant portions of the probability density estimate). Each kernel is scaled such that the bandwidth is equal to the standard deviation of the kernel.
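The density-estimation options can be sketched together as below. The kernel choice, the bandwidth of 0.4, and the bin count of 256 are illustrative assumptions; the only constraints taken from the descriptions above are that bins is a power of 2 and that bandwidth is positive.

```maple
with(Statistics):
S := Sample(Normal(0, 1), 200):
# Use a non-Gaussian kernel with an explicit bandwidth; bins applies
# because method = piecewise (it would be ignored for method = exact).
P := ViolinPlot(S, kernel = epanechnikov, bandwidth = 0.4,
                method = piecewise, bins = 256):
P;
```

A smaller bandwidth generally produces a more jagged outline and a larger one a smoother outline, so it can be worth comparing a few values on the same sample.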
Note that the labels for the data sets are placed on the axes and should not be confused with coordinates.
A := Array([-1., -0.4, -0.2, 0., 0., 0.1, 0.2, 0.7, 0.9]):
The commands to create the plot from the Plotting Guide are
N := RandomVariable(Normal(0, 1)):
S := Sample(N, 16):
C := [seq(Sample(Normal(ln(i), 3), 60), i = 1 .. 20)]:
F := [seq(Sample(Normal(sin(i)*Pi, 3), 120), i = 1 .. 20)]:
The ViolinPlot command also accepts a Matrix. The columns are understood as individual data samples.
R := [seq(Sample(Normal(ln(i), 3), 10), i = 1 .. 3)]:
M := Matrix(R, scan = columns)^%T:
Plot options such as title are passed to the plots:-display command:
ViolinPlot(M, color = "Niagara Blue", title = "Violin Plots")
The Statistics[ViolinPlot] command was introduced in Maple 2017.
For more information on Maple 2017 changes, see Updates in Maple 2017.