Interpolation and Smoothing - Maple Help

Home : Support : Online Help : System : Information : Updates : Maple 16 : Interpolation and Smoothing

Interpolated Plotting and Smoothing in Maple 16

The topic of this worksheet is interpolation and smoothing of given 2-dimensional and 3-dimensional data. The organization is by dimension, task, and regularity of the data.

One common task is to generate only a plot of the data, as a curve or surface which passes through or approximates the data points.

Another task is to generate a procedure or piecewise spline expression which approximates the data and which can be queried for a value at any individual point nearby the original data points.

Another choice is whether to produce a curve or surface which passes through the given data points (and so interpolates them) or which simply approximates them (by smoothing). Is the data known to contain noise or measurement error? If the dependent data is expected to contain error or noise then it is reasonable to fit a smoothed surface or curve to the data, where the smoothed fit does not necessarily pass through or match the dependent data at the given data points. On the other hand, if the data is expected to be fully correct then it is reasonable to produce an interpolated curve or surface which passes through all the given data points.

A curve or surface may also be plotted directly from the data, or indirectly by first creating a procedure or (in the 2D case) a piecewise spline, and then subsequently supplying that to the plot or plot3d command. The constructed procedure would accept individual 1D or 2D independent data points and compute a dependent scalar value for that input.

Another relevant distinction for 3D data is whether the independent portion of the data lies on a regular grid in the 2D plane or whether it is comprised of points which are irregularly spaced in both directions.

This worksheet elaborates and compares these approaches for 2D and 3D plotting.

 > restart; with(LinearAlgebra): with(CurveFitting): with(plots): with(Statistics): randomize():

1-dimensional case

The x-data points are taken uniformly in the range, as the integers from 1 to 15. But they could just as well be unequally spaced, as long as they are ordered.

Here, the data2D is a 15x2 Matrix with the first column being the ordered x-data points and the second column being their y-values.

 > data2D:=< <($1..15)> | RandomVector(15,generator=0.0..5.0) >: Here is the plot command producing a linear interpolation. The goal is to produce a smooth curve instead.  > display( plot(data2D, color=black), pointplot(data2D, symbolsize=10, color=red) ); We produce a piecewise object of interpolating splines, and plot this expression.  > f := Spline(data2D, v): display(plot(f, v=1..15, color=black), pointplot(data2D, symbolsize=10, color=red) ); The above method of producing a symbolic expression of a piecewise spline is simplest. It allows the plotter to pick as many interpolating points as it wishes. The Spline command can be called with several options. But that is not very efficient for handling a large number of original data points, in which case the resulting piecewise expression can become unwieldy and the cost of its numerical evaluation at many points more noticeable. Let's look next at using a pre-made mesh of points at which to interpolate using the ArrayInterpolation command. We produce a fixed mesh, interpolate at those points, and plot those values. With 15 original data points, we'll create a finer mesh A1 of 150 points, with points laid uniformly 1/10 apart. This is both fast and uses less memory. But it might not produce the most attractive plots since the evaluation points are evenly spaced along the x-axis. For some curves the difference between this approach's results and that of the default adaptive 2D plotter would be noticeable.  > A1:=<(i/10$i=1..150)>: f:=ArrayInterpolation(data2D, A1, method=spline): g:=ArrayInterpolation(data2D, A1, method=cubic): drawfun := f_ ->     display(plot(, color=black),             pointplot(data2D, symbolsize=10, color=red)):
 > display(Array([drawfun(f), drawfun(g)]), view=[0..15, 0..5]);

Obtaining an interpolating procedure

The above methods use the pre-made Vector A1 of fixed, uniform interpolating points. It is also possible to allow the plot command to dynamically choose interpolating points, provided that command is supplied with a procedure that produces a single scalar numeric output for any given numeric input. The next example below demonstrates an operator F which acts in this way.

Note that while F does use the original data it does not require a set of interpolating points to be supplied up front. See the appendix for discussion of more efficient ways to implement this approach.

For 2-dimensional plotting Maple does adaptive plotting by default, where it chooses to evaluate a plotted function more densely in certain areas according to how it detects the function as changing. See the plot,options help page for more details on the adaptive option to the plot command.

 > F:=x->CurveFitting:-ArrayInterpolation(data2D,               Array(1..1, 1..1, [[x]]), method=spline)[1]: display( plot(F, 0..15, color=black),          pointplot(data2D, symbolsize=10, color=red), view=[0..15, 0..5] );

General use

Interpolating procedures such as F can also be used in other general calculations, as a scalar-valued function.

 > Optimization:-Minimize(t->F(t), 0..15, method=branchandbound);
 $\left[{-}{0.0459382095810952}{,}\left[\begin{array}{c}{8.30778366525865}\end{array}\right]\right]$ (1.1.1.1)
 > fsolve(t->F(t)-2, 0..15);
 ${1.808411009}$ (1.1.1.2)

2-dimensional case

Below we'll distinguish between data interpolation, which matches the given data points exactly, and data smoothing, which approximates noisy data.

Smoothing

In this section, we are assuming that we have a collection of 2-dimensional independent data for which the 1-dimensional dependent data has some error component or is known to be noisy. The presence of noise implies that whatever surface this data represents does not necessarily pass through the given dependent data points.

In this case, a surface is approximated by numerically smoothing the data using the lowess algorithm. For any given point to be plotted a window of close enough, surrounding data points will be used to compute a local, weighted, low order fit.

Scatterplot3D

The command ScatterPlot3D from the Statistics package provides a smoothed plot of the 2D noisy data. The independent data does not need to be regularly spaced, and is supplied as an n-by-3 Array or Matrix. Each of the n rows represents an individual point. The columns are the  x-, y-, and z-values.

Here is an example. First, we'll look down upon the projection of the data points onto the x-y plane, and thus visualize the layout and spacing of the 2D dependent data values.

 > X := Sample(Uniform(-50,50),175):
 > Y := Sample(Uniform(-50,50),175):
 > Zerror := Sample(Normal(0,100),175):
 > Z := Array(1..175,(i)->-(sin(Y[i]/20)*(X[i]-6)^2+(Y[i]-7)^2+Zerror[i])):
 > XYZ := Matrix([[X],[Y],[Z]],datatype=float[8])^%T;
 ${\mathrm{XYZ}}{:=}\left[\begin{array}{c}{\mathrm{175 x 3}}{\mathrm{Matrix}}\\ {\mathrm{Data Type:}}{{\mathrm{float}}}_{{8}}\\ {\mathrm{Storage:}}{\mathrm{rectangular}}\\ {\mathrm{Order:}}{\mathrm{Fortran_order}}\end{array}\right]$ (2.1.1.1)
 > ScatterPlot3D(XYZ, axes=box, orientation=[20,0,0]);
 > ScatterPlot3D(XYZ, lowess, grid=[25,25], axes=box, orientation=[20,70,0]);

Here is another example, which reads in the n-by3 data from a file.

 > M:=ImportMatrix(cat(kernelopts(mapledir),"/data/plotting/irregulardata3D.csv"),                 source=csv,datatype=float[8]):
 > Ppt := pointplot3d(M,axes=box,style=patchnogrid,symbolsize=15,color=red,                           labels=[x,y,z]):
 > display(Ppt,orientation=[90,0,0]);

The default options for the ScatterPlot3D command are to use an order 2 fitting quadratic form, and to use a window of width 1/3 of the original domain in each of the x- and y-directions.

On the right below is the smoothed surface, while on the left is a slightly transparent view of the same surface overlaid with a point-plot of the original data points.

 > Ploess := ScatterPlot3D( M, lowess ):
 > display(Array([          display([Ppt,Ploess]),          display(Ploess,style=patch,color=RGB(0.0,0.4,0.6))                       ]),view=700..860,axes=box,orientation=[54,78,-15],                          transparency=0.0);

Interpolation

In this section we are assuming that we have a collection of 2-dimensional independent data for which the 1-dimensional dependent data is accepted as correct. The goal is to produce a surface which must must pass through all of the data points. That is, at every x-y point of the independent data the height of the plotted surface must match the corresponding value of the dependent (z) data.

In this case, a surface is computed numerically by interpolating the data.

This falls into one of two distinct situations. The first case has all the independent x-y data forming a regular grid, and the second case consists of irregularly spaced independent data points in the x-y plane. In both cases, the surface may be plotted using the surfdata command.

Uniform data grid

In the case of a uniform 2-dimensional grid of data a surface plot can be generated using the surfdata command.

The data is comprised of a grid of x- and y-points. For the examples of this section, the data points are taken uniformly in each direction. The x-values are taken as as the integers from 1 to 7, and the y-values are taken as the integers from 1 to 9. (See the next sections, for non-uniform data points.)

View of data points in the x-y plane

Here is a view of the input data points, which in this example is a full grid, uniformly spaced in both the x and y directions.

 > xpts:=<($1..7)>: ypts:=<($1..9)>:
 > display( seq(plottools:-line([xpts[i],1],[xpts[i],9]), i=1..7),          seq(plottools:-line([1,ypts[j]],[7,ypts[j]]), j=1..9),          pointplot([seq(seq([xpts[i],ypts[j]],j=1..9),i=1..7)], color=red) );

The given data values corresponding to each data point are stored in data which is a 7x9 Array. A random collection of values is generated for this example.

 > data3D:=Array( LinearAlgebra:-RandomMatrix(7, 9, generator=0.2 .. 0.8), datatype=float[8]):

The first 3D plot below is the piecewise-planar surface produced by default. The goal is to produce a smoother surface instead, and this will be accomplished by using the gridsize and interpolation options of the surfdata command.

The plotted surface is being overlaid here with both the 3D point-plot as well as the (linear) patch-surface, so that the computed surface may be visually demonstrated to be authentic with respect to the original discrete data.

 > ptsplot:=PLOT3D( GRID(1..7, 1..9, data3D, STYLE(POINT) ) ): gridplot:=PLOT3D( GRID( 1..7, 1..9, data3D, STYLE(WIREFRAME)) ):

The following plot3d options control the general look of these plots, and are reused in this subsection. They are assigned to a single, reusable name so that the differences between methods is more clearly illustrated.

 > lookandfeel := axes=boxed, style=surface, glossiness=1.0, lightmodel=light1,                color=RGB(204/255, 84/255, 0/255), view=[1..7, 1..9, 0..1]:
 > display(    ptsplot,gridplot,    surfdata(data3D, 1..7, 1..9, lookandfeel)                 );

On the left below is the default surface produced by the surfdata command, without any additional interpolation.

On the right below is the surface produced by the surfdata command using the new interpolating optional keyword parameters gridsize and interpolation.

The gridsize parameter is an equation of the form gridsize=list, where the the list contains a pair of positive integers denoting the size of the uniform grid of points at which interpolation of the original data will be performed.

The interpolation parameter is an equation of the form interpolation=list, where the list contains options accepted by the CurveFitting:-ArrayInterpolation command.

 > display( Array([      surfdata( data3D, 1..7, 1..9, lookandfeel ),      surfdata( data3D, 1..7, 1..9,                gridsize=[28,36], interpolation=[method=spline],                lookandfeel)                  ]) );

 > display( Array([      display( ptsplot, gridplot,               surfdata( data3D, 1..7, 1..9, lookandfeel ) ),      display( ptsplot, gridplot,               surfdata( data3D, 1..7, 1..9,                         gridsize=[28,36], interpolation=[method=spline],                         lookandfeel) )                  ]) );

Note that the 3D plot renderer does its own small amount smoothing of the surface. Hence, even when using the purely linear method of the computational interpolation scheme, the plot on the right below shows a modest level of surface smoothing.

 > display( Array([      display( ptsplot, gridplot,               surfdata( data3D, 1..7, 1..9, lookandfeel ) ),      display( ptsplot, gridplot,               surfdata( data3D, 1..7, 1..9,                         gridsize=[28,36], interpolation=[method=linear],                         lookandfeel) )                  ]) );

The refined grid on which values are interpolated does not need to contain a whole integer multiple (in either direction) of the number points of the original grid of data points.

 > display( Array([      display( ptsplot, gridplot,               surfdata( data3D, 1..7, 1..9, lookandfeel ) ),      display( ptsplot, gridplot,               surfdata( data3D, 1..7, 1..9,                         gridsize=[33,33], interpolation=[method=cubic],                         lookandfeel) )                  ]) );

Interpolating procedure

Next we produce and use interpolating procedures f3 and g3 which each return an interpolated scalar value for any given (x,y) point. This allows the plot3d command to be used directly, choosing whichever (x,y) points it needs. Note that at present there is less benefit to using such an interpolating 2-parameter procedure since plot3d does not currently do adaptive plotting.

This approach is also less efficient that the previous method of using a pre-made grid of (x,y) points, as it invokes ArrayInterpolation and produces temporary data structures for each input point. See the appendix for further discussion of efficiency in this regard.

The resulting interpolating procedure might also be used in other general computations such as numerical optimization, integration, and root-finding. Bear in mind that the results of such computations will vary with the chosen interpolation method.

 > f3:=(x,y)->CurveFitting:-ArrayInterpolation([xpts,ypts], data3D,               Array(1..1, 1..1, 1..2, [[[x,y]]]), method=spline)[1,1]: g3:=(x,y)->CurveFitting:-ArrayInterpolation([xpts,ypts], data3D,               Array(1..1, 1..1, 1..2, [[[x,y]]]), method=cubic)[1,1]: drawfun3 := f_ ->     plot3d(f_, 1..7, 1..9, axes=boxed, style=surface,            glossiness=1.0, lightmodel=light1,            color=RGB(204/255, 84/255, 0/255)):
 > display(Array([drawfun3(f3), drawfun3(g3)]), view=[1..7, 1..9, 0..1]);

 > Optimization:-Minimize(f3, method=nonlinearsimplex);
 $\left[{-}{1.98108713564723402}{}{{10}}^{{61}}{,}\left[\begin{array}{c}{-}{9.6283299190}{}{{10}}^{{10}}\\ {-}{5.50389069382191}{}{{10}}^{{9}}\end{array}\right]\right]$ (2.2.1.2.1)
 > fsolve({(a,b)->a-b, (a,b)->f3(a,b)-0.5}, [1..7, 1..9]);
 $\left[{3.3483539130504871}{,}{3.3483539130504871}\right]$ (2.2.1.2.2)
 > f3(op(%)) - 0.5;
 ${-}{2.69384514695048}{}{{10}}^{{-12}}$ (2.2.1.2.3)

Non-uniform data grid

This is the case of irregularly spaced data, supplied as an n-by-3 Array or Matrix. Each row represents a point, which the columns interpreted as x-, y-, and z-value.

Here below is an example. First, we'll look down upon the projection of the data points onto the x-y plane and thus visualize the layout and spacing of the 2D dependent data values.

This is the same data that was used for the ScatterPlot3D example in the Smoothing section. But in contrast to the smoothing example we are now supposing that the data points are to be interpolated, which is to say that the surface must pass directly through the data points.

 > M:=ImportMatrix(cat(kernelopts(mapledir),"/data/plotting/irregulardata3D.csv"),                 source=csv,datatype=float[8]):
 > Ppt := pointplot3d(M,axes=box,style=patchnogrid,symbolsize=15,color=red,                           labels=[x,y,z]):
 > display(Ppt,orientation=[90,0,0]);

The surfdata command can now handle this task directly.

The option source=irregular is supplied to denote that this data is not to be interpreted as forming a regularly spaced grid with only three values along one of the two independent data dimensions.

 > surfdata(M,source=irregular,axes=box);

Appendix: efficiency

The command CodeTools:-Usage is used below in order to print information about memory and time resources used by individual computations or steps.

One area of concern for efficiency is unnecessary and avoidable copying of data by the interpolating process. The routines which do the actual work are all implemented as external calls to precompiled functions, and these expect the input and output to be conveyed as rtables (Arrays, Vector, or Matrices) with the hardware double-precision datatype=float[8] option.

Let's consider the original example, with 1-dimensional data. A relatively large number of 5000 interpolation points will be used in order to more clearly demonstrate some performance differences.

Note that using a preallocated structure of output points, such as A1, is the most efficient way to obtain an interpolated plot. The resources needed to generate the structure A1 itself are negligible.

(We are not concerned primarily with the appearance or smoothness of the plot here. This is an investigation into relative efficiency. The number of distinct plotted points is forced to be high, which can produce incidental jaggedness.)

 > A1:=CodeTools:-Usage(Vector(5000, i->i*15/5000, datatype=float[8])):
 memory used=236.03KiB, alloc change=0 bytes, cpu time=0ns, real time=5.00ms

 > CodeTools:-Usage(plot(,                       color=black));
 memory used=27.77MiB, alloc change=0.59MiB, cpu time=340.00ms, real time=338.00ms

Unnecessary repeated copying of the independent data by an interpolating function such a F can also be avoided by supplying ArrayInterpolation with rtables which have the float[8] datatype.

In addition, the optional 'container' argument of the ArrayInterpolation command can be used to attain in-place semantics, thus avoiding repeated creation and memory management of an rtable just to convey the scalar output results.

Finally, by implementing the interpolating function as a module, the dependent input point can also be conveyed in a re-usable rtable and thus avoid repeated memory management of yet another rtable object. The module Fnew below illustrates these ideas.

 > Fnew := module() export ModuleApply; local inArray, outArray, hwxdata, hwydata;    inArray:=Array(1..1,'datatype'=float[8]);    outArray:=Array(1..1,'datatype'=float[8]);    hwxdata,hwydata := Vector(data2D[1..-1,1],'datatype'=float[8]),                       Vector(data2D[1..-1,2],'datatype'=float[8]):    ModuleApply:=proc(x)       UseHardwareFloats:=true;       inArray[1]:=x;       CurveFitting:-ArrayInterpolation(hwxdata,hwydata,               inArray, 'method'='spline', 'container'=outArray);       outArray[1];    end proc; end module:

Now we'll compare performance of F against that of Fnew. There is a modest improvement in speed of approximately a factor of two for this example. Remember that 5000 points are not required in order to attain a nice, smooth plot. The purpose of requesting that many points is mostly to demonstrate the performance difference.

Both of the plots below are considerably slower to produce than was the earlier plot which used a pre-made fixed Vector A1 of the same number of output points.

 > CodeTools:-Usage( plot(F, 0..15, color=black, numpoints=5000) );
 memory used=245.00MiB, alloc change=113.44MiB, cpu time=2.82s, real time=2.83s

The procedure Fnew is approximately twice as fast as F. With some significant additional coding effort certain aspects of the overhead due to Fnew's repeated calling of ArrayInterpolation might be avoided, and so provide further significant speed enhancement.. Notice that F is thread-safe while Fnew is not, since module local variables such as module Fnew's inArray (or outArray) local variable are accessed in a thread-global manner. When multiple distinct calls to Fnew are run concurrently in separate Threads then the concurrent accesses to those local Arrays might interfere with each other.

 See Also For additional details on the commands used in the worksheet, including additional options to control the interpolating and smoothing methods, see the following help pages: