Linear Algebra Functions with NAG Support
Many LinearAlgebra package functions use NAG routines to carry out computations, allowing those computations to run very quickly.
This worksheet guides you through techniques for maximizing the power and speed of these routines.
In order to see when NAG routines are called, set the infolevel for LinearAlgebra to 1.
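A minimal sketch of the setup commands (loading the package first is assumed):
> with(LinearAlgebra):
> infolevel[LinearAlgebra] := 1;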
Determining Whether Hardware Floats are Used
When Maple uses hardware floats, it can execute more quickly than when software floats or symbolics are used. By understanding how Maple decides which algorithm to use, you can ensure that the most efficient approach is used for your data.
Types of Algorithms and Precision Available
Hardware Floats
Advantage
- very fast: compiled (NAG) code is used with hardware floats to carry out algorithms
Drawback
- precision is predetermined
Software Floats
Advantages
- fast: compiled (NAG) code is used with Maple software floats to carry out algorithms
- arbitrary precision available (precision determined by setting of the Digits environment variable)
Drawback
- not as fast as hardware floats
Symbolic
Advantages
- symbolic entries handled
- exact answers often possible
Drawback
- relatively slow: interpreted (Maple) code is used to carry out algorithms
How Maple Decides at What Precision to Operate
Maple first determines whether or not floats and/or symbolics are present. To do so, it first checks the datatype of the Matrix, then (if necessary) checks the individual entries. In a large Matrix, this second step can take a non-trivial amount of time. For efficient code, ensure your Matrices have an appropriate datatype.
- If the entries are all numeric, at least one of them is a float, and the UseHardwareFloats environment variable is set to deduced (the default), hardware floats are used.
- If the entries are all numeric, at least one of them is a float, and UseHardwareFloats is set to false, software floats are used.
- If there are any non-numeric entries in a Matrix, a symbolic routine is used. Note that non-numeric entries include exact symbolic constants such as Pi.
- If there are no floats, a symbolic routine is used.
Note: A "float" is any entry written with a decimal point, so "2" is not a float, but "2." is a float.
In the example that follows, a symbolic (exact) algorithm is used to invert Matrix M1, while a floating point routine is used to invert Matrix M2. In this case, hardware floats are used since UseHardwareFloats=deduced.
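The original Matrices are not shown here; a minimal sketch, assuming small 2 x 2 examples:
> M1 := Matrix([[1, 2], [3, 4]]);         # exact integer entries: a symbolic routine is used
> M2 := Matrix([[1.0, 2.0], [3.0, 4.0]]); # float entries: hardware floats are used
> MatrixInverse(M1);
> MatrixInverse(M2);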
MatrixInverse: calling external function
MatrixInverse: NAG hw_f07adf
MatrixInverse: NAG hw_f07ajf
With UseHardwareFloats=false, software floats are used to invert M2.
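A sketch of the corresponding commands:
> UseHardwareFloats := false;
> MatrixInverse(M2);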
MatrixInverse: calling external function
MatrixInverse: NAG sw_f07adf
MatrixInverse: NAG sw_f07ajf
Setting the Hardware Flag
By default, the UseHardwareFloats environment variable is set to deduced to maximize the speed of computations. To turn off hardware computations and use software floats instead, set UseHardwareFloats to false.
Execute one of the following two command groups depending on whether you want to experiment with hardware floats or software floats.
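The original command groups are not shown; a minimal sketch of the two alternatives (execute only one):
> UseHardwareFloats := true;    # hardware floats (or deduced, the default)
> UseHardwareFloats := false;   # software floats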
If you have infolevel[LinearAlgebra] set to at least 1, you will notice that the external routines have names prefixed with "hw_" when hardware floats are being used, and with "sw_" when software floats are being used.
Controlling What Information is Returned
Some functions return more than one piece of information, or give you a choice of formats in which to obtain the results. By using the output= option, you can specify what information you would like a routine to return.
For example, when using LUDecomposition, you can specify which components of the decomposition you want returned. The examples below demonstrate the LUDecomposition command, but this functionality is also available in others, such as BidiagonalForm, QRDecomposition, and HessenbergForm.
For this example, we also demonstrate the use of software floats.
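A sketch of the setting used for the examples in this section (presumably restored to deduced afterwards):
> UseHardwareFloats := false;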
Set up a sample Matrix:
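The original Matrix is not shown; any small floating-point Matrix will do, for example:
> M := Matrix([[1., 2., 3.], [4., 5., 6.], [7., 8., 8.]]);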
The default output is to return 'P' (permutation), 'L' (lower triangular), and 'U' (upper triangular) components:
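A sketch of the call, capturing the three factors:
> (P, L, U) := LUDecomposition(M);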
LUDecomposition: calling external function
LUDecomposition: NAG sw_f07adf
The upper triangular component can also be automatically factored into a square upper triangular Matrix ('U1') and the row-reduced echelon form of M:
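A sketch of the corresponding call:
> LUDecomposition(M, output = ['P', 'L', 'U1', 'R']);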
LUDecomposition: calling external function
LUDecomposition: NAG sw_f07adf
RowOperation: calling external function
RowOperation: External sw_MulRowRR
RowOperation: calling external function
RowOperation: External sw_AddRowRR
RowOperation: calling external function
RowOperation: External sw_AddRowRR
RowOperation: calling external function
RowOperation: External sw_MulRowRR
RowOperation: calling external function
RowOperation: External sw_AddRowRR
RowOperation: calling external function
RowOperation: External sw_MulRowRR
Usually you will want the whole decomposition, but if you need only part of it, you can request just that part:
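For example, to return only the upper triangular factor:
> LUDecomposition(M, output = 'U');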
LUDecomposition: calling external function
LUDecomposition: NAG sw_f07adf
You can also request 'NAG' format. This special format is particularly space-efficient: the permutation Matrix is compressed into a pivot vector, and the L and U Matrices are combined into a single Matrix. This format is understood by other commands, such as LinearSolve (note that the order of the vector and Matrix must be preserved when passing them to other routines).
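A sketch, passing the factorization on to LinearSolve with a hypothetical right-hand side:
> nagForm := [LUDecomposition(M, output = 'NAG')]:   # keep the pieces in the order returned
> LinearSolve(nagForm, Vector([1., 2., 3.]));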
LUDecomposition: calling external function
LUDecomposition: NAG sw_f07adf
| |
LinearSolve: using method LU
LinearSolve: calling external function
LinearSolve: NAG sw_f07aef
Controlling the Format of the Output
Functions that return Matrices allow you to specify Matrix characteristics via the outputoptions option. This may involve making a copy of the Matrix that was originally produced by the function. (To determine if such a copy has been made, set infolevel[LinearAlgebra] := 2.)
The following examples create a random Matrix with a particular shape and an updatable scalar Matrix.
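The original calls are not shown; a sketch with hypothetical sizes (the compact = false option is assumed to produce a fully stored, updatable scalar Matrix):
> RandomMatrix(4, 4, outputoptions = [shape = triangular[upper], datatype = float[8]]);
> ScalarMatrix(2, 3, compact = false);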
Minimizing Storage Costs
There are two main ways to minimize storage costs:
- use the 'inplace' feature
- create your Matrices with the right properties so that copies do not have to be made
Note that the first of these requires you to use the named version of the function (for example, Multiply instead of the '.' operator). The second applies to both named functions and to functions called via operators.
Using the 'inplace' Feature
The inplace feature allows you to store the output of a function directly on top of one of the inputs. In order for this feature to be available in a function, two requirements must be met:
(1) the output must be the same shape and size as one of the inputs, and
(2) it must be possible to code the algorithm truly in place; it must not simply copy the result into the input object when it is done.
Some examples will clarify the above points.
(1) Multiply can be done inplace when the output object is the same size as the first input object; this is the case whenever the second Matrix is square. (Strictly speaking, the Multiply function is not quite carried out in place: at any given time, there is a copy of one row of the Matrix in memory, but this overhead is small compared to creating an entire new Matrix.)
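A minimal sketch of the successful case, using hypothetical float Matrices:
> A := RandomMatrix(3, 5, outputoptions = [datatype = float[8]]):
> B := RandomMatrix(5, 5, outputoptions = [datatype = float[8]]):
> Multiply(A, B, inplace = true);   # B is square, so the 3 x 5 product overwrites A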
Note that Multiply cannot be done in place when the output and input are not the same size:
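For example, a call along these lines produces the error shown below:
> C := RandomMatrix(5, 2, outputoptions = [datatype = float[8]]):
> Multiply(A, C, inplace = true);   # A.C is 3 x 2, not the same size as A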
Error, (in LinearAlgebra:-MatrixMatrixMultiply) the second factor must be square to compute the product in place
(2) CrossProduct takes two 3-element Vectors as inputs, and returns a 3-element Vector. It looks like a promising candidate for an inplace operation. However, as can be seen below, overwriting any of the input values would obliterate data needed to compute the output values for other locations.
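A sketch with symbolic entries makes the dependence clear (u1, u2, and so on are hypothetical names):
> u := Vector([u1, u2, u3]):  v := Vector([v1, v2, v3]):
> CrossProduct(u, v);   # returns Vector([u2*v3 - u3*v2, u3*v1 - u1*v3, u1*v2 - u2*v1])
> # Overwriting u[1] (or v[1]) with the first result entry would destroy a value
> # still needed to compute the second and third entries, so no inplace option is offered.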
Avoiding Copies
By setting the infolevel for LinearAlgebra to 2, you can see when avoidable copies are made.
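For example:
> infolevel[LinearAlgebra] := 2;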
The NAG algorithms require data to be in a floating-point (either hardware or software) Matrix. They work fastest when data is in a hardware float Matrix. Compare the following: notice that the M1 Matrix has to be copied, while the M2 Matrix can be used directly.
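The original Matrices are not shown; a sketch, assuming M1 has float values but no declared datatype, while M2 was created with datatype = float[8]:
> M1 := Matrix([[2., 1.], [0., 3.]]):                        # must be copied into a float[8] Matrix
> M2 := Matrix([[2., 1.], [0., 3.]], datatype = float[8]):   # used directly
> b := Vector([1., 1.], datatype = float[8]):
> BackwardSubstitute(M1, b);
> BackwardSubstitute(M2, b);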
BackwardSubstitute: calling external function
BackwardSubstitute: copying lefthand Matrix, to obtain datatype float[8]
BackwardSubstitute: NAG hw_f07tef
BackwardSubstitute: calling external function
BackwardSubstitute: NAG hw_f07tef
Choosing a Method
Many LinearAlgebra commands can be carried out in more than one way. Maple looks at the type of data it has received to determine the "best" algorithm to use.
For example, LUDecomposition has a number of techniques at its disposal, including Gaussian elimination, fraction-free Gaussian elimination, and Cholesky. If you know which algorithm is most appropriate for your Matrix, you can save the LUDecomposition function the task of having to decide which one to use. Simply specify the desired method with the method= option.
Note: When you specify a particular method, no other method will be tried, even if problems are encountered. For example, consider the following Matrix, which is symmetric but not positive-definite, yet has mistakenly been given the positive-definite attribute. When you call LinearSolve without specifying a method, it first tries the Cholesky method; when that fails, it falls back to LU and gets the right answer. When the Cholesky method is explicitly requested, it simply stops with an error.
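The original Matrix is not shown; a sketch of such a situation, assuming the positive_definite attribute can be attached when the Matrix is constructed:
> M := Matrix([[1., 2.], [2., 1.]], shape = symmetric, datatype = float[8], attributes = [positive_definite]);   # symmetric, but not actually positive-definite
> b := Vector([1., 0.], datatype = float[8]):
> LinearSolve(M, b);                        # Cholesky fails; LU is tried next
> LinearSolve(M, b, method = 'Cholesky');   # no fallback: an error is raised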
LinearSolve: using method Cholesky
LinearSolve: calling external function
LinearSolve: NAG hw_f07gdf
Warning, Matrix is not positive-definite
LinearSolve: external call unsuccessful
LUDecomposition: calling external function
LUDecomposition: NAG hw_f07adf
MatrixVectorMultiply: calling external function
MatrixVectorMultiply: NAG hw_f06paf
BackwardSubstitute: calling external function
BackwardSubstitute: NAG hw_f07tef
LinearSolve: using method Cholesky
LinearSolve: calling external function
LinearSolve: NAG hw_f07gdf
Error, (in LinearAlgebra:-LA_Main:-LinearSolve) Matrix is not positive-definite
LUDecomposition - Some Routine-Specific Hints
Some variants of the LUDecomposition function can be carried out inplace (reducing storage costs) when the input Matrix has a float datatype.
When you request NAG format output with the inplace option, the LU component is returned in the input Matrix. The pivot vector is returned separately.
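A sketch, assuming a square float[8] Matrix M:
> M := Matrix([[1., 2.], [3., 4.]], datatype = float[8]):
> LUDecomposition(M, output = 'NAG', inplace = true);   # the combined LU factors overwrite M; the pivot Vector is returned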
LUDecomposition: calling external function
LUDecomposition: NAG hw_f07adf
When you request the Cholesky method with the inplace option, the L component is returned in the original Matrix. For efficiency, the upper triangle is not necessarily replaced with zeroes.
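A sketch, assuming a symmetric positive-definite float[8] Matrix:
> S := Matrix([[4., 2.], [2., 3.]], datatype = float[8]):
> LUDecomposition(S, method = 'Cholesky', inplace = true);   # L overwrites the lower triangle of S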
LUDecomposition: calling external function
Transpose: calling external function
HermitianTranspose: hw_MatTransRR
MatrixAdd: calling external function
MatrixAdd: NAG hw_f06ecf
MatrixNorm: calling external function
MatrixNorm: NAG hw_f06raf
CholeskyDecomposition: NAG hw_f07fdf
LinearSolve - Some Routine-Specific Hints
The LinearSolve command allows you to solve a linear system with one or more right-hand sides. Each time you execute the LinearSolve command, it has to factor the input Matrix. You can save time by providing all the right-hand sides in a single command so that it only has to factor the Matrix once:
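A sketch with two right-hand sides supplied as the columns of a Matrix (hypothetical data):
> A := Matrix([[1., 2.], [3., 4.]], datatype = float[8]):
> B := Matrix([[1., 0.], [0., 1.]], datatype = float[8]):
> LinearSolve(A, B);   # A is factored only once for both right-hand sides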
LinearSolve: using method LU
LinearSolve: calling external function
LinearSolve: NAG hw_f07adf
LinearSolve: NAG hw_f07aef
If you do not have all the right-hand sides in advance, use LUDecomposition to factor the input Matrix, and pass this factorization (in a list) to LinearSolve. You can use any decomposition, but for maximum speed with floating-point data, use 'NAG' format. To minimize storage costs, use the inplace option (if available), as described in the previous section.
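A sketch of reusing a 'NAG' factorization for right-hand sides that arrive later:
> fact := [LUDecomposition(A, output = 'NAG')]:   # factor once; keep the pieces in the order returned
> LinearSolve(fact, Vector([1., 1.]));            # each later solve reuses the stored factorization
> LinearSolve(fact, Vector([2., 5.]));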
LUDecomposition: calling external function
LUDecomposition: NAG hw_f07adf
| |
LinearSolve: using method LU
LinearSolve: calling external function
LinearSolve: NAG hw_f07aef
Note: Using 'NAG' format with non-numeric entries still saves storage, but the execution time is comparable to that of the other formats.
Return to Index for Example Worksheets