For SPLINA, the main storage requirements of the program are proportional
to the square of the number of data points and the processing time is
proportional to the cube of the number of data points. For SPLINB, storage
requirements are proportional to the square of the number of knots and
processing time is approximately proportional to the cube of the number of
knots, since a singular value decomposition of a matrix of this order is
required. The linear algebra routines needed are taken from the double
precision LINPACK library (Dongarra et al., 1979) and can be supplied
with ANUSPLIN.
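
The scaling described above can be illustrated with a rough estimate. The
following Python sketch is not part of ANUSPLIN; it assumes that storage
is dominated by a single dense square matrix of 8-byte double precision
values, and the data set size and knot count are hypothetical.

```python
def dense_matrix_bytes(order: int, bytes_per_value: int = 8) -> int:
    """Bytes needed for one dense order-by-order double precision matrix."""
    return order * order * bytes_per_value


def relative_time(order: int, reference_order: int) -> float:
    """Ratio of processing times when cost grows as the cube of the order."""
    return (order / reference_order) ** 3


if __name__ == "__main__":
    n_data = 6000          # hypothetical number of data points (SPLINA)
    n_knots = n_data // 3  # hypothetical number of knots (SPLINB)
    for label, order in (("SPLINA", n_data), ("SPLINB", n_knots)):
        mib = dense_matrix_bytes(order) / 2**20
        print(f"{label}: order {order}, about {mib:.0f} MiB per dense matrix")
    ratio = relative_time(n_data, n_knots)
    print(f"approximate time ratio SPLINA/SPLINB: {ratio:.0f}x")
```

Under these assumptions, reducing the order by a factor of three reduces
storage by a factor of about nine and processing time by a factor of
about 27.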
SPLINA does not permit coincident data points. SPLINB permits coincident
data points but not coincident knots. Simplified procedures for
processing coincident data points are being developed.
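
Coincident positions can be located before a fit is attempted. The
following Python sketch is an illustration only and is not an ANUSPLIN
utility; it assumes that two points are coincident when their coordinates
are exactly equal, which may differ from the test applied by the programs
themselves. The function name and sample data are hypothetical.

```python
from collections import defaultdict


def find_coincident(points):
    """Group 1-based sequence numbers of points sharing identical coordinates.

    `points` is a sequence of coordinate tuples, one per data record.
    Returns a dict mapping each repeated position to its sequence numbers.
    """
    groups = defaultdict(list)
    for seq, pos in enumerate(points, start=1):
        groups[tuple(pos)].append(seq)
    return {pos: seqs for pos, seqs in groups.items() if len(seqs) > 1}


if __name__ == "__main__":
    # Hypothetical (x, y) positions; records 1 and 3 coincide.
    data = [(1.0, 2.0), (3.5, 0.5), (1.0, 2.0), (4.0, 4.0)]
    for pos, seqs in find_coincident(data).items():
        print(f"position {pos} occurs at records {seqs}")
```

The same check can be applied to a candidate knot list, since coincident
knots are not permitted.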
Program Output
--------------
Summary statistics and a list of the 100 largest residuals, ranked in
descending order, are always written to the standard output unit.
Depending on the user directives, the program may also write a list of
the data and fitted values with estimated standard errors, and files
containing the coefficients of the fitted surfaces and the parameters
used to calculate the optimum smoothing parameter. Other programs in
ANUSPLIN, including LAPGRD and LAPPNT, use the surface coefficients to
calculate values of the fitted surface. The ranked list of residuals is
particularly useful for detecting data errors. The list of data and
fitted values is also useful for this task, especially when fitting a
surface to a new data set.
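
A ranked residual list of this kind can be reproduced from any list of
data and fitted values. The following Python sketch is an illustration
only; it assumes that a residual is the data value minus the fitted value
and that ranking is by magnitude, neither of which is confirmed here for
the program output. The function name and sample values are hypothetical.

```python
def largest_residuals(observed, fitted, limit=100):
    """Return (sequence number, residual) pairs ranked by decreasing magnitude."""
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    residuals = [
        (seq, obs - fit)
        for seq, (obs, fit) in enumerate(zip(observed, fitted), start=1)
    ]
    # Largest absolute residual first; sequence numbers are 1-based.
    residuals.sort(key=lambda item: abs(item[1]), reverse=True)
    return residuals[:limit]


if __name__ == "__main__":
    observed = [10.0, 12.5, 9.0, 30.0]  # hypothetical data values
    fitted = [10.5, 12.0, 9.25, 14.0]   # hypothetical fitted values
    for seq, res in largest_residuals(observed, fitted, limit=3):
        print(f"record {seq}: residual {res:+.2f}")
```

In this example record 4 heads the list, which is the kind of outlying
value that would be checked first for a data error.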
Knot Selection for SPLINB
-------------------------
The following, somewhat heuristic, procedure works well in practice:
1. Use SELNOT to choose an initial set of knots from the data points.
This selects a specified number of data positions as knots by attempting
to equi-sample the independent spline variable space. While it is
theoretically possible to choose knots which are not at data positions,
this is not currently supported by SPLINB. In the limiting case where
every data position is a knot, the actions of SPLINA and SPLINB are
identical. Choose an initial set of knots about one third the size of
the data set. The number of knots required depends on the spatial
complexity of the data being fitted - more knots for more spatially
variable data. If the signal of the final result is within 10% of the
number of knots (the maximum possible signal), then the process should be
repeated starting with a larger initial knot set.
2. Run SPLINB, with the output list of data and fitted values, and
examine the largest residuals for data errors. Re-fit the surface if
necessary after data errors have been corrected, usually without the
output list of data and fitted values. Add to the knot index file the
indices of the 15-25 largest residuals that are not already knots and
re-fit the surface with these additional knots. Note that each knot
index refers to the sequence number in the original data file and not
the sequence number of the points actually selected. It is usual, but
not essential, to maintain the increasing order of the sequence numbers
in the knot index file. Residuals which are already associated with a
knot are identified by a minus sign, both in the output ranked residual
list, and in the list of data and fitted values.
3. Repeat the procedure of adding to the knot list the indices of the
15-25 largest residuals and re-fitting the surface until the solution
stabilises or the variance estimates output by the program are in
approximate agreement with a priori estimates. This should normally