Natgrid efficiency considerations

Main time components

The time required for executing Natgrid codes is primarily dependent on three things:

The time required to triangulate the input data and determine all natural neighbors.
The number of output points where interpolated values are desired.
Whether linear or nonlinear interpolation is used.

Detailed timing results

A table summarizing a timing analysis done on a Cray Y-MP is provided if you want details. The basic conclusions are:

For a fixed number of input and output data, linear interpolation is, in general, over twice as fast as nonlinear interpolation.
For a fixed number of output points, the timings for linear interpolation vary as the square of the number of input points.
For a fixed number of input points, the efficiency of non-linear interpolation increases as the number of output points is increased (based on the time-per-point calculations).
Interpolating at a single point is very expensive, especially if non-linear interpolation is being used.

If you can get satisfactory results using linear interpolation, then that should be used. If you have a large number of input data that are randomly spaced, then linear interpolation will not differ too much from nonlinear.

Single point mode is very expensive and, in general, should be used only for filling in a few missing data values.

When time is a major concern

If execution time is a major consideration, then Natgrid allows for significant speed-up using a process called subblocking. The main idea is to split up the output array into subblocks, do the interpolation on each individual subblock, and then composite the results. Unfortunately, this process takes some work on the part of the Natgrid user.

For example, suppose that you are interpolating on a 100 x 100 output grid. You could split this into four separate interpolations of 50 x 50 (or one hundred of 10 x 10, and so forth). Since the time increases basically as a function of the square of the number of input points, much is saved by splitting the original problem.

Beyond the care needed to make sure that the individual interpolations are correctly stitched back together, there are constraints imposed by Natgrid to insure that the interpolated values along adjacent subblock edges are compatible. These constraints are:

For linear interpolation: if an input coordinate (X,Y) is contained in a subblock, then every natural neighbor of (X,Y) must be included in calculating interpolated values.
For nonlinear interpolation: if an input coordinate (X,Y) is contained in a subblock, then every natural neighbor of (X,Y) must be included in calculating interpolated values and every natural neighbor of every natural neighbor of (X,Y) must also be included.

The set of points in the input dataset included in a calculation of interpolated values in a subblock is controlled by the subblock overlap parameters hor and ver. These parameters determine additional space around the interpolated grid that will include additional input data. For example, if your output grid for a subblock were the lower left quarter of the unit square and you chose hor=0.1 and ver=0.05, then all input points in the region bounded by X = -0.1 on the left, X = 0.60 on the right, Y = -0.05 on the bottom, and Y = 0.55 on the top would be included in the calculation.

What is a good way to determine values for hor and ver? One way would be to make a test run using a sparse output grid and then use the features of the algorithmic display tool for displaying natural neighbors; another way would be to make an educated guess based on your knowledge of what a natural neighbor is and what your data look like.

Natgrid efficiency considerations

Main time components

Detailed timing results

General conclusions

When time is a major concern