Kriging is more complex than Natural Neighbor Interpolation. It requires both a model of the spatial continuity, or dependence, in the form of a covariance or semivariogram, and a sample of surface data from which to determine the statistical trend used to compute the interpolated or extrapolated points.
Spatial prediction using Kriging involves two steps:
You must select the output locations of the interpolated points. Make sure that the sample data is appropriate for those output locations. For example, do not select sample points on the opposite side of the surface, because the trend determined from them may not apply to the interpolated or extrapolated point locations.
For best performance, keep the sample data set small. Both the time needed for the interpolation and the amount of memory used by the algorithm grow rapidly with the size of the sample set. The algorithm builds a matrix with one entry for each pair of points (N**2 entries, where N is the number of sample points) and later inverts this matrix, which takes on the order of N**3 operations. Doubling N therefore roughly quadruples the memory used and multiplies the inversion time by about eight. We suggest at most 200 sample points.
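The growth described above can be checked with simple arithmetic. The following is a minimal sketch in Python; the function name kriging_cost is hypothetical, and the operation count is only the order-of-magnitude N**3 estimate from the text, not a measurement of any particular implementation.

```python
def kriging_cost(n_samples):
    """Return (matrix_entries, approx_inversion_ops) for n_samples sample points.

    Hypothetical helper: both numbers are the rough estimates quoted in the
    text (N**2 entries, about N**3 operations), not measured values.
    """
    matrix_entries = n_samples ** 2   # one entry per pair of sample points
    inversion_ops = n_samples ** 3    # order-of-magnitude estimate only
    return matrix_entries, inversion_ops


if __name__ == "__main__":
    for n in (100, 200, 400):
        entries, ops = kriging_cost(n)
        print(f"N={n}: {entries:,} matrix entries, about {ops:,} operations")
```

At the suggested limit of 200 sample points, this gives 40,000 matrix entries and about 8,000,000 operations.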
Semivariance is a measure of the degree of spatial dependence between samples. The magnitude of the semivariance between two points depends on the distance between them: points that are close together generally have a smaller semivariance than points that are far apart. The plot of the semivariances as a function of distance from a point is referred to as a semivariogram.
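To make the definition concrete, the sketch below estimates a semivariogram from sample data using the commonly used estimator: half the average squared difference between the values of all sample pairs whose separation falls in the same distance bin. The function name, the binning scheme, and the bin_width parameter are assumptions made for illustration; the source does not say how the semivariances are estimated.

```python
import math
from collections import defaultdict


def empirical_semivariogram(points, values, bin_width):
    """Return {bin_center_distance: semivariance} over all sample pairs.

    points:    list of (x, y) sample locations
    values:    list of surface values, one per point
    bin_width: width of each distance bin [k*w, (k+1)*w)  (assumed scheme)
    """
    squared_diff_sums = defaultdict(float)
    pair_counts = defaultdict(int)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            distance = math.dist(points[i], points[j])
            bin_index = int(distance // bin_width)
            squared_diff_sums[bin_index] += (values[i] - values[j]) ** 2
            pair_counts[bin_index] += 1
    # Semivariance is half the mean squared difference within each bin.
    return {
        (b + 0.5) * bin_width: squared_diff_sums[b] / (2 * pair_counts[b])
        for b in sorted(squared_diff_sums)
    }


if __name__ == "__main__":
    # Four samples on a line whose value rises steadily with x.
    sample_points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
    sample_values = [0.0, 1.0, 2.0, 3.0]
    for distance, gamma in empirical_semivariogram(sample_points, sample_values, 1.0).items():
        print(f"distance ~{distance}: semivariance {gamma}")
```

In this small example the semivariance grows with each distance bin, which is the behavior described above.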
Kriging provides five semivariogram models:
The semivariance increases as the distance increases until, at a certain distance from a point, it equals the variance around the average value and no longer increases. This produces a flat region on the semivariogram called the sill. The distance from the point of interest to where the flat region begins is termed the range, or span, of the regionalized variable. Within this range, locations are related to each other. The region within the range is also referred to as the neighborhood, and all known samples in it must be considered when estimating the unknown point of interest.
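The sill and range can be illustrated with the spherical form, a widely used semivariogram model. This is a sketch only: whether this form and parameterization match any of the models offered here is an assumption, the function and parameter names are hypothetical, and no nugget (offset at zero distance) is included.

```python
def spherical_semivariogram(distance, sill, range_):
    """Spherical semivariogram model without a nugget (illustrative only).

    The value rises from 0 at distance 0 and flattens at `sill` once
    `distance` reaches `range_`.
    """
    if distance >= range_:
        return sill  # flat region: the semivariance no longer increases
    ratio = distance / range_
    return sill * (1.5 * ratio - 0.5 * ratio ** 3)


if __name__ == "__main__":
    for d in (0.0, 5.0, 10.0, 25.0):
        print(d, spherical_semivariogram(d, sill=2.0, range_=10.0))
```

With a sill of 2.0 and a range of 10.0, the model returns 0.0 at distance 0, 1.375 at distance 5, and 2.0 at every distance of 10 or more.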
The unknown point is usually at the center of the neighborhood. To determine its value, all known values within the neighborhood are assigned weights using the semivariogram. The unknown value is then calculated from these weights and the known values.
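The weighting step can be sketched with ordinary Kriging, one common variant in which the weights are constrained to sum to 1 and the estimate is the weighted sum of the known values. The source does not say which variant or solver is used, so everything below is an assumption for illustration: the function names, the spherical model, and the small Gaussian elimination routine are hypothetical, and the caller is assumed to pass only the samples that lie inside the neighborhood.

```python
import math


def spherical(distance, sill, range_):
    """Spherical semivariogram model without a nugget (assumed model)."""
    if distance >= range_:
        return sill
    ratio = distance / range_
    return sill * (1.5 * ratio - 0.5 * ratio ** 3)


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [b] for row, b in zip(matrix, rhs)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def ordinary_kriging(points, values, target, sill, range_):
    """Return (estimate, weights) for `target` from the neighborhood samples."""
    n = len(points)
    # Semivariance between every pair of samples, plus one extra row and
    # column that force the weights to sum to 1.
    system = [
        [spherical(math.dist(points[i], points[j]), sill, range_) for j in range(n)] + [1.0]
        for i in range(n)
    ]
    system.append([1.0] * n + [0.0])
    # Semivariance between each sample and the unknown point.
    rhs = [spherical(math.dist(p, target), sill, range_) for p in points] + [1.0]
    weights = solve(system, rhs)[:n]  # last entry is the Lagrange multiplier
    estimate = sum(w * z for w, z in zip(weights, values))
    return estimate, weights


if __name__ == "__main__":
    corners = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
    heights = [1.0, 2.0, 3.0, 4.0]
    print(ordinary_kriging(corners, heights, (5.0, 5.0), sill=1.0, range_=20.0))
```

In the usage example the unknown point sits at the center of four equally distant samples, so by symmetry each sample receives the same weight. The N by N system built here is the matrix whose size and solution cost make a small sample set important.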