In principal component analysis (PCA) of near-IR spectra it is common practice to examine the vectors in the transformation (or loadings) matrix in an attempt to explain the total spectral variation in terms of recognizable constituent near-IR spectra. At times, however, even the most distinctive spectral signatures in the vectors may prove to be unreliable. Bootstrap estimates of the principal components, calculated from calibration sample spectra, reveal which wavelengths contain reliable information in each vector of the transformation matrix by providing t-statistics for each element of the transformation matrix. The t-statistics can indicate that the presence of a particular peak in a vector spectrum is not significant, or can indicate that the absence of a particular peak in a vector spectrum does not necessarily mean that the peak would be absent given another similar group of calibration samples.
Calculation of the bootstrap estimate begins with the collection of a set of calibration spectra T. Random selections are made from T by filling a matrix P with the calibration sample indices to be used in b different bootstrap sample sets; the values in P are then scaled to the calibration set size n. A series of b bootstrap sample sets is then created by drawing from T the spectra indexed by P.
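The resampling step can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: P is assumed to start as uniform random numbers in [0, 1) that are scaled to integer indices 0 to n-1 (zero-based, unlike the one-based indices a paper would normally use), T is assumed to hold one spectrum per row, and the names `bootstrap_indices` and `bootstrap_samples` are hypothetical.

```python
import numpy as np

def bootstrap_indices(n, b, seed=0):
    """Return a (b, n) integer array; row s holds the indices for bootstrap set s."""
    rng = np.random.default_rng(seed)
    # Assumption: P is first filled with uniform random numbers in [0, 1).
    P = rng.random((b, n))
    # Scale to the calibration set size n, giving integer indices 0 .. n-1.
    return np.floor(P * n).astype(int)

def bootstrap_samples(T, b, seed=0):
    """Yield b bootstrap sample sets drawn with replacement from the rows of T."""
    P = bootstrap_indices(T.shape[0], b, seed)
    for idx in P:
        # Each set has the same size as T; some spectra repeat, others are left out.
        yield T[idx]
```

Because the draws are made with replacement, each bootstrap sample set has n spectra but generally omits some calibration samples and repeats others.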
A distinct transformation matrix can be formed for each bootstrap sample set from the eigenvalues and eigenvectors of that set's correlation matrix R. To form the transformation matrices, the square roots of the eigenvalues of each R fill the diagonal of a square matrix, and the matrix product of the result and the matrix of eigenvectors gives an L, which becomes a transformation matrix upon inversion. One transformation matrix is calculated for each bootstrap sample set.
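A minimal sketch of this construction for a single bootstrap sample set follows. It assumes the sample set has more spectra than wavelengths, so that R is full rank and L can be inverted, and it assumes the product is taken as eigenvectors times the diagonal matrix of root eigenvalues, which is the usual form of a loadings matrix. The function name `transformation_matrix` is hypothetical.

```python
import numpy as np

def transformation_matrix(B):
    """B: (samples, wavelengths) bootstrap sample set. Returns the inverse of L."""
    # Correlation matrix between wavelengths (columns of B).
    R = np.corrcoef(B, rowvar=False)
    # eigh suits the symmetric matrix R; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Square roots of the eigenvalues on the diagonal of a square matrix,
    # multiplied by the eigenvector matrix (assumed order of the product).
    L = eigvecs @ np.diag(np.sqrt(eigvals))
    # Inverting L gives the transformation matrix; this fails if R is singular.
    return np.linalg.inv(L)
```

With this ordering, L times its transpose reproduces R, so applying the returned matrix to standardized spectra yields uncorrelated scores with unit variance.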
The mean and standard deviation of each element across the b different bootstrap transformation matrices are used to calculate t-statistics for each element of the original transformation matrix.
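The whole procedure might be put together as below. The text does not give the exact form of the t-statistic, so this sketch assumes the ratio of the bootstrap mean to the bootstrap standard deviation for each element. It also adds a step the text does not mention: because the sign of an eigenvector is arbitrary, each row of a bootstrap matrix is flipped to agree with the corresponding row of the original matrix before averaging. The names `transformation_matrix` and `bootstrap_t` are hypothetical, and the sketch again assumes more spectra than wavelengths.

```python
import numpy as np

def transformation_matrix(B):
    """Inverse of L = eigenvectors * sqrt(eigenvalues) of the correlation matrix of B."""
    R = np.corrcoef(B, rowvar=False)
    w, V = np.linalg.eigh(R)
    order = np.argsort(w)[::-1]                # largest eigenvalue first
    return np.linalg.inv(V[:, order] @ np.diag(np.sqrt(w[order])))

def bootstrap_t(T, b=200, seed=0):
    """Element-wise t-statistics for the transformation matrix of T (rows = spectra)."""
    rng = np.random.default_rng(seed)
    n = T.shape[0]
    F0 = transformation_matrix(T)              # original transformation matrix
    reps = np.empty((b,) + F0.shape)
    for s in range(b):
        idx = rng.integers(0, n, size=n)       # draw n sample indices with replacement
        F = transformation_matrix(T[idx])
        # Added assumption: align the arbitrary sign of each component (row)
        # with the original matrix so that replicates do not cancel.
        signs = np.sign(np.sum(F * F0, axis=1, keepdims=True))
        signs[signs == 0] = 1.0
        reps[s] = F * signs
    mean = reps.mean(axis=0)
    std = reps.std(axis=0, ddof=1)
    # Assumed form of the statistic: bootstrap mean over bootstrap standard deviation.
    return mean / std
```

Elements with a small absolute t value correspond to wavelengths whose contribution to a component changes substantially from one bootstrap sample set to the next. The sketch does not handle components that swap order between replicates when their eigenvalues are close.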