Gaussian Process Regression
The GaussianProcessRegression
class defines an approximate surrogate model or response surface which can be used to predict the model response and its uncertainty at points where the model has not been previously evaluated. Gaussian Process regression utilizes the concepts of Bayesian modelling and identifies a suitable function, which fits the training data (\(X \in \mathbb{R}^{n \times d}, Y \in \mathbb{R}^{n, 1}\)). First, it formulates a prior function over the output variable (\(y=g(x))\)) as a Gaussian Process, which can be defined using a mean function and a kernel.
The mean function \(m(x)\) is given as a linear combination of ‘\(p\)’ chosen scalar basis functions as:
The kernel \(k(x, s)\) defines the covariance function as:
and the Gaussian process can be written as,
where, \(\beta\) is the regression coefficient estimated by least square solution and \(\theta=\{l_1, ..., l_d, \sigma \}\) are a set of hyperparameters generally governing the correlation length (lengthscale, \(l_i\)) and the process variance (\(\sigma\)) of the model, determined by maximixing the log-likelihood function
The covariance is evaluated between a set of existing sample points \(X\) in the domain of interest to form the covariance matrix \(K=K(X, X)\), and the basis functions are evaluated at the sample points \(X\) to form the matrix \(F\). Using these matrices, the regression coefficients, \(\beta\), is computed as
The joint distribution between the training outputs (\(g(X)\)) and test outputs (\(g(X^*)\)) can be expressed as:
The final predictor function is then given by the mean of the posterior distribution \(g(X^*)|Y, X, X^*\), defined as:
and the covariance matrix of the posterior distribution is expressed as:
In case of noisy output (i.e. \(y = g(x)+\epsilon\)), where noise \(\epsilon\) is a independent gaussian distribution with variance \(\sigma_n^2\). The GaussianProcessRegression
class includes noise standard deviation in the hyperparameters (\(\theta=\{l_1, ..., l_d, \sigma, \sigma_n \}\)) along with the lengthscales and process standard deviation, and identify them by maximixing log-likelihood function. The mean and covariance of the posterior distribution is modified by substituting \(K\) as \(K+\sigma_n^2 I\):
Regression Models
The GaussianProcessRegression
class offers a variety of built-in regression models, specified by the regression input described below.
Constant Model
The ConstantRegression
class is imported using the following command:
>>> from UQpy.surrogates.gaussian_process.regression_models.ConstantRegression import ConstantRegression
In Constant model, the regression model is assumed to take a constant value such that
Linear Model
The LinearRegression
class is imported using the following command:
>>> from UQpy.surrogates.gaussian_process.regression_models.LinearRegression import LinearRegression
The regression model is defined by the linear basis function on each input dimension.
Quadratic Model
The QuadraticRegression
class is imported using the following command:
>>> from UQpy.surrogates.gaussian_process.regression_models.QuadraticRegression import QuadraticRegression
The quadratic regression model is given by:
User-Defined Regression Model
Adding a new regression model to the GaussianProcessRegressor
class is straightforward. This is done by creating a new class
that evaluates the basis functions, by extending the Regression
.
The Regression
class is imported using the following command:
>>> from UQpy.surrogates.gaussian_process.regression_models.baseclass.Regression import Regression
- class Regression[source]
Abstract base class of all Regressions. Serves as a template for creating new Gaussian Process regression functions.
This class may be passed directly as an object to the regression input of the GaussianProcessRegression
class.
This new class must have a method r(self,s)
that takes as input the samples points at which to evaluate the model
and return the value of the basis functions at these sample points.
The output of this function should be a two dimensional numpy array with the first dimension being the number of samples and the second dimension being the number of basis functions.
An example user-defined model is given below:
>>> class UserRegression(Regression):
>>>
>>> def r(self, s):
>>> s = np.atleast_2d(s)
>>> fx = np.ones([np.size(s, 0), 1])
>>> return fx
Kernels
The GaussianProcessRegression
class offers a variety of built-in kernels, specified by the kernel input described below.
Radial Basis Function Kernel
The RBF
class is imported using the following command:
>>> from UQpy.surrogates.gaussian_process.kernels.RBF import RBF
The RBF kernel takes the following form:
where \(h_i = s_i-x_i\).
Matern Kernel
The Matern
class is imported using the following command:
>>> from UQpy.surrogates.gaussian_process.kernels.Matern import Matern
The Matern kernel takes the following form:
where \(d = ||x-s||_2^{1/2}\) is the euclidean distance and \(\theta\) is consist of lengthscale (\(l\)), process variance (\(\sigma^2\)) and smoothing parameter (\(\nu\)). Also, \(\Gamma\) is tha gamma function and \(K_v\) is the modified Bessel function. This kernel concides with absolute exponential and RBF kernel for \(\nu=0.5\) and \(\infty\), respectively.
User-Defined Kernel
Adding a new kernel to the GaussianProcessRegression
class is straightforward. This is done by creating a new class
that extends the Kernel
abstract base class.
This new class must have a method c(self, x, s, params)
that takes as input the new points, training points and hyperparameters.
Notice that the input params
include lengthscales and process standard deviation, not noise standard deviation (even for noisy data).
The Kernel
class is imported using the following command:
>>> from UQpy.utilities.kernels.baseclass.Kernel import Kernel
The method should return covariance matrix, i.e. a 2-D array with first dimension being the number of new points and second dimension being the number of training points.
An example user-defined kernel is given below:
>>> class RBF(Kernel):
>>>
>>> def c(self, x, s, params):
>>> l, sigma = params[:-1], params[-1]
>>> stack = cdist(x/l, s/l, metric='euclidean')
>>> cx = sigma**2 * np.exp(-(stack**2)/2)
>>> return cx
Constraints
The GaussianProcessRegression
class offers a built-in constraints on the hyperparameters estimation while maximixing loglikelihood function.
NonNegative
The NonNegative
class is imported using the following command:
>>> from UQpy.surrogates.gaussian_process.constraints.NonNegative import NonNegative
The NonNegative constraints enforces non-negativity in the GP model, this is useful for treating many physical properties as output variable. This constraint improves prediction and error estimates, where model output is always greater than or equal to zero. This is ensured by solving the maximum likelihood problem with the following constraints.
where \(X_c\) is the set of constraint points defined by user, \(z\) is the number of standard deviation below mean, that is non-negative for constraint points. \(\epsilon\) is the error tolerance between output value and prediction at the sample points.
User-Defined Constraints
Adding a new kernel to the GaussianProcessRegression
class is straightforward. This is done by creating a new class
that extends the UQpy.surrogates.gaussian_process.kernels.baseclass.Constraints
abstract base class.
This new class must have two methods define_arguments(self, x_train, y_train, predict_function)
and constraints(theta, kwargs)
.
The UQpy.surrogates.gaussian_process.constraints.baseclass.Constraints
class is imported using the following command:
>>> from UQpy.surrogates.gaussian_process.constraints.baseclass.Constraints import ConstraintsGPR
- class ConstraintsGPR[source]
Abstract base class of all Constraints. Serves as a template for creating new Kriging constraints for log-likelihood function.
The first method takes the sample points, output values and prediction method as inputs. This method combines the input attributes
(defined while initiating the new class) and inputs of the define_arguments
method. This method returns a list containing a dictionary of all the arguments to the constraints function.
The second method constraints
is static method, which takes first input as the log-transformed hyperparameters (\(\theta\)) and second input is a dictionary containing all arguments as defined in the first method (define_arguments
).
User can unpacked the stored variables in the dictionary and return the constraints function. Notice that this is a static method as optimizers require constraints to be a callable.
The constraints on the hyperparameters are only compatible with FminCobyla
class as optimizer.
An example user-defined kernel is given below:
>>> class NonNegative(ConstraintsGPR):
>>> def __init__(self, constraint_points, observed_error=0.01, z_value=2):
>>> self.constraint_points = constraint_points
>>> self.observed_error = observed_error
>>> self.z_value = z_value
>>> self.kwargs = {}
>>> self.constraint_args = None
>>>
>>> def define_arguments(self, x_train, y_train, predict_function):
>>> self.kwargs['x_t'] = x_train
>>> self.kwargs['y_t'] = y_train
>>> self.kwargs['pred'] = predict_function
>>> self.kwargs['const_points'] = self.constraint_points
>>> self.kwargs['obs_err'] = self.observed_error
>>> self.kwargs['z_'] = self.z_value
>>> self.constraint_args = [self.kwargs]
>>> return self.constraints
>>>
>>> @staticmethod
>>> def constraints(theta_, kwargs):
>>> x_t, y_t, pred = kwargs['x_t'], kwargs['y_t'], kwargs['pred']
>>> const_points, obs_err, z_ = kwargs['const_points'], kwargs['obs_err'], kwargs['z_']
>>>
>>> tmp_predict, tmp_error = pred(const_points, True, hyperparameters=10**theta_)
>>> constraint1 = tmp_predict - z_ * tmp_error
>>>
>>> tmp_predict2 = pred(x_t, False, hyperparameters=10**theta_)
>>> constraint2 = obs_err - np.abs(y_t[:, 0] - tmp_predict2)
>>>
>>> constraints = np.concatenate((constraint1, constraint2), axis=None)
>>> return constraints
GaussianProcessRegression Class
The GaussianProcessRegression
class is imported using the following command:
>>> from UQpy.surrogates.gaussian_process.GaussianProcessRegression import GaussianProcessRegression
Methods
- class GaussianProcessRegression(kernel, hyperparameters, regression_model=None, optimizer=None, bounds=None, optimize_constraints=None, optimizations_number=1, normalize=False, noise=False, random_state=None)[source]
GaussianProcessRegressor an Gaussian process regression-based surrogate model to predict the model output at new sample points.
- Parameters:
kernel (
Kernel
) – kernel specifies and evaluates the kernel. Built-in options: RBFhyperparameters (
list
) – List or array of initial values for the kernel hyperparameters/scale parameters. This input attribute defines the dimension of input training point, thus its length/shape should be equal to the input dimension plus one (d+1), this list/array includes ‘d’ length scale and process standard deviation. In case of noisy observations/output, its length/shape should be equal to the input dimension plus two (d+2), this list/array includes ‘d’ lengthscales, process standard deviation and noise variance.regression_model – A class object, which computes the basis function at a sample point. If regression_model is None, this class will train GP with regression. Default: None
bounds – This input attribute defines the bounds of the loguniform distributions, which randomly generates new starting point for optimization problem to estimate maximum likelihood estimator. This should be a closed bound. Default: [10**-3, 10**3] for each hyperparameter and [10**-10, 10**-1] for variance in noise (if applicable).
optimizations_number (
int
) – Number of times MLE optimization problem is to be solved with a random starting point. Default: 1.normalize (
bool
) – Boolean flag used in case data normalization is required.optimizer – A class object of ‘MinimizeOptimizer’ or ‘FminCobyla’ from UQpy.utilities module. If optimizer is not defined, this class will not solve MLE problem and predictions will be based on hyperparameters provided as input. Default: None.
noise (
bool
) – Boolean flag used in case of noisy training data. Default: Falserandom_state (
Union
[None
,int
,RandomState
]) – Random seed used to initialize the pseudo-random number generator. If an integer is provided, this sets the seed for an object ofnumpy.random.RandomState
. Otherwise, the object itself can be passed directly.
- fit(samples, values, optimizations_number=None, hyperparameters=None)[source]
Fit the surrogate model using the training samples and the corresponding model values.
The user can run this method multiple time after initiating the
GaussianProcessRegressor
class object.This method updates the samples and parameters of the
GaussianProcessRegressor
object. This method uses hyperparameters from previous run as the starting point for MLE problem unless user provides a new starting point.- Parameters:
samples – ndarray containing the training points.
values – ndarray containing the model evaluations at the training points.
optimizations_number – number of optimization iterations
hyperparameters – List or array of initial values for the kernel hyperparameters/scale parameters.
The
fit()
method has no returns, although it creates the beta, err_var and C_inv attributes of theGaussianProcessRegressor
class.
- predict(points, return_std=False, hyperparameters=None)[source]
Predict the model response at new points.
This method evaluates the regression and correlation model at new sample points. Then, it predicts the function value and standard deviation.
- Parameters:
- Returns:
Predicted values at the new points, Standard deviation of predicted values at the new points
Attributes
- GaussianProcessRegression.beta
Regression coefficients.
- GaussianProcessRegression.err_var
Variance of the Gaussian random process.
- GaussianProcessRegression.C_inv
Inverse Cholesky decomposition of the correlation matrix.