Calculation of the PCE coefficients

Several methods exist for the calculation of the PCE coefficients. In UQpy, four non-intrusive methods can be used, namely Least Squares regression (LeastSquareRegression class), LASSO regression (LassoRegression class), Ridge regression (RidgeRegression class), and Least Angle Regression (LeastAngleRegression class).

Least Squares Regression

Least Squares regression is a method for estimating the parameters of a linear regression model. The goal is to minimize the sum of squared differences between the observed dependent variable and the predictions of the regression model. In other words, we seek the vector \(\beta\) that approximately solves the equation \(X \beta \approx y\). If the matrix \(X\) is square and non-singular, the solution is exact.

When the system cannot be solved exactly, because the number of equations \(n\) differs from the number of unknowns \(p\), we seek the solution for which the difference between the left-hand side and the right-hand side of the equation is smallest. Therefore, we are looking for the solution that satisfies the following

\[\hat{\beta} = \underset{\beta}{\arg\min} \| y - X \beta \|_{2}\]

where \(\| \cdot \|_{2}\) is the standard \(L^{2}\) norm in the \(n\)-dimensional Euclidean space \(\mathbb{R}^{n}\). The above function is known as the cost function of linear regression.

The equation may be under-, well-, or over-determined. In the context of Polynomial Chaos Expansion (PCE), the computed vector \(\hat{\beta}\) corresponds to the polynomial coefficients. This method is available through the LeastSquareRegression class.
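
To make the connection to PCE concrete, the following is a minimal NumPy sketch (independent of UQpy) that solves the least-squares problem above for an illustrative monomial design matrix; in an actual PCE, the columns of \(X\) would contain the orthonormal polynomial basis evaluated at the training points.

import numpy as np

# Illustrative data: 20 training points and noisy model evaluations
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=20)
y = 1.0 + 2.0 * x - 0.5 * x ** 2 + 0.01 * rng.standard_normal(20)

# Design matrix X: basis functions (here simple monomials) evaluated at the training points
X = np.column_stack([np.ones_like(x), x, x ** 2])

# beta_hat = argmin_beta ||y - X beta||_2, computed with NumPy's least-squares solver
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)  # approximately [1.0, 2.0, -0.5]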

LeastSquareRegression Class

The LeastSquareRegression class is imported using the following command:

>>> from UQpy.surrogates.polynomial_chaos.regressions.LeastSquareRegression import LeastSquareRegression
class LeastSquareRegression[source]
run(x, y, design_matrix)[source]

Least squares solution to compute the polynomial_chaos coefficients.

Parameters:
  • x (ndarray) – numpy.ndarray containing the training points (samples).

  • y (ndarray) – numpy.ndarray containing the model evaluations (labels) at the training points.

  • design_matrix (ndarray) – matrix containing the evaluation of the polynomials at the input points x.

Returns:

Returns the polynomial_chaos coefficients.
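
Based on the signature documented above, a usage sketch might look as follows; the monomial design matrix is purely illustrative (in practice it is produced by the polynomial basis of the PCE), and only the run arguments shown here are taken from the documentation.

import numpy as np
from UQpy.surrogates.polynomial_chaos.regressions.LeastSquareRegression import LeastSquareRegression

x = np.linspace(-1, 1, 20).reshape(-1, 1)                # training points (samples)
y = 1.0 + 2.0 * x - 0.5 * x ** 2                         # model evaluations at x
design_matrix = np.hstack([np.ones_like(x), x, x ** 2])  # illustrative basis evaluations at x

# Least-squares estimate of the PCE coefficients
coefficients = LeastSquareRegression().run(x=x, y=y, design_matrix=design_matrix)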

Lasso Regression

A drawback of using Least Squares regression to calculate the PCE coefficients is that this method considers all features (polynomials) to be equally relevant for the prediction. This often results in overfitting and complex models that do not generalize well to unseen data. For this reason, the Least Absolute Shrinkage and Selection Operator (LASSO) can be employed (via the LassoRegression class). This method introduces an \(L_{1}\) penalty term (which encourages sparsity) into the linear regression loss function as follows

\[\hat{\beta} = \underset{\beta}{\arg\min} \left\{ \frac{1}{N} \| y - X \beta \|_{2}^{2} + \lambda \| \beta \|_{1} \right\}\]

where \(\lambda\) is called the regularization strength.

The parameter \(\lambda\) controls the level of penalization. When it is close to zero, Lasso regression reduces to Least Squares regression, while in the extreme case where it tends to infinity, all coefficients shrink to zero.

The Lasso regression model is trained on the data by optimizing the coefficients with gradient descent. At each iteration, the gradient of the loss function with respect to the coefficients, \(\nabla_{\beta} Loss\), is computed and subtracted from the current estimate \(\beta^{i}\) as follows

\[\beta^{i+1} = \beta^{i} - \epsilon \nabla_{\beta} Loss^{i}\]

where \(i\) is the iteration step, and \(\epsilon\) is the learning rate (gradient descent step) with a value larger than zero.
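
As an illustration of this update rule (not the UQpy implementation itself), a schematic NumPy version of the LASSO gradient descent could read as follows, with the sign of \(\beta\) used as a subgradient of the non-differentiable \(L_{1}\) term:

import numpy as np

def lasso_gradient_descent(X, y, penalty=1.0, learning_rate=0.01, iterations=1000):
    """Schematic minimization of (1/N)||y - X beta||_2^2 + penalty * ||beta||_1."""
    N, p = X.shape
    beta = np.zeros(p)
    for _ in range(iterations):
        residual = y - X @ beta
        # Gradient of the squared-error term plus a subgradient of the L1 penalty
        grad = -(2.0 / N) * X.T @ residual + penalty * np.sign(beta)
        # beta^{i+1} = beta^{i} - epsilon * grad
        beta = beta - learning_rate * grad
    return beta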

Lasso Regression Class

The LassoRegression class is imported using the following command:

>>> from UQpy.surrogates.polynomial_chaos.regressions.LassoRegression import LassoRegression
class LassoRegression(learning_rate=0.01, iterations=1000, penalty=1)[source]

Class to calculate the polynomial_chaos coefficients with the Least Absolute Shrinkage and Selection Operator (LASSO) method.

Parameters:
  • learning_rate (float) – Size of steps for the gradient descent.

  • iterations (int) – Number of iterations of the optimization algorithm.

  • penalty (float) – Penalty parameter that controls the strength of the regularization. When it is close to zero, Lasso regression converges to linear regression, while as it goes to infinity, the polynomial_chaos coefficients converge to zero.

run(x, y, design_matrix)[source]

Implements the LASSO method to compute the polynomial_chaos coefficients.

Parameters:
  • x (ndarray) – numpy.ndarray containing the training points (samples).

  • y (ndarray) – numpy.ndarray containing the model evaluations (labels) at the training points.

  • design_matrix (ndarray) – matrix containing the evaluation of the polynomials at the input points x.

Returns:

Weights (polynomial_chaos coefficients) and Bias of the regressor
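
Following the constructor and run signatures documented above, a hedged usage sketch could look as follows; the design matrix is again illustrative and would normally come from the PCE polynomial basis.

import numpy as np
from UQpy.surrogates.polynomial_chaos.regressions.LassoRegression import LassoRegression

x = np.linspace(-1, 1, 20).reshape(-1, 1)                # training points (samples)
y = 1.0 + 2.0 * x - 0.5 * x ** 2                         # model evaluations at x
design_matrix = np.hstack([np.ones_like(x), x, x ** 2])  # illustrative basis evaluations at x

lasso = LassoRegression(learning_rate=0.01, iterations=1000, penalty=1)
result = lasso.run(x=x, y=y, design_matrix=design_matrix)  # weights (PCE coefficients) and bias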

Ridge Regression

Ridge regression (also known as \(L_{2}\) regularization) is another variation of the linear regression method and a special case of Tikhonov regularization. Similarly to Lasso regression, it introduces an additional penalty term; however, Ridge regression uses an \(L_{2}\) norm in the loss function as follows

\[\hat{\beta} = \underset{\beta}{\arg\min} \left\{ \frac{1}{N} \| y - X \beta \|_{2}^{2} + \lambda \| \beta \|_{2}^{2} \right\}\]

where \(\lambda\) is called the regularization strength.

Due to the penalization of terms, Ridge regression constructs models that are less prone to overfitting. The level of penalization is again controlled by the hyperparameter \(\lambda\), and the coefficients are optimized with gradient descent. The Ridge regression method is available through the RidgeRegression class.
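
For reference, the \(L_{2}\)-penalized problem also admits a closed-form (Tikhonov) solution, sketched below in NumPy; the RidgeRegression class itself optimizes the coefficients by gradient descent as described above, so this is only an illustration of the regularized least-squares problem.

import numpy as np

def ridge_closed_form(X, y, penalty=1.0):
    """Closed-form minimizer of ||y - X beta||_2^2 + penalty * ||beta||_2^2."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + penalty * np.eye(p), X.T @ y)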

RidgeRegression Class

The RidgeRegression class is imported using the following command:

>>> from UQpy.surrogates.polynomial_chaos.regressions.RidgeRegression import RidgeRegression
class RidgeRegression(learning_rate=0.01, iterations=1000, penalty=1)[source]

Class to calculate the polynomial_chaos coefficients with the Ridge regression method.

Parameters:
  • learning_rate (float) – Size of steps for the gradient descent.

  • iterations (int) – Number of iterations of the optimization algorithm.

  • penalty (float) – Penalty parameter that controls the strength of the regularization. When it is close to zero, Ridge regression converges to linear regression, while as it goes to infinity, the polynomial_chaos coefficients converge to zero.

run(x, y, design_matrix)[source]

Implements the Ridge regression method to compute the polynomial_chaos coefficients.

Parameters:
  • x (ndarray) – numpy.ndarray containing the training points (samples).

  • y (ndarray) – numpy.ndarray containing the model evaluations (labels) at the training points.

  • design_matrix (ndarray) – matrix containing the evaluation of the polynomials at the input points x.

Returns:

Weights (polynomial_chaos coefficients) and Bias of the regressor

LAR Regression

Least Angle Regression [60] (known as LAR or LARS) is related to forward stepwise model-selection algorithms and provides an efficient way of fitting a penalized model, similarly to LASSO. However, LAR does not need a hyperparameter \(\lambda\), and it can therefore be used for automatic detection of the best linear regression model for a given experimental design. In the first step of the algorithm, the predictor most correlated with the quantity of interest is identified. The algorithm then takes the largest possible step in that direction until some other predictor is equally correlated with the residual, at which point LAR continues in the direction equiangular between the two predictors.

The number of LAR steps is equal to the number of unknowns, since one predictor is added to the active set at each step. This characteristic can be exploited in an iterative algorithm that selects the most accurate surrogate model from a large number of candidates [59] obtained by Least Squares, using the model_selection function of the LeastAngleRegression class.

LeastAngleRegression Class

The LeastAngleRegression class is imported using the following command:

>>> from UQpy.surrogates.polynomial_chaos.regressions.LeastAngleRegression import LeastAngleRegression
class LeastAngleRegression(fit_intercept=False, verbose=False, n_nonzero_coefs=1000, normalize=False)[source]

Class to select the best model approximation and calculate the polynomial_chaos coefficients with the Least Angle Regression method combined with ordinary least squares.

Parameters:
  • n_nonzero_coefs (int) – Maximum number of non-zero coefficients.

  • fit_intercept (bool) – Whether to calculate the intercept for this model. Recommended to be False for PCE, since the intercept is already included in the basis functions.

  • verbose (bool) – Sets the verbosity amount.

run(x, y, design_matrix)[source]

Implements the LAR method to compute the polynomial_chaos coefficients. Recommended only for the model_selection algorithm.

Parameters:
  • x (ndarray) – numpy.ndarray containing the training points (samples).

  • y (ndarray) – numpy.ndarray containing the model evaluations (labels) at the training points.

  • design_matrix (ndarray) – matrix containing the evaluation of the polynomials at the input points x.

Returns:

Beta (polynomial_chaos coefficients)

static model_selection(pce_object, target_error=1, check_overfitting=True)[source]

LARS model selection algorithm for a given target error of the approximation, measured by cross-validation (leave-one-out error; 1 is a perfect approximation). Optionally checks for overfitting using an empirical rule: if three steps in a row have decreasing accuracy, the algorithm stops.

Parameters:
  • pce_object (PolynomialChaosExpansion) – existing target PCE for model_selection

  • target_error – Target error of the approximation (stopping criterion).

  • check_overfitting – Whether to check for over-fitting using the empirical rule.

Returns:

A copy of the input PolynomialChaosExpansion containing the best possible model for the given data, as identified by LARS.
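
As a closing usage sketch (with assumptions), the model-selection routine would be called on an existing PolynomialChaosExpansion object; the construction and fitting of that pce object is not shown here and follows the general PCE documentation.

from UQpy.surrogates.polynomial_chaos.regressions.LeastAngleRegression import LeastAngleRegression

# `pce` is assumed to be an already constructed and fitted PolynomialChaosExpansion
best_pce = LeastAngleRegression.model_selection(pce_object=pce, target_error=1, check_overfitting=True)

# `best_pce` is a copy of `pce` restricted to the basis selected by LARS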