InfoModelSelection
The InformationModelSelection class employs information-theoretic criteria for model selection. Several simple information-theoretic criteria can be used to compute a model's quality and perform model selection [11]. UQpy implements three criteria:
Bayesian information criterion, \(BIC = \ln(n) k - 2 \ln(\hat{L})\)
The BIC class is imported using the following command:
>>> from UQpy.inference.information_criteria.BIC import BIC
Akaike information criterion, \(AIC = 2 k - 2 \ln (\hat{L})\)
The AIC class is imported using the following command:
>>> from UQpy.inference.information_criteria.AIC import AIC
Corrected Akaike information criterion (AICc), for small data sets, \(AICc = AIC + \frac{2k(k+1)}{n-k-1}\)
The AICc class is imported using the following command:
>>> from UQpy.inference.information_criteria.AICc import AICc
where \(k\) is the number of parameters characterizing the model, \(\hat{L}\) is the maximum value of the likelihood function, and \(n\) is the number of data points. The best model is the one that minimizes the criterion, which is a combination of a model fit term (find the model that minimizes the negative log likelihood) and a penalty term that increases as the number of model parameters (model complexity) increases.
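Using these definitions, the three formulas can be sketched as plain functions. This is an illustrative sketch only, not the UQpy API; the helper names `bic`, `aic`, and `aicc` and the argument `log_like_max` (\(\ln \hat{L}\)) are assumptions for this example.

```python
import numpy as np

# Illustrative helpers following the formulas above:
# k parameters, n data points, log_like_max = ln(L_hat).
def bic(log_like_max, k, n):
    return np.log(n) * k - 2.0 * log_like_max

def aic(log_like_max, k):
    return 2.0 * k - 2.0 * log_like_max

def aicc(log_like_max, k, n):
    # AIC plus the small-sample correction term
    return aic(log_like_max, k) + 2.0 * k * (k + 1) / (n - k - 1)

print(aic(-100.0, 2))        # 204.0
print(bic(-100.0, 2, 100))   # ~209.21
print(aicc(-100.0, 2, 100))  # ~204.12
```

Note that for fixed \(k\), AICc approaches AIC as \(n\) grows, so the correction matters mainly for small data sets.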
A probability can be defined for each model as \(P(m_{i}) \propto \exp\left( -\frac{\text{criterion}}{2} \right)\).
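As a quick numerical check of this proportionality, the probabilities can be computed by exponentiating and normalizing; the criterion values below are illustrative, and the minimum is subtracted first purely for numerical stability (the shift cancels in the normalization).

```python
import numpy as np

criterion_values = np.array([204.0, 210.0, 206.0])  # e.g. AIC for three models
delta = criterion_values - criterion_values.min()   # shift; cancels on normalization
weights = np.exp(-delta / 2.0)
probabilities = weights / weights.sum()
# The model with the smallest criterion value receives the largest probability.
```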
Note that none of the above information-theoretic criteria requires any input parameters at initialization, thus their instances can be created as follows:
>>> criterion = AIC()
All of these criteria are child classes of the InformationCriterion abstract baseclass. The user can create new types of criteria by extending InformationCriterion and providing an alternative implementation of the minimize_criterion() method.
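The extension pattern can be sketched without UQpy as follows. Here `Criterion` and its `evaluate` method are simplified stand-ins for InformationCriterion and its abstract method (the real signature also takes the data and a parameter estimator), and the Hannan-Quinn criterion is just one possible new criterion a user might add.

```python
from abc import ABC, abstractmethod
import numpy as np

class Criterion(ABC):
    """Simplified stand-in for the InformationCriterion baseclass."""
    @abstractmethod
    def evaluate(self, log_like_max, k, n):
        ...

class HannanQuinn(Criterion):
    # HQC = 2 k ln(ln n) - 2 ln(L_hat): same fit term as AIC/BIC,
    # with a penalty that grows like ln(ln n).
    def evaluate(self, log_like_max, k, n):
        return 2.0 * k * np.log(np.log(n)) - 2.0 * log_like_max

print(HannanQuinn().evaluate(-100.0, 2, 100))  # between BIC and AIC penalties here
```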
The InformationCriterion class is imported using the following command:
>>> from UQpy.inference.information_criteria.baseclass.InformationCriterion import InformationCriterion
- class InformationCriterion[source]
- abstract minimize_criterion(data, parameter_estimator, return_penalty=False)[source]
Function that must be implemented by the user in order to create a new concrete implementation of the InformationCriterion baseclass.
InfoModelSelection Class
The InformationModelSelection class is imported using the following command:
>>> from UQpy.inference.InformationModelSelection import InformationModelSelection
Methods
- class InformationModelSelection(parameter_estimators, criterion=AIC(), n_optimizations=None, initial_parameters=None)[source]
Perform model selection using information-theoretic criteria.
Supported criteria are BIC, AIC (default), and AICc. This class leverages the MLE class for maximum likelihood estimation, thus inputs to MLE can also be provided to InformationModelSelection, as lists of length equal to the number of models.
- Parameters:
  - parameter_estimators (list[MLE]) – A list containing a maximum-likelihood estimator (MLE) for each of the models to be compared.
  - criterion (InformationCriterion) – Criterion to be used (AIC, BIC, AICc). Default is AIC.
  - initial_parameters (Optional[list[ndarray]]) – Initial guess(es) for optimization, numpy.ndarray of shape (nstarts, n_parameters) or (n_parameters, ), where nstarts is the number of times the optimizer will be called. Alternatively, the user can provide the input n_optimizations to randomly sample initial guess(es). The identified MLE is the one that yields the maximum log likelihood over all calls of the optimizer.
- run(n_optimizations, initial_parameters=None)[source]
Run the model selection procedure, i.e. compute criterion value for all models.
This function calls the run() method of the MLE object for each model to compute the maximum log-likelihood, then computes the criterion value and probability for each model. If data are given when creating the MLE object, this method is called automatically when the object is created.
- Parameters:
  - n_optimizations (list[int]) – Number of times the optimization is run, starting at random initial guesses. It is only used if initial_parameters is not provided. Default is \(1\). The random initial guesses are sampled uniformly between \(0\) and \(1\), or uniformly between user-defined bounds if an input bounds is provided as a keyword argument to the optimizer input parameter.
  - initial_parameters (Optional[list[ndarray]]) – Initial guess(es) for optimization, numpy.ndarray of shape (nstarts, n_parameters) or (n_parameters, ), where nstarts is the number of times the optimizer will be called. Alternatively, the user can provide the input n_optimizations to randomly sample initial guess(es). The identified MLE is the one that yields the maximum log likelihood over all calls of the optimizer.
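The multi-start logic behind n_optimizations can be sketched generically. This is a toy Gaussian-mean fit, not the UQpy implementation; `scipy.optimize.minimize` is used here as a stand-in for the optimizer, and the initial guesses are drawn uniformly in \([0, 1]\) as described above.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
data = rng.normal(loc=0.7, scale=1.0, size=200)

def neg_log_likelihood(theta):
    # Gaussian with unit variance: negative log-likelihood up to constants
    return 0.5 * np.sum((data - theta[0]) ** 2)

n_optimizations = 5
best = None
for _ in range(n_optimizations):
    x0 = rng.uniform(0.0, 1.0, size=1)      # random initial guess in [0, 1]
    result = minimize(neg_log_likelihood, x0)
    if best is None or result.fun < best.fun:
        best = result                        # keep the run with the highest log-likelihood

# best.x[0] is close to the sample mean, the closed-form MLE for this toy problem
```

For this convex toy problem every start converges to the same optimum; multiple starts matter when the log-likelihood surface is multimodal.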
- sort_models()[source]
Sort models in descending order of model probability (increasing order of criterion value).
This function sorts, in place, the attribute lists candidate_models, ml_estimators, criterion_values, penalty_terms, and probabilities so that they run from the most probable to the least probable model. It is a stand-alone function provided to help the user easily see which model is the best. No inputs/outputs.
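The in-place co-sorting of parallel lists described above can be sketched with plain Python lists; the names and values below are illustrative only, not the actual UQpy attributes.

```python
# Parallel lists for three hypothetical models
model_names = ["model_a", "model_b", "model_c"]
criterion_values = [206.0, 204.0, 210.0]
probabilities = [0.24, 0.65, 0.11]

# Indices ordered by descending probability (equivalently, ascending criterion)
order = sorted(range(len(probabilities)), key=probabilities.__getitem__, reverse=True)

# Slice assignment mutates the existing lists, keeping the sort in place
model_names[:] = [model_names[i] for i in order]
criterion_values[:] = [criterion_values[i] for i in order]
probabilities[:] = [probabilities[i] for i in order]

print(model_names)  # ['model_b', 'model_a', 'model_c']
```

Computing one index permutation and applying it to every list keeps the parallel lists aligned, which is why all attributes must be reordered together rather than sorted individually.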
Attributes
- InformationModelSelection.parameter_estimators: list[MLE]
MLE results for each model (contains, e.g., fitted parameters)