InfoModelSelection
The InformationModelSelection class employs information-theoretic criteria for model selection. Several simple
information-theoretic criteria can be used to quantify a model's quality and perform model selection [11]. UQpy implements three criteria:
Bayesian information criterion, \(BIC = \ln(n) k - 2 \ln(\hat{L})\)
The BIC class is imported using the following command:
>>> from UQpy.inference.information_criteria.BIC import BIC
Akaike information criterion, \(AIC = 2 k - 2 \ln (\hat{L})\)
The AIC class is imported using the following command:
>>> from UQpy.inference.information_criteria.AIC import AIC
Corrected Akaike information criterion (AICc) for small data sets, \(AICc = AIC + \frac{2k(k+1)}{n-k-1}\)
The AICc class is imported using the following command:
>>> from UQpy.inference.information_criteria.AICc import AICc
where \(k\) is the number of parameters characterizing the model, \(\hat{L}\) is the maximum value of the likelihood function, and \(n\) is the number of data points. The best model is the one that minimizes the criterion, which is a combination of a model fit term (find the model that minimizes the negative log likelihood) and a penalty term that increases as the number of model parameters (model complexity) increases.
A probability can be defined for each model as \(P(m_{i}) \propto \exp\left( -\frac{\text{criterion}}{2} \right)\).
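The formulas above can be checked by hand with a few lines of plain Python. The following is a minimal sketch, not UQpy's implementation: the helper names `information_criteria` and `model_probabilities` and the two sets of example numbers are hypothetical. The probabilities are assumed to be normalized so that they sum to one over the candidate models.

```python
import math

def information_criteria(max_log_like, k, n):
    """Evaluate AIC, BIC and AICc from ln(L_hat), k parameters and n data points."""
    aic = 2 * k - 2 * max_log_like
    bic = math.log(n) * k - 2 * max_log_like
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)
    return {"AIC": aic, "BIC": bic, "AICc": aicc}

def model_probabilities(criterion_values):
    """P(m_i) proportional to exp(-criterion / 2), normalized over all models."""
    # Subtracting the smallest value avoids underflow and cancels in the ratio.
    best = min(criterion_values)
    weights = [math.exp(-(c - best) / 2) for c in criterion_values]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical fits of two models to n = 50 data points.
model_a = information_criteria(max_log_like=-70.0, k=2, n=50)
model_b = information_criteria(max_log_like=-69.5, k=3, n=50)

# AIC values: 144.0 for model A and 145.0 for model B.
probabilities = model_probabilities([model_a["AIC"], model_b["AIC"]])
```

Model B fits slightly better, but its extra parameter costs more than the fit improves, so model A has the smaller AIC and the larger probability.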
Note that none of the above information-theoretic criteria requires any input parameters at initialization, so an instance can be created as follows:
>>> criterion = AIC()
All of these criteria are child classes of the InformationCriterion abstract baseclass. The user can create
new types of criteria by extending InformationCriterion and providing an alternative implementation of the
minimize_criterion() method.
The InformationCriterion class is imported using the following command:
>>> from UQpy.inference.information_criteria.baseclass.InformationCriterion import InformationCriterion
- class InformationCriterion[source]
- abstract minimize_criterion(data, parameter_estimator, return_penalty=False)[source]
Function that must be implemented by the user in order to create a new concrete implementation of the
InformationCriterion baseclass.
- Return type:
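As a rough illustration of the extension pattern, the sketch below implements a Hannan-Quinn criterion against a local stand-in base class, so that it runs without UQpy installed. Everything except the `minimize_criterion(data, parameter_estimator, return_penalty=False)` signature is an assumption: `CriterionBase`, `StubEstimator`, its attributes `max_log_like` and `n_parameters`, and the convention of returning a `(value, penalty)` pair when `return_penalty` is true are all hypothetical and should be checked against the actual InformationCriterion and MLE classes.

```python
import math
from abc import ABC, abstractmethod

class CriterionBase(ABC):
    """Local stand-in for InformationCriterion, mirroring the documented signature."""

    @abstractmethod
    def minimize_criterion(self, data, parameter_estimator, return_penalty=False):
        """Return the criterion value for one fitted model."""

class StubEstimator:
    """Hypothetical fitted estimator; the attribute names are assumptions."""

    def __init__(self, max_log_like, n_parameters):
        self.max_log_like = max_log_like
        self.n_parameters = n_parameters

class HannanQuinn(CriterionBase):
    """Hannan-Quinn criterion: HQC = 2 k ln(ln(n)) - 2 ln(L_hat)."""

    def minimize_criterion(self, data, parameter_estimator, return_penalty=False):
        n = len(data)
        k = parameter_estimator.n_parameters
        penalty = 2 * k * math.log(math.log(n))
        value = penalty - 2 * parameter_estimator.max_log_like
        # Assumed convention: also hand back the penalty term on request.
        return (value, penalty) if return_penalty else value

data = list(range(50))  # only the number of data points matters here
estimator = StubEstimator(max_log_like=-70.0, n_parameters=2)
hqc, penalty = HannanQuinn().minimize_criterion(data, estimator, return_penalty=True)
```

With the real library, the subclass would inherit from InformationCriterion instead of the stand-in and read the maximum log-likelihood from the MLE object passed in as `parameter_estimator`.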
InfoModelSelection Class
The InformationModelSelection class is imported using the following command:
>>> from UQpy.inference.InformationModelSelection import InformationModelSelection
Methods
- class InformationModelSelection(parameter_estimators, criterion=<UQpy.inference.information_criteria.AIC.AIC object>, n_optimizations=None, initial_parameters=None)[source]
Perform model selection using information theoretic criteria.
Supported criteria are BIC, AIC (default), AICc. This class leverages the MLE class for maximum likelihood estimation, thus inputs to MLE can also be provided to InformationModelSelection, as lists of length equal to the number of models.
- Parameters:
parameter_estimators (list[MLE]) – A list containing a maximum-likelihood estimator (MLE) for each one of the models to be compared.
criterion (InformationCriterion) – Criterion to be used (AIC, BIC, AICc). Default is AIC.
initial_parameters (Optional[list[ndarray]]) – Initial guess(es) for optimization, numpy.ndarray of shape (nstarts, n_parameters) or (n_parameters, ), where nstarts is the number of times the optimizer will be called. Alternatively, the user can provide input n_optimizations to randomly sample initial guess(es). The identified MLE is the one that yields the maximum log likelihood over all calls of the optimizer.
- run(n_optimizations, initial_parameters=None)[source]
Run the model selection procedure, i.e. compute criterion value for all models.
This function calls the run() method of the MLE object for each model to compute the maximum log-likelihood, then computes the criterion value and probability for each model. If data are given when creating the MLE object, this method is called automatically when the object is created.
- Parameters:
n_optimizations (list[int]) – Number of iterations that the optimization is run, starting at random initial guesses. It is only used if initial_parameters is not provided. Default is \(1\). The random initial guesses are sampled uniformly between \(0\) and \(1\), or uniformly between user-defined bounds if an input bounds is provided as a keyword argument to the optimizer input parameter.
initial_parameters (Optional[list[ndarray]]) – Initial guess(es) for optimization, numpy.ndarray of shape (nstarts, n_parameters) or (n_parameters, ), where nstarts is the number of times the optimizer will be called. Alternatively, the user can provide input n_optimizations to randomly sample initial guess(es). The identified MLE is the one that yields the maximum log likelihood over all calls of the optimizer.
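The interplay of n_optimizations and initial_parameters can be pictured with a small multi-start sketch. This is not the code that UQpy runs: the one-parameter exponential model, the crude `local_search` routine standing in for a real optimizer, and the function name `multi_start_mle` are all hypothetical. It only shows the two ideas described above: random starts drawn uniformly between 0 and 1 when no initial guesses are given, and keeping the result with the largest log-likelihood over all optimizer calls.

```python
import math
import random

def neg_log_like(rate, data):
    """Negative log-likelihood of an exponential model with the given rate."""
    if rate <= 0:
        return math.inf
    return -(len(data) * math.log(rate) - rate * sum(data))

def local_search(objective, start, step=0.5, tol=1e-6):
    """Crude 1-D pattern search, used here in place of a real optimizer."""
    x, fx = start, objective(start)
    while step > tol:
        moved = False
        for candidate in (x - step, x + step):
            fc = objective(candidate)
            if fc < fx:
                x, fx, moved = candidate, fc, True
                break
        if not moved:
            step /= 2  # no improvement at this scale, so refine the step
    return x, fx

def multi_start_mle(data, n_optimizations=1, initial_parameters=None, seed=0):
    """Run the local search from several starts and keep the best result."""
    rng = random.Random(seed)
    if initial_parameters is None:
        # Random starts, uniform on [0, 1], used only when no guesses are given.
        starts = [rng.uniform(0.0, 1.0) for _ in range(n_optimizations)]
    else:
        starts = list(initial_parameters)
    results = [local_search(lambda r: neg_log_like(r, data), s) for s in starts]
    best_rate, best_nll = min(results, key=lambda res: res[1])
    return best_rate, -best_nll

data = [0.5, 1.5, 1.0, 2.0, 1.0]  # sample mean 1.2, so the exact MLE rate is 1/1.2
rate_hat, max_log_like = multi_start_mle(data, n_optimizations=3)
```

For a log-likelihood with a single maximum, as here, every start reaches the same estimate. Multiple starts matter when the likelihood has several local maxima.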
- sort_models()[source]
Sort models in descending order of model probability (increasing order of criterion value).
This function sorts, in place, the attribute lists candidate_models, ml_estimators, criterion_values, penalty_terms and probabilities so that they are sorted from most probable to least probable model. It is a stand-alone function that is provided to help the user to easily visualize which model is the best.
No inputs/outputs.
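The sorting step can be sketched in plain Python as follows. The lists reuse the attribute names from the description above, but they are ordinary local lists filled with hypothetical values, and the approach (one permutation computed from the criterion values and applied to every list) is an assumption about how such a sort can be done, not UQpy's source code.

```python
# Hypothetical results for three candidate models, in their original order.
candidate_models = ["model_B", "model_A", "model_C"]
criterion_values = [145.0, 144.0, 150.0]
probabilities = [0.37, 0.60, 0.03]

# Indices that put the criterion values in increasing order.
order = sorted(range(len(criterion_values)), key=criterion_values.__getitem__)

# Apply the same permutation to every list; slice assignment keeps it in place.
for attribute in (candidate_models, criterion_values, probabilities):
    attribute[:] = [attribute[i] for i in order]
```

After the loop, the first entry of each list belongs to the model with the smallest criterion value, which is also the most probable one.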
Attributes
- InformationModelSelection.parameter_estimators: list
MLE results for each model (contains e.g. fitted parameters)