Chatterjee indices
The Chatterjee index measures the strength of the relationship between \(X\) and \(Y\) using rank statistics [47].
Consider \(n\) samples of the random variables \(X\) and \(Y\), ordered as \((X_{(1)}, Y_{(1)}), \ldots, (X_{(n)}, Y_{(n)})\) such that \(X_{(1)} \leq \cdots \leq X_{(n)}\). Here, \(X\) can be one of the inputs of a model and \(Y\) the model response. If the \(X_{i}\)'s have no ties, this ordering is unique (ties are also handled in the implementation, see [47]). Let \(r_{i}\) be the rank of \(Y_{(i)}\), that is, the number of \(j\) such that \(Y_{(j)} \leq Y_{(i)}\). The Chatterjee index \(\xi_{n}(X, Y)\) is defined as:
\begin{equation} \xi_{n}(X, Y):=1-\frac{3 \sum_{i=1}^{n-1}\left|r_{i+1}-r_{i}\right|}{n^{2}-1} \end{equation}
The Chatterjee index converges to the Cramér-von Mises index as \(n \rightarrow \infty\), and it is faster to estimate than the Cramér-von Mises index computed with the pick-and-freeze approach.
Furthermore, the Sobol indices can be efficiently estimated by leveraging the same rank statistics, which has the advantage that any sample can be used and no specific pick and freeze scheme is required.
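The rank-based definition can be illustrated with a few lines of NumPy. The following is a minimal sketch, not the UQpy implementation: it assumes there are no ties in \(X\) and uses the no-ties formula \(\xi_{n} = 1 - 3 \sum_{i=1}^{n-1} |r_{i+1} - r_{i}| / (n^{2} - 1)\) from [47]. The function name chatterjee_xi is hypothetical.

```python
import numpy as np


def chatterjee_xi(x, y):
    """Chatterjee index for 1-D samples x and y (sketch; assumes no ties in x)."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    n = x.size

    # Reorder the pairs so that X_(1) <= ... <= X_(n).
    order = np.argsort(x, kind="stable")
    y_sorted = y[order]

    # r_i = number of j such that Y_(j) <= Y_(i).
    r = np.searchsorted(np.sort(y_sorted), y_sorted, side="right")

    # xi_n = 1 - 3 * sum_i |r_{i+1} - r_i| / (n^2 - 1)
    return 1.0 - 3.0 * np.abs(np.diff(r)).sum() / (n**2 - 1)


# Illustrative data (assumed, not from the documentation).
rng = np.random.default_rng(0)
u = rng.uniform(size=2000)
v = rng.uniform(size=2000)

xi_functional = chatterjee_xi(u, np.cos(6.0 * u))  # Y is a noiseless function of X
xi_independent = chatterjee_xi(u, v)               # Y is independent of X
```

With a noiseless functional relationship the index should be close to 1 even though the relationship is not monotonic, while for independent samples it should be close to 0.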
Chatterjee Class
The ChatterjeeSensitivity class is imported using the following command:
>>> from UQpy.sensitivity.ChatterjeeSensitivity import ChatterjeeSensitivity
Methods
- class ChatterjeeSensitivity(runmodel_object, dist_object, random_state=None)[source]
Compute sensitivity indices using the Chatterjee correlation coefficient.
Using the same model evaluations, we can also estimate the Sobol indices.
- Parameters:
runmodel_object – The computational model. It should be of type RunModel. The output QoI can be a scalar or a vector of length ny; in the latter case, the sensitivity indices of all ny outputs are computed independently.
dist_object – List of Distribution objects corresponding to each random variable, or a JointIndependent object (multivariate RV with independent marginals).
random_state – Random seed used to initialize the pseudo-random number generator. Default is None.
Methods:
- run(n_samples=1000, estimate_sobol_indices=False, n_bootstrap_samples=None, confidence_level=0.95)[source]
Compute the sensitivity indices using the Chatterjee method. Calling the run method executes n_samples simulations using RunModel. To compute sensitivity indices from pre-computed inputs and outputs, use the static methods described below.
- Parameters:
n_samples (int) – Number of samples used to compute the Chatterjee indices. Default is 1,000.
estimate_sobol_indices (bool) – If True, the first-order Sobol indices are also estimated from the same samples using rank statistics. Default is False.
n_bootstrap_samples (Optional[int]) – Number of bootstrap samples used to estimate the Sobol indices. Default is None.
confidence_level (float) – Confidence level used to compute the confidence intervals of the Chatterjee indices. Default is 0.95.
- static compute_chatterjee_indices(X, Y, seed=None)[source]
Compute the Chatterjee sensitivity indices between the input random vectors \(X=\left[ X_{1}, X_{2}, \ldots, X_{d} \right]\) and the output random vector \(Y\).
- Parameters:
X (ndarray) – Input random vectors, numpy.ndarray of shape (n_samples, n_variables)
Y (ndarray) – Output random vector, numpy.ndarray of shape (n_samples, 1)
seed (Union[None, int, RandomState]) – Seed for the random number generator.
- Returns:
Chatterjee sensitivity indices, numpy.ndarray of shape (n_variables, 1)
- static rank_analog_to_pickfreeze(X, j)[source]
Compute \(N(j)\) for a given sample index \(j \in \{1, \ldots, n\}\), as in Eq. (8) of [48], where \(n\) is the size of \(X\).
\begin{equation} N(j):= \begin{cases} \pi^{-1}(\pi(j)+1) &\text { if } \pi(j)+1 \leqslant n \\ \pi^{-1}(1) &\text { if } \pi(j)=n \end{cases} \end{equation}
where \(\pi(j) := \mathrm{rank}(x_j)\).
- Parameters:
X (ndarray) – Input random vector, numpy.ndarray of shape (n_samples, 1)
j (Integral) – Index of the sample \(j \in \{1, \ldots, n\}\)
- Returns:
\(N(j)\), int
- static rank_analog_to_pickfreeze_vec(X)[source]
Compute \(N(j)\) for each \(j \in \{1, \ldots, n\}\) in a vectorized manner, where \(n\) is the size of \(X\).
This method is significantly faster than the looping version rank_analog_to_pickfreeze, but it is also more complicated.
\begin{equation} N(j):= \begin{cases} \pi^{-1}(\pi(j)+1) &\text { if } \pi(j)+1 \leqslant n \\ \pi^{-1}(1) &\text { if } \pi(j)=n \end{cases} \end{equation}
where \(\pi(j) := \mathrm{rank}(x_j)\).
Key idea: \(\pi^{-1}\) is rank_X.argsort().
Example: X = [22, 74, 44, 11, 1]
N_J = [3, 5, 2, 1, 4] (1-based indexing)
N_J = [2, 4, 1, 0, 3] (0-based indexing)
- Parameters:
X (ndarray) – Input random vector, numpy.ndarray of shape (n_samples, 1)
- Returns:
\(N(j)\), numpy.ndarray of shape (n_samples, 1)
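The definition of \(N(j)\) and the argsort idea can be sketched as follows. This is an illustrative reimplementation, not the UQpy source: the function names next_in_rank_loop and next_in_rank_vec are hypothetical, indices are 0-based, and the input is assumed to have no ties. The loop version follows the definition directly; the vectorized version uses the fact that argsort of the sample gives \(\pi^{-1}\).

```python
import numpy as np


def next_in_rank_loop(x, j):
    """N(j) for a single 0-based index j, straight from the definition."""
    x = np.asarray(x).ravel()
    ranks = np.argsort(np.argsort(x))   # 0-based rank of each sample, i.e. pi(j) - 1
    target = (ranks[j] + 1) % x.size    # next rank; the largest wraps to the smallest
    return int(np.flatnonzero(ranks == target)[0])


def next_in_rank_vec(x):
    """N(j) for all j at once (0-based indices)."""
    x = np.asarray(x).ravel()
    order = np.argsort(x)               # order[k] = index of the sample with rank k (pi^{-1})
    n_j = np.empty_like(order)
    # The sample with rank k points to the sample with rank k + 1 (cyclically).
    n_j[order] = np.roll(order, -1)
    return n_j


x = np.array([22, 74, 44, 11, 1])
n_j = next_in_rank_vec(x)               # array([2, 4, 1, 0, 3]), matching the example above
```

Given model outputs Y stored in the same order as X, the term \(Y_{N(j)}\) is then simply Y[n_j].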
- static compute_Sobol_indices(A_model_evals, C_i_model_evals)[source]
A method to estimate the first order Sobol indices using the Chatterjee method.
\begin{equation} \xi_{n}^{\mathrm{Sobol}}\left(X_{1}, Y\right):= \frac{\frac{1}{n} \sum_{j=1}^{n} Y_{j} Y_{N(j)}-\left(\frac{1}{n} \sum_{j=1}^{n} Y_{j}\right)^{2}} {\frac{1}{n} \sum_{j=1}^{n}\left(Y_{j}\right)^{2}-\left(\frac{1}{n} \sum_{j=1}^{n} Y_{j}\right)^{2}} \end{equation}
where the term \(Y_{N(j)}\) is computed using the method rank_analog_to_pickfreeze_vec.
- Parameters:
A_model_evals (ndarray) – Model evaluations, numpy.ndarray of shape (n_samples, 1)
C_i_model_evals (ndarray) – Model evaluations, numpy.ndarray of shape (n_samples, n_variables)
- Returns:
First order Sobol indices, numpy.ndarray of shape (n_variables, 1)
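The rank-based Sobol estimator \(\xi_{n}^{\mathrm{Sobol}}\) can be sketched for a single input as shown below. This is a hedged illustration under stated assumptions, not the UQpy implementation: the function name sobol_rank_estimate and its (x, y) signature are hypothetical, the input is assumed to have no ties, and the test model \(Y = X_{1} + 0.5 X_{2}\) with independent uniform inputs is chosen only because its first-order indices are known analytically.

```python
import numpy as np


def sobol_rank_estimate(x, y):
    """Rank-based estimate of the first-order Sobol index of y with respect to x."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()

    # N(j): index of the sample whose x-rank follows that of sample j (cyclic).
    order = np.argsort(x)
    n_j = np.empty_like(order)
    n_j[order] = np.roll(order, -1)

    mean_sq = y.mean() ** 2
    numerator = np.mean(y * y[n_j]) - mean_sq   # (1/n) sum Y_j Y_N(j) - (mean Y)^2
    denominator = np.mean(y**2) - mean_sq       # (1/n) sum Y_j^2 - (mean Y)^2
    return numerator / denominator


# Assumed toy model: Y = X1 + 0.5 * X2 with X1, X2 ~ U(0, 1) independent.
rng = np.random.default_rng(42)
x1 = rng.uniform(size=5000)
x2 = rng.uniform(size=5000)
y = x1 + 0.5 * x2

s1 = sobol_rank_estimate(x1, y)
s2 = sobol_rank_estimate(x2, y)
```

For this toy model the exact first-order indices are 0.8 for \(X_{1}\) and 0.2 for \(X_{2}\), so the estimates should approach these values as the sample size grows. Note that a single sample of inputs and outputs is reused for both inputs; no pick-and-freeze design is needed.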
Attributes
- ChatterjeeSensitivity.first_order_chatterjee_indices
Chatterjee sensitivity indices (first order), numpy.ndarray of shape (n_variables, 1)
- ChatterjeeSensitivity.first_order_sobol_indices
Sobol indices computed using the rank statistics, numpy.ndarray of shape (n_variables, 1)
- ChatterjeeSensitivity.confidence_interval_chatterjee
Confidence intervals for the Chatterjee sensitivity indices, numpy.ndarray of shape (n_variables, 2)