Chatterjee indices

The Chatterjee index measures the strength of the relationship between \(X\) and \(Y\) using rank statistics [36].

Consider \(n\) samples of the random variables \(X\) and \(Y\), rearranged as \((X_{(1)}, Y_{(1)}), \ldots,(X_{(n)}, Y_{(n)})\) such that \(X_{(1)} \leq \cdots \leq X_{(n)}\). Here, \(X\) can be one of the inputs of a model and \(Y\) the model response. If the \(X_{i}\) have no ties, this rearrangement is unique (ties are also handled in the implementation, see [36]). Let \(r_{i}\) be the rank of \(Y_{(i)}\), that is, the number of \(j\) such that \(Y_{(j)} \leq Y_{(i)}\). The Chatterjee index \(\xi_{n}(X, Y)\) is defined as:

\[\xi_{n}(X, Y):=1-\frac{3 \sum_{i=1}^{n-1}\left|r_{i+1}-r_{i}\right|}{n^{2}-1}\]
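
The definition above can be sketched in a few lines of plain Python. This is only an illustration of the formula, not the UQpy implementation: the function name chatterjee_xi is hypothetical, and the sketch assumes that neither the inputs nor the outputs contain ties.

```python
def chatterjee_xi(x, y):
    """Illustrative Chatterjee index for paired samples without ties."""
    n = len(x)
    # Rearrange the pairs so that the x values are increasing.
    order = sorted(range(n), key=lambda i: x[i])
    y_sorted = [y[i] for i in order]
    # r[i] is the rank of Y_(i): the number of j with Y_(j) <= Y_(i).
    r = [0] * n
    by_y = sorted(range(n), key=lambda i: y_sorted[i])
    for rank, i in enumerate(by_y, start=1):
        r[i] = rank
    # Sum of absolute differences between consecutive ranks.
    total = sum(abs(r[i + 1] - r[i]) for i in range(n - 1))
    return 1 - 3 * total / (n * n - 1)


# A perfectly monotone relationship with n = 5 samples.
print(chatterjee_xi([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]))  # 0.5
```

For a strictly monotone relationship the rank differences are all 1, so the estimate equals \(1 - 3/(n+1)\), which gives 0.5 for \(n = 5\) and approaches 1 as \(n\) grows.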

As \(n \rightarrow \infty\), the Chatterjee index converges to the Cramér-von Mises index, and it is faster to estimate than the Cramér-von Mises index computed with the pick-and-freeze approach.

Furthermore, the Sobol indices can be efficiently estimated by leveraging the same rank statistics, which has the advantage that any sample can be used and no specific pick and freeze scheme is required.

Chatterjee Class

The ChatterjeeSensitivity class is imported using the following command:

>>> from UQpy.sensitivity.ChatterjeeSensitivity import ChatterjeeSensitivity

Methods

class ChatterjeeSensitivity(runmodel_object, dist_object, random_state=None)

Compute sensitivity indices using the Chatterjee correlation coefficient.

Using the same model evaluations, we can also estimate the Sobol indices.

Parameters:
  • runmodel_object – The computational model, of type RunModel. The output QoI can be a scalar or a vector of length ny; in the latter case, the sensitivity indices of all ny outputs are computed independently.

  • distributions – List of Distribution objects corresponding to each random variable, or JointIndependent object (multivariate RV with independent marginals).

  • random_state – Random seed used to initialize the pseudo-random number generator. Default is None.

Methods:

run(n_samples=1000, estimate_sobol_indices=False, n_bootstrap_samples=None, confidence_level=0.95)

Compute the sensitivity indices using the Chatterjee method. Employing the run method will initialize n_samples simulations using RunModel. To compute sensitivity indices using pre-computed inputs and outputs, use the static methods described below.

Parameters:
  • n_samples (int) – Number of samples used to compute the Chatterjee indices. Default is 1,000.

  • estimate_sobol_indices (bool) – If True, the Sobol indices are also estimated from the same samples using the rank statistics. Default is False.

  • n_bootstrap_samples (Optional[int]) – Number of bootstrap samples used to estimate the confidence intervals. Default is None.

  • confidence_level (float) – Confidence level used to compute the confidence intervals of the Chatterjee indices. Default is 0.95.

static compute_chatterjee_indices(X, Y, seed=None)

Compute the Chatterjee sensitivity indices between the input random vectors \(X=\left[ X_{1}, X_{2}, \ldots, X_{d} \right]\) and the output random vector \(Y\).

Parameters:
Returns:

Chatterjee sensitivity indices, numpy.ndarray of shape (n_variables, 1)

static rank_analog_to_pickfreeze(X, j)

Compute \(N(j)\) for a given \(j \in \{1, \ldots, n\}\) as in Eq. (8) in [37], where \(n\) is the size of \(X\).

\begin{equation} N(j):= \begin{cases} \pi^{-1}(\pi(j)+1) &\text { if } \pi(j)+1 \leqslant n \\ \pi^{-1}(1) &\text { if } \pi(j)=n \end{cases} \end{equation}

where \(\pi(j) := \mathrm{rank}(x_j)\).

Parameters:
  • X (ndarray) – Input random vector, numpy.ndarray of shape (n_samples, 1)

  • j (Integral) – Index of the sample \(j \in \{1, \ldots, n\}\)

Returns:

\(N(j)\) int
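
As a direct reading of the case definition above, the following plain-Python sketch computes \(N(j)\) for one sample index. It is not the UQpy routine: the name n_of_j is hypothetical, indices are 1-based to match the equation, and the sample is assumed to contain no ties.

```python
def n_of_j(x, j):
    """Illustrative N(j) for a 1-based index j, assuming no ties in x."""
    n = len(x)

    def rank(k):
        # pi(k): number of samples less than or equal to x_k (1-based k).
        return sum(1 for v in x if v <= x[k - 1])

    # Next rank, wrapping around to rank 1 when x_j is the largest sample.
    target = rank(j) + 1 if rank(j) < n else 1
    # Invert pi: find the index whose rank equals the target.
    for k in range(1, n + 1):
        if rank(k) == target:
            return k
    raise ValueError("no index found; x probably contains ties")


x = [22, 74, 44, 11, 1]
print([n_of_j(x, j) for j in range(1, 6)])  # [3, 5, 2, 1, 4]
```

Each call recomputes ranks by counting, so evaluating all \(n\) indices this way is far more expensive than a sort-based approach.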

static rank_analog_to_pickfreeze_vec(X)

Compute \(N(j)\) for each \(j \in \{1, \ldots, n\}\) in a vectorized manner, where \(n\) is the size of \(X\).

This method is significantly faster than the looping version rank_analog_to_pickfreeze but is also more complicated.

\begin{equation} N(j):= \begin{cases} \pi^{-1}(\pi(j)+1) &\text { if } \pi(j)+1 \leqslant n \\ \pi^{-1}(1) &\text { if } \pi(j)=n \end{cases} \end{equation}

where \(\pi(j) := \mathrm{rank}(x_j)\).

Key idea: \(\pi^{-1}\) is rank_X.argsort()

Example: X = [22, 74, 44, 11, 1]

N_J = [3, 5, 2, 1, 4] (1-based indexing)

N_J = [2, 4, 1, 0, 3] (0-based indexing)

Parameters:
  • X (ndarray) – Input random vector, numpy.ndarray of shape (n_samples, 1)

Returns:

\(N(j)\), numpy.ndarray of shape (n_samples, 1)
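
The argsort idea can be illustrated without NumPy by using sorted as the argsort. The sketch below is an assumption-laden illustration rather than the library code: the name next_in_rank is hypothetical, indices are 0-based, and ties are not handled.

```python
def next_in_rank(x):
    """Illustrative N(j) for all j at once (0-based), assuming no ties."""
    n = len(x)
    # Argsort: order[r] is the index of the sample with 0-based rank r,
    # which is exactly the inverse permutation pi^{-1}.
    order = sorted(range(n), key=lambda j: x[j])
    nxt = [0] * n
    for r, j in enumerate(order):
        # The successor of rank r is rank r + 1; the largest wraps to rank 0.
        nxt[j] = order[(r + 1) % n]
    return nxt


print(next_in_rank([22, 74, 44, 11, 1]))  # [2, 4, 1, 0, 3]
```

A single sort replaces the repeated rank counting, so all \(N(j)\) are obtained in \(O(n \log n)\) time.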

static compute_Sobol_indices(A_model_evals, C_i_model_evals)

Estimate the first-order Sobol indices using the Chatterjee method.

\begin{equation} \xi_{n}^{\mathrm{Sobol}}\left(X_{1}, Y\right):= \frac{\frac{1}{n} \sum_{j=1}^{n} Y_{j} Y_{N(j)}-\left(\frac{1}{n} \sum_{j=1}^{n} Y_{j}\right)^{2}} {\frac{1}{n} \sum_{j=1}^{n}\left(Y_{j}\right)^{2}-\left(\frac{1}{n} \sum_{j=1}^{n} Y_{j}\right)^{2}} \end{equation}

where the term \(Y_{N(j)}\) is computed using the method rank_analog_to_pickfreeze_vec.

Parameters:
Returns:

First order Sobol indices, numpy.ndarray of shape (n_variables, 1)
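
The rank-based estimator above can be sketched for a single input as follows. This is a minimal plain-Python illustration under stated assumptions, not the signature of compute_Sobol_indices: the name rank_sobol_index and its (x, y) arguments are hypothetical, ties in x are not handled, and y must not be constant.

```python
def rank_sobol_index(x, y):
    """Illustrative rank-based first-order Sobol estimate for one input."""
    n = len(x)
    # N(j): index of the sample whose x value follows x_j in sorted order,
    # wrapping from the largest value back to the smallest.
    order = sorted(range(n), key=lambda j: x[j])
    nxt = [0] * n
    for r, j in enumerate(order):
        nxt[j] = order[(r + 1) % n]
    mean_y = sum(y) / n
    mean_prod = sum(y[j] * y[nxt[j]] for j in range(n)) / n
    mean_sq = sum(v * v for v in y) / n
    # Numerator and denominator of the estimator, term by term.
    return (mean_prod - mean_y ** 2) / (mean_sq - mean_y ** 2)


# Y is fully determined by X, so the estimate should be close to 1.
n = 1000
x = [i / n for i in range(n)]
y = list(x)
print(rank_sobol_index(x, y))
```

Because \(Y_{N(j)}\) is taken from the same sample, no second set of model evaluations is needed, which is the advantage over a pick-and-freeze design noted earlier.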

Attributes

ChatterjeeSensitivity.first_order_chatterjee_indices

Chatterjee sensitivity indices (First order), numpy.ndarray of shape (n_variables, 1)

ChatterjeeSensitivity.first_order_sobol_indices

Sobol indices computed using the rank statistics, numpy.ndarray of shape (n_variables, 1)

ChatterjeeSensitivity.confidence_interval_chatterjee

Confidence intervals for the Chatterjee sensitivity indices, numpy.ndarray of shape (n_variables, 2)

ChatterjeeSensitivity.n_variables

Number of input random variables, int

ChatterjeeSensitivity.n_samples

Number of samples used to estimate the sensitivity indices, int

Examples