Stratified Sampling
Stratified sampling is a variance reduction technique that divides the parameter space into a set of disjoint, space-filling strata. Samples are then drawn from each stratum, which improves the space-filling properties of the sample design. Stratified sampling allows for unequally weighted samples, such that a Monte Carlo estimator of the quantity \(E[Y]\) takes the following form:
\[E[Y] \approx \sum_{i=1}^{N} w_i Y_i\]
where \(N\) is the total number of samples, \(w_i\) are the sample weights, and \(Y_i\) are the model evaluations. The individual sample weights are computed as:
\[w_i = \frac{V_i}{N_i}\]
where \(V_{i}\le 1\) is the volume of stratum \(i\) in the unit hypercube (i.e. the probability that a random sample will fall in stratum \(i\)) and \(N_{i}\) is the number of samples drawn from stratum \(i\).
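The weighting scheme can be illustrated with a minimal NumPy sketch that does not use UQpy. It assumes a one-dimensional unit interval split into three hypothetical strata of unequal width, draws a fixed number of uniform samples in each, and weights every sample by the volume of its stratum divided by the number of samples in that stratum. The function name `stratified_estimate`, the stratum edges, and the test model \(Y = U^2\) (for which \(E[Y] = 1/3\)) are illustrative choices only.

```python
import numpy as np


def stratified_estimate(model, edges, n_per_stratum, rng):
    """Estimate E[model(U)] for U ~ Uniform(0, 1) with strata [edges[k], edges[k+1])."""
    edges = np.asarray(edges, dtype=float)
    volumes = np.diff(edges)  # V_k: probability mass of each stratum
    counts = np.broadcast_to(n_per_stratum, volumes.shape)  # N_k
    samples, weights = [], []
    for lower, volume, n in zip(edges[:-1], volumes, counts):
        samples.append(lower + volume * rng.random(n))  # uniform draws inside the stratum
        weights.append(np.full(n, volume / n))          # weight = V_k / N_k
    samples = np.concatenate(samples)
    weights = np.concatenate(weights)
    return float(np.sum(weights * model(samples))), samples, weights


rng = np.random.default_rng(0)
estimate, samples, weights = stratified_estimate(
    lambda u: u**2, edges=[0.0, 0.1, 0.4, 1.0], n_per_stratum=[5, 10, 20], rng=rng
)
print(estimate, weights.sum())
```

Because the weights within each stratum add up to that stratum's volume, the weights over all samples sum to one regardless of how the samples are allocated.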
TrueStratifiedSampling Class
The TrueStratifiedSampling class is the parent class for stratified sampling. The various TrueStratifiedSampling classes generate random samples from the specified probability distribution(s) using stratified sampling, with strata specified by an object of one of the Strata classes.
The TrueStratifiedSampling class is imported using the following command:
>>> from UQpy.sampling.stratified_sampling.TrueStratifiedSampling import TrueStratifiedSampling
Methods
- class TrueStratifiedSampling(distributions, strata_object, nsamples_per_stratum=None, nsamples=None, random_state=None)
Class for Stratified Sampling ([18]).
- Parameters:
  - distributions (Union[DistributionContinuous1D, JointIndependent, list[DistributionContinuous1D]]) – List of Distribution objects corresponding to each random variable.
  - strata_object (Strata) – Defines the stratification of the unit hypercube. This must be provided and must be an object of a Strata child class: Rectangular, Voronoi, or Delaunay.
  - nsamples_per_stratum (Union[int, list[int], None]) – Specifies the number of samples in each stratum. This must be either an integer, in which case an equal number of samples is drawn from each stratum, or a list. If it is provided as a list, the length of the list must be equal to the number of strata. If nsamples_per_stratum is provided when the class is defined, the run() method will be executed automatically. If neither nsamples_per_stratum nor nsamples is provided when the class is defined, the user must call the run() method to perform stratified sampling.
  - nsamples (Optional[int]) – Specifies the total number of samples. If nsamples is specified, the samples will be drawn in proportion to the volume of the strata. Thus, each stratum will contain round(V_i * nsamples) samples. If nsamples is provided when the class is defined, the run() method will be executed automatically. If neither nsamples_per_stratum nor nsamples is provided when the class is defined, the user must call the run() method to perform stratified sampling.
  - random_state (Union[None, int, RandomState]) – Random seed used to initialize the pseudo-random number generator. Default is None. If an int is provided, this sets the seed for an object of numpy.random.RandomState. Otherwise, the object itself can be passed directly.
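The two ways of specifying sample counts can be sketched as a small allocation rule. This is a hedged illustration, not UQpy's code: the helper `allocate_samples` is hypothetical, and giving nsamples_per_stratum precedence when both arguments are supplied is an assumption of this sketch that the text above does not state.

```python
import numpy as np


def allocate_samples(volumes, nsamples_per_stratum=None, nsamples=None):
    """Return the number of samples to draw in each stratum (hypothetical helper)."""
    volumes = np.asarray(volumes, dtype=float)
    if nsamples_per_stratum is not None:
        if isinstance(nsamples_per_stratum, (int, np.integer)):
            # A single integer: the same number of samples in every stratum.
            return np.full(volumes.size, nsamples_per_stratum, dtype=int)
        counts = np.asarray(nsamples_per_stratum, dtype=int)
        if counts.size != volumes.size:
            raise ValueError("The list length must equal the number of strata.")
        return counts
    if nsamples is not None:
        # Proportional allocation: round(V_i * nsamples) samples in stratum i.
        return np.round(volumes * nsamples).astype(int)
    raise ValueError("Provide nsamples_per_stratum or nsamples.")


volumes = [0.25, 0.25, 0.5]
print(allocate_samples(volumes, nsamples_per_stratum=3))
print(allocate_samples(volumes, nsamples=20))
print(allocate_samples(volumes, nsamples=10))
```

Because each count is rounded independently, the total under proportional allocation can differ from the requested nsamples. In the last call, `np.round` rounds the two values of 2.5 to the nearest even integer, so 9 samples are allocated rather than 10.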
- transform_samples(samples01)
Transform samples in the unit hypercube \([0, 1]^n\) to the prescribed distribution using the inverse CDF.
- Parameters:
  - samples01 – numpy.ndarray containing the generated samples on \([0, 1]^n\).
- Returns:
  numpy.ndarray containing the generated samples following the prescribed distribution.
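The inverse-CDF transform can be sketched with NumPy alone. This is an illustration under stated assumptions rather than UQpy's implementation: the function `inverse_cdf_transform` and the two closed-form inverse CDFs are hypothetical, and the marginals are assumed independent so that each column can be mapped separately.

```python
import numpy as np


def inverse_cdf_transform(samples01, inverse_cdfs):
    """Map samples on [0, 1]^n to target marginals, one inverse CDF per column."""
    samples01 = np.atleast_2d(np.asarray(samples01, dtype=float))
    if samples01.shape[1] != len(inverse_cdfs):
        raise ValueError("One inverse CDF is required per dimension.")
    columns = [icdf(samples01[:, j]) for j, icdf in enumerate(inverse_cdfs)]
    return np.column_stack(columns)


def uniform_icdf(a, b):
    """Inverse CDF of Uniform(a, b)."""
    return lambda u: a + (b - a) * u


def exponential_icdf(rate):
    """Inverse CDF of an exponential distribution with the given rate."""
    return lambda u: -np.log1p(-u) / rate


samples01 = np.array([[0.0, 0.0], [0.5, 0.5], [0.25, 0.75]])
samples = inverse_cdf_transform(samples01, [uniform_icdf(-1.0, 1.0), exponential_icdf(2.0)])
print(samples)
```

Since the map is monotone in each coordinate, samples that fall in distinct strata of the unit hypercube stay in distinct regions of the transformed space.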
- run(nsamples_per_stratum=None, nsamples=None)
Executes stratified sampling.
This method performs the sampling for each of the child classes by running two methods: create_samplesu01() and transform_samples(). The create_samplesu01() method is unique to each child class and therefore must be overridden when a new child class is defined. The transform_samples() method is common to all stratified sampling classes and is therefore defined by the parent class. It does not need to be modified.
If nsamples or nsamples_per_stratum is provided when the class is defined, the run() method will be executed automatically. If neither nsamples_per_stratum nor nsamples is provided when the class is defined, the user must call the run() method to perform stratified sampling.
- Parameters:
  - nsamples_per_stratum (Union[None, int, list[int]]) – Specifies the number of samples in each stratum. This must be either an integer, in which case an equal number of samples is drawn from each stratum, or a list. If it is provided as a list, the length of the list must be equal to the number of strata. If nsamples_per_stratum is provided when the class is defined, the run() method will be executed automatically. If neither nsamples_per_stratum nor nsamples is provided when the class is defined, the user must call the run() method to perform stratified sampling.
  - nsamples (Optional[int]) – Specifies the total number of samples. If nsamples is specified, the samples will be drawn in proportion to the volume of the strata. Thus, each stratum will contain round(V_i * nsamples) samples, where \(V_i \le 1\) is the volume of stratum i in the unit hypercube. If nsamples is provided when the class is defined, the run() method will be executed automatically. If neither nsamples_per_stratum nor nsamples is provided when the class is defined, the user must call the run() method to perform stratified sampling.
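The division of labour described for run() resembles a template-method pattern, which the following sketch illustrates. It is not UQpy's source: the class names `StratifiedSamplerSketch` and `EqualIntervals1D` are hypothetical, the signature is simplified to nsamples_per_stratum only, and distributions are represented by plain inverse-CDF callables.

```python
from abc import ABC, abstractmethod

import numpy as np


class StratifiedSamplerSketch(ABC):
    """Hypothetical parent class illustrating the run() workflow."""

    def __init__(self, inverse_cdfs, nsamples_per_stratum=None, random_state=None):
        self.inverse_cdfs = inverse_cdfs
        # An int (or None) seeds a new RandomState; an existing one is used as-is.
        if isinstance(random_state, np.random.RandomState):
            self.random_state = random_state
        else:
            self.random_state = np.random.RandomState(random_state)
        self.samplesU01 = None
        self.samples = None
        # Sampling runs at construction only if a sample count was supplied.
        if nsamples_per_stratum is not None:
            self.run(nsamples_per_stratum)

    @abstractmethod
    def create_samplesu01(self, nsamples_per_stratum):
        """Child-specific step: return samples on the unit hypercube."""

    def transform_samples(self, samples01):
        """Shared step: apply one inverse CDF per dimension."""
        return np.column_stack([f(samples01[:, j]) for j, f in enumerate(self.inverse_cdfs)])

    def run(self, nsamples_per_stratum):
        self.samplesU01 = self.create_samplesu01(nsamples_per_stratum)
        self.samples = self.transform_samples(self.samplesU01)


class EqualIntervals1D(StratifiedSamplerSketch):
    """Child class with equal-width strata on [0, 1]."""

    def __init__(self, n_strata, **kwargs):
        self.n_strata = n_strata  # set first, because the parent constructor may call run()
        super().__init__(**kwargs)

    def create_samplesu01(self, nsamples_per_stratum):
        left = np.repeat(np.arange(self.n_strata) / self.n_strata, nsamples_per_stratum)
        u = self.random_state.random_sample(left.size)
        return (left + u / self.n_strata).reshape(-1, 1)


# Deferred: nothing is sampled until run() is called explicitly.
lazy = EqualIntervals1D(4, inverse_cdfs=[lambda u: 2.0 * u])
lazy.run(nsamples_per_stratum=3)

# Immediate: supplying the count at construction triggers run() automatically.
eager = EqualIntervals1D(4, inverse_cdfs=[lambda u: 2.0 * u], nsamples_per_stratum=3, random_state=1)
print(eager.samples.shape)
```

Only create_samplesu01() differs between child classes in this sketch, so a new stratification needs to supply just that one method.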
Attributes
Examples
UQpy supports several stratified sampling variations that range from conventional stratified sampling designs to advanced gradient-informed methods for adaptive stratified sampling. These class structures facilitate a highly flexible and varied range of stratified sampling designs that can be extended in a straightforward way. Specifically, the existing classes allow stratification of n-dimensional parameter spaces based on three common spatial discretizations: a rectilinear decomposition into hyper-rectangles (orthotopes), a Voronoi decomposition, and a Delaunay decomposition. This structure is based on three classes:
1. The Strata class defines the geometric structure of the stratification of the parameter space. It has three existing subclasses, Rectangular, Voronoi, and Delaunay, which correspond to geometric decompositions of the parameter space based on rectilinear strata of orthotopes, strata composed of Voronoi cells, and strata composed of Delaunay simplexes, respectively. These classes live in the UQpy.sampling.stratified_sampling.strata folder.
2. The TrueStratifiedSampling class defines a set of subclasses used to draw samples from strata defined by a Strata class object.
3. The RefinedStratifiedSampling class defines a set of subclasses for refinement of TrueStratifiedSampling stratified sampling designs.
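To make the rectilinear case concrete, the sketch below builds a decomposition of the unit hypercube into equal orthotopes and draws one sample in each. It uses only NumPy and is not the Rectangular class: the function `rectangular_strata` and the description of each stratum by a lower-corner "seed" and a vector of "widths" are assumptions made for illustration. Voronoi and Delaunay decompositions need a computational-geometry library and are not sketched here.

```python
import numpy as np


def rectangular_strata(strata_per_dimension):
    """Return (seeds, widths) of equal orthotopes that tile the unit hypercube."""
    counts = np.asarray(strata_per_dimension, dtype=int)
    # Lower-corner coordinates along each axis, e.g. [0.0, 0.5] for two strata.
    axes = [np.arange(n) / n for n in counts]
    grid = np.meshgrid(*axes, indexing="ij")
    seeds = np.column_stack([g.ravel() for g in grid])
    widths = np.tile(1.0 / counts, (seeds.shape[0], 1))
    return seeds, widths


seeds, widths = rectangular_strata([2, 3])
volumes = widths.prod(axis=1)  # volume of each stratum in the unit hypercube

# One uniform sample inside each orthotope.
rng = np.random.default_rng(0)
samples01 = seeds + widths * rng.random(seeds.shape)
print(volumes.sum(), samples01.shape)
```

The strata are disjoint and space-filling by construction, so their volumes sum to one, which is the property the sample weights rely on.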