easyvvuq.analysis package

Submodules

easyvvuq.analysis.base module

Provides a base class for all analysis elements.

class easyvvuq.analysis.base.BaseAnalysisElement

Bases: easyvvuq.base_element.BaseElement

Base class for all EasyVVUQ analysis elements.

analyse(data_frame=None)

Perform analysis on input data_frame.

Parameters:data_frame (pandas DataFrame) – Input data for analysis.
Returns:
Return type:AnalysisResults instance
element_category()

Element type for logging and verification.

Returns:Element category.
Return type:str
element_name()

Name for this element for logging purposes.

Returns:Element name.
Return type:str
element_version()

Version of this element for logging purposes.

Returns:Element version.
Return type:str

easyvvuq.analysis.basic_stats module

Provides analysis element for basic statistical analysis.

The analysis is based on pandas.DataFrame.describe() function.

class easyvvuq.analysis.basic_stats.BasicStats(groupby=None, qoi_cols=None)

Bases: easyvvuq.analysis.base.BaseAnalysisElement

analyse(data_frame=None)

Perform the basic stats analysis on the input data_frame.

Analysis is based on pandas.DataFrame.describe and results in values for: count, mean, std, min, max and the 25%, 50% and 75% percentiles for each value in the analysis.

The data_frame is grouped according to self.groupby if specified and analysis is performed on the columns selected in self.qoi_cols if set.

Parameters:data_frame (pandas.DataFrame) – Summary data produced through collation of simulation output.
Returns:Basic statistic for selected columns and groupings of data.
Return type:pandas.DataFrame
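
The grouping and column selection described above can be sketched with plain pandas. This is a minimal illustration, not the EasyVVUQ implementation: the helper name describe_qois and the column names are assumptions made for the example.

```python
import pandas as pd


def describe_qois(data_frame, groupby=None, qoi_cols=None):
    """Hypothetical helper: optionally group, select QoI columns, describe."""
    target = data_frame.groupby(groupby) if groupby else data_frame
    if qoi_cols:
        target = target[qoi_cols]
    return target.describe()


# Illustrative collated output: two parameter settings, three runs each.
df = pd.DataFrame({
    "temperature": [300, 300, 300, 350, 350, 350],
    "energy": [1.0, 2.0, 3.0, 4.0, 6.0, 8.0],
})
stats = describe_qois(df, groupby=["temperature"], qoi_cols=["energy"])
overall = describe_qois(df, qoi_cols=["energy"])
print(stats)
```

With a groupby, the result has one row per group and a two-level column index such as ("energy", "mean").
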
element_name()

Name for this element for logging purposes

element_version()

Version of this element for logging purposes

easyvvuq.analysis.ensemble_boot module

Provides analysis element for ensemble bootstrapping analysis.

class easyvvuq.analysis.ensemble_boot.EnsembleBoot(groupby=[], qoi_cols=[], stat_func=<function mean>, alpha=0.05, sample_size=None, n_boot_samples=1000, pivotal=False, stat_name='boot')

Bases: easyvvuq.analysis.base.BaseAnalysisElement

analyse(data_frame=None)

Perform bootstrapping analysis on the input data_frame.

The data_frame is grouped according to self.groupby if specified and analysis is performed on the columns selected in self.qoi_cols if set.

Parameters:data_frame (pandas.DataFrame) – Summary data produced through collation of simulation output.
Returns:Basic statistic for selected columns and groupings of data.
Return type:pandas.DataFrame
element_name()

Name for this element for logging purposes

element_version()

Version of this element for logging purposes

easyvvuq.analysis.ensemble_boot.bootstrap(data, stat_func, alpha=0.05, sample_size=None, n_samples=1000, pivotal=False)
Parameters:
  • data (pandas.DataFrame) – Input data to be analysed.
  • stat_func (function) – Statistical function to be applied to data for bootstrapping.
  • alpha (float) – Produce estimate of 100.0*(1-alpha) confidence interval.
  • sample_size (int) – Size of the sample to be drawn from the input data.
  • n_samples (int) – Number of times samples are to be drawn from the input data.
  • pivotal (bool) – Use the pivotal method? Default to percentile method.
Returns:

  • float – Value of the bootstrap statistic
  • float – Highest value of the confidence interval
  • float – Lowest value of the confidence interval
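
A minimal sketch of the resampling step behind these parameters, assuming a 1D array as input and the percentile method. The function name bootstrap_sketch, the seed argument and the dictionary return value are assumptions for illustration and are not taken from the library.

```python
import numpy as np


def bootstrap_sketch(data, stat_func=np.mean, alpha=0.05,
                     sample_size=None, n_samples=1000, seed=0):
    """Percentile bootstrap of stat_func over a 1D array (illustrative)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    if sample_size is None:
        sample_size = len(data)
    # Resample with replacement n_samples times and evaluate the statistic.
    dist = np.array([
        stat_func(rng.choice(data, size=sample_size, replace=True))
        for _ in range(n_samples)
    ])
    return {
        "stat": np.percentile(dist, 50),
        "low": np.percentile(dist, 100 * alpha / 2),
        "high": np.percentile(dist, 100 * (1 - alpha / 2)),
    }


result = bootstrap_sketch(np.arange(1.0, 101.0))
print(result)
```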

easyvvuq.analysis.ensemble_boot.confidence_interval(dist, value, alpha, pivotal=False)

Get the bootstrap confidence interval for a given distribution.

Parameters:
  • dist – Array containing distribution of bootstrap results.
  • value – Value of statistic for which we are calculating error bars.
  • alpha – The alpha value for the confidence intervals.
  • pivotal – Use the pivotal method? Default to percentile method.
Returns:

  • float – Value of the bootstrap statistic
  • float – Highest value of the confidence interval
  • float – Lowest value of the confidence interval
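
The difference between the percentile and pivotal methods can be sketched as follows, using the standard textbook definitions. Whether the library uses exactly these formulas is an assumption; the name interval_sketch is hypothetical, and the sketch returns only the (low, high) bounds.

```python
import numpy as np


def interval_sketch(dist, value, alpha, pivotal=False):
    """Return (low, high) bounds from a bootstrap distribution (illustrative)."""
    lower_q = np.percentile(dist, 100 * alpha / 2)
    upper_q = np.percentile(dist, 100 * (1 - alpha / 2))
    if pivotal:
        # Pivotal (basic) bootstrap: reflect the quantiles around the estimate.
        return 2 * value - upper_q, 2 * value - lower_q
    # Percentile bootstrap: use the quantiles directly.
    return lower_q, upper_q


dist = np.arange(0.0, 101.0)  # stand-in bootstrap distribution 0, 1, ..., 100
percentile_ci = interval_sketch(dist, value=40.0, alpha=0.1)
pivotal_ci = interval_sketch(dist, value=40.0, alpha=0.1, pivotal=True)
print(percentile_ci, pivotal_ci)
```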

easyvvuq.analysis.ensemble_boot.ensemble_bootstrap(data, groupby=[], qoi_cols=[], stat_func=<function mean>, alpha=0.05, sample_size=None, n_samples=1000, pivotal=False, stat_name='boot')

Perform bootstrapping analysis on input data.

Parameters:
  • data (pandas.DataFrame) – DataFrame to be analysed.
  • groupby (list or None) – Columns to use to group the data in analyse method before calculating stats.
  • qoi_cols (list or None) – Columns of quantities of interest (for which stats will be calculated).
  • stat_func (function) – Statistical function to be applied to data for bootstrapping.
  • alpha (float, default=0.05) – Produce estimate of 100.0*(1-alpha) confidence interval.
  • sample_size (int) – Size of the sample to be drawn from the input data.
  • n_samples (int, default=1000) – Number of times samples are to be drawn from the input data.
  • pivotal (bool, default=False) – Use the pivotal method? Default to percentile method.
  • stat_name (str, default=’boot’) – Name to use to describe columns containing output statistic (for example ‘mean’).
Returns:

Description of input data using bootstrap statistic and high/low confidence intervals.

Return type:

pandas.DataFrame

easyvvuq.analysis.gp_analyse module

class easyvvuq.analysis.gp_analyse.GaussianProcessSurrogate(attr_cols, target_cols)

Bases: easyvvuq.analysis.base.BaseAnalysisElement

analyse(data_frame=None)

Perform the basic stats analysis on the input data_frame.

Analysis is based on pandas.DataFrame.describe and results in values for: count, mean, std, min, max and the 25%, 50% and 75% percentiles for each value in the analysis.

The data_frame is grouped according to self.groupby if specified and analysis is performed on the columns selected in self.qoi_cols if set.

Parameters:data_frame (pandas.DataFrame) – Summary data produced through collation of simulation output.
Returns:Basic statistic for selected columns and groupings of data.
Return type:pandas.DataFrame
element_name()

Name for this element for logging purposes

element_version()

Version of this element for logging purposes

easyvvuq.analysis.pce_analysis module

Analysis element for polynomial chaos expansion (PCE). We use ChaosPy under the hood for this functionality.

class easyvvuq.analysis.pce_analysis.PCEAnalysis(sampler=None, qoi_cols=None, sampling=False)

Bases: easyvvuq.analysis.base.BaseAnalysisElement

analyse(data_frame=None)

Perform PCE analysis on input data_frame.

Parameters:data_frame (pandas DataFrame) – Input data for analysis.
Returns:Use it to get the sobol indices and other information.
Return type:PCEAnalysisResults
element_name()

Name for this element for logging purposes.

Returns:“PCE_Analysis”
Return type:str
element_version()

Version of this element for logging purposes.

Returns:Element version.
Return type:str
class easyvvuq.analysis.pce_analysis.PCEAnalysisResults(raw_data=None, samples=None, qois=None, inputs=None)

Bases: easyvvuq.analysis.qmc_analysis.QMCAnalysisResults

Analysis results for the PCEAnalysis class.

easyvvuq.analysis.qmc_analysis module

Analysis element for Quasi-Monte Carlo (QMC) sensitivity analysis.

Please refer to the article below for the basic approach used here. https://en.wikipedia.org/wiki/Variance-based_sensitivity_analysis

class easyvvuq.analysis.qmc_analysis.QMCAnalysis(sampler, qoi_cols=None)

Bases: easyvvuq.analysis.base.BaseAnalysisElement

analyse(data_frame)

Perform QMC analysis on a given pandas DataFrame.

Parameters:data_frame (pandas DataFrame) – Input data for analysis.
Returns:AnalysisResults object for QMC.
Return type:easyvvuq.analysis.qmc_analysis.QMCAnalysisResults
element_name()

Name for this element.

Returns:“QMC_Analysis”
Return type:str
element_version()

Version of this element.

Returns:Element version.
Return type:str
get_samples(data_frame)

Converts the Pandas dataframe into a dictionary.

Parameters:data_frame (pandas DataFrame) – the EasyVVUQ Pandas dataframe from collation.
Returns:A dictionary with the QoI names as keys. Each element is a list of code evaluations.
Return type:dict
sobol_bootstrap(samples, alpha=0.05, n_samples=1000)

Computes the first order and total order Sobol indices using Saltelli’s method. To assess the sampling inaccuracy, bootstrap confidence intervals are also computed.

Reference: A. Saltelli, Making best use of model evaluations to compute sensitivity indices, Computer Physics Communications, 2002.

Parameters:
  • samples (list) – The samples for a given QoI.
  • alpha (float) – The (1 - alpha) * 100 confidence interval parameter. The default is 0.05.
  • n_samples (int) – The number of bootstrap samples. The default is 1000.
Returns:

  • sobols_first_dict, conf_first_dict, sobols_total_dict, conf_total_dict – Dictionaries containing the first- and total-order Sobol indices for all parameters, and the (1-alpha)*100 lower and upper confidence bounds.
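
To illustrate the kind of estimate involved, the sketch below computes first- and total-order indices from two independent sample matrices. It uses commonly cited estimator forms (a first-order estimator from Saltelli et al. and Jansen's total-order estimator), which may differ from the ones in the library, and it omits the bootstrap confidence intervals. The function and model names are assumptions.

```python
import numpy as np


def sobol_indices_sketch(model, n_params, n_base=50_000, seed=0):
    """Estimate first- and total-order Sobol indices for inputs ~ U(0, 1)."""
    rng = np.random.default_rng(seed)
    a = rng.random((n_base, n_params))   # sample matrix A
    b = rng.random((n_base, n_params))   # independent sample matrix B
    f_a, f_b = model(a), model(b)
    f_all = np.concatenate([f_a, f_b])
    f_mean, f_var = f_all.mean(), f_all.var()

    first, total = np.empty(n_params), np.empty(n_params)
    for i in range(n_params):
        ab_i = a.copy()
        ab_i[:, i] = b[:, i]             # A with column i taken from B
        f_ab = model(ab_i)
        # First-order estimator, centred on the mean to reduce variance.
        first[i] = np.mean((f_b - f_mean) * (f_ab - f_a)) / f_var
        # Jansen total-order estimator.
        total[i] = 0.5 * np.mean((f_a - f_ab) ** 2) / f_var
    return first, total


def linear_model(x):
    """Additive test model with known indices 0.2 and 0.8."""
    return x[:, 0] + 2.0 * x[:, 1]


first, total = sobol_indices_sketch(linear_model, n_params=2)
print(first, total)
```

For this additive model the first- and total-order indices coincide, which gives a quick sanity check on the estimators.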

class easyvvuq.analysis.qmc_analysis.QMCAnalysisResults(raw_data=None, samples=None, qois=None, inputs=None)

Bases: easyvvuq.analysis.results.AnalysisResults

Analysis results for the QMCAnalysis Method. Refer to the AnalysisResults base class documentation for details on using it.

easyvvuq.analysis.sc_analysis module

class easyvvuq.analysis.sc_analysis.SCAnalysis(sampler=None, qoi_cols=None)

Bases: easyvvuq.analysis.base.BaseAnalysisElement

SC2PCE(samples, verbose=True, **kwargs)

Computes the Polynomial Chaos Expansion coefficients from the SC expansion via a transformation of basis (Lagrange polynomial basis -> orthonormal basis).

Parameters:samples (array) – SC code samples from which to compute the PCE coefficients.
Returns:pce_coefs – PCE coefficients per multi index l.
Return type:dict
adapt_dimension(qoi, data_frame, store_stats_history=True, method='surplus', **kwargs)

Compute the adaptation metric and decide which of the admissible level indices to include in next iteration of the sparse grid. The adaptation metric is based on the hierarchical surplus, defined as the difference between the new code values of the admissible level indices, and the SC surrogate of the previous iteration. Alternatively, it can be based on the difference between the output mean of the current level, and the mean computed with one extra admissible index.

This subroutine must be called AFTER the code is evaluated at the new points, but BEFORE the analysis is performed.

Parameters:
  • qoi (str) – Name of the quantity of interest on which the adaptation metric is based.
  • data_frame – The data frame from the EasyVVUQ Campaign.
  • store_stats_history (bool, default=True) – Store the mean and variance at each refinement in self.mean_history and self.std_history. Used for checking convergence in the statistics over the refinement iterations.
  • method (str, default=‘surplus’) – Name of the refinement error. With ‘surplus’ the error is based on the hierarchical surplus, which is an interpolation-based error. Other possibilities are ‘mean’ and ‘var’, in which case the error is based on the difference in the mean or variance between the current estimate and the estimate obtained when a particular candidate direction is added.
Returns:None.
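
The hierarchical surplus can be illustrated in one dimension: evaluate the code at a new node and subtract the prediction of the interpolant built from the previous nodes. This is a simplified sketch with a polynomial interpolant from numpy, not the sparse-grid surrogate used by the class; surplus_sketch is a hypothetical name.

```python
import numpy as np


def surplus_sketch(func, old_nodes, new_node):
    """Hierarchical surplus at new_node for a 1D polynomial interpolant."""
    old_nodes = np.asarray(old_nodes, dtype=float)
    # Interpolating polynomial through the previously evaluated nodes.
    coefs = np.polyfit(old_nodes, func(old_nodes), deg=len(old_nodes) - 1)
    surrogate_value = np.polyval(coefs, new_node)
    # Surplus: new code value minus the previous surrogate's prediction.
    return func(new_node) - surrogate_value


nodes = [0.0, 0.5, 1.0]
quad_surplus = surplus_sketch(lambda x: x ** 2, nodes, 0.25)   # already captured
cubic_surplus = surplus_sketch(lambda x: x ** 3, nodes, 0.25)  # not captured
print(quad_surplus, cubic_surplus)
```

A surplus close to zero indicates that the previous surrogate already represents the code well in that direction, so refining there gains little.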

adaptation_histogram()

Plots a bar chart of the maximum order of the quadrature rule that is used in each dimension. Use with the dimension-adaptive sampler to get an idea of which parameters were more refined than others. This gives only a first-order idea, as it only plots the max quad order independently per input parameter, so higher-order refinements that were made do not show up in the bar chart.

Parameters:None
adaptation_table(**kwargs)

Plots a color-coded table of the quadrature-order refinement. Shows in what order the parameters were refined, and unlike adaptation_histogram, this also shows higher-order refinements.

Parameters:
  • **kwargs – Can contain the kwarg ‘order’ to specify the order in which the variables on the x axis are plotted (e.g. in order of decreasing 1st order Sobol index).
Returns:None.

analyse(data_frame=None, compute_moments=True, compute_Sobols=True)

Perform SC analysis on input data_frame.

Parameters:data_frame (pandas.DataFrame) – Input data for analysis.
Returns:Results dictionary with sub-dicts with keys: [‘statistical_moments’, ‘sobol_indices’]. Each dict has an entry for each item in qoi_cols.
Return type:dict
combination_technique(qoi, samples=None, **kwargs)

Efficient quadrature formulation for (sparse) grids. Uses the general combination technique; see Gerstner, Griebel, “Numerical integration using sparse grids” (page 12).

Parameters:
  • qoi (str) – Name of the qoi.
  • samples (optional, in kwargs) – Default: compute the mean by setting samples = self.samples. To compute the variance, set samples = (self.samples - mean)**2.
compute_comb_coef(**kwargs)

Compute general combination coefficients. These are the coefficients multiplying the tensor products associated with each multi index l; see page 12 of Gerstner & Griebel, “Numerical integration using sparse grids”.
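
A minimal sketch of how such coefficients are usually computed for a general index set: for each multi index l, sum (-1)^|z| over all z in {0,1}^d for which l + z is still in the set. That this matches the library's implementation is an assumption; comb_coef_sketch is a hypothetical name.

```python
from itertools import product


def comb_coef_sketch(index_set):
    """Combination coefficient per multi index of a downward-closed set."""
    index_set = {tuple(l) for l in index_set}
    dim = len(next(iter(index_set)))
    coefs = {}
    for l in index_set:
        c = 0
        for z in product((0, 1), repeat=dim):
            shifted = tuple(li + zi for li, zi in zip(l, z))
            if shifted in index_set:
                c += (-1) ** sum(z)
        coefs[l] = c
    return coefs


# 2D index set of a small isotropic sparse grid (levels start at 1).
sparse = comb_coef_sketch([(1, 1), (1, 2), (2, 1)])
# Full tensor grid: only the highest index keeps a non-zero coefficient.
full = comb_coef_sketch([(1, 1), (1, 2), (2, 1), (2, 2)])
print(sparse, full)
```

For the sparse set this reproduces the familiar combination Q(1,2) + Q(2,1) - Q(1,1).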

compute_marginal(qoi, u, u_prime, diff)

Computes a marginal integral of the qoi(x) over the dimensions defined by u_prime, for every x value in dimensions u.

Parameters:
  • qoi (str) – Name of the quantity of interest.
  • u (array of int) – Dimensions which are not integrated.
  • u_prime (array of int) – Dimensions which are integrated.
  • diff (array of int) – Levels.
Returns:Values of the marginal integral.
static compute_tensor_prod_u(xi, wi, u, u_prime)

Calculate tensor products of weights and collocation points with dimension of u and u’.

Parameters:
  • xi (array of floats) – 1D collocation points.
  • wi (array of floats) – 1D quadrature weights.
  • u (array of int) – Dimensions.
  • u_prime (array of int) – Remaining dimensions (u union u’ = range(N)).
Returns:dict of tensor products of weights and points for dimensions u and u’.
element_name()

Name for this element for logging purposes

element_version()

Version of this element for logging purposes

get_adaptation_errors()

Returns self.adaptation_errors

get_moments(qoi)
Parameters:qoi (str) – Name of the qoi.
Returns:Mean and variance of qoi (float (N_qoi,)).
get_pce_sobol_indices(qoi, typ='first_order', **kwargs)

Computes Sobol indices using Polynomial Chaos coefficients. These coefficients are computed from the SC expansion via a transformation of basis (SC2PCE subroutine). This works better than computing the Sobol indices directly from the SC expansion in the case of the dimension-adaptive sampler.

Method: J.D. Jakeman et al, “Adaptive multi-index collocation for uncertainty quantification and sensitivity analysis”, 2019. (Page 18)

Parameters:
  • qoi (str) – Name of the Quantity of Interest for which to compute the indices.
  • typ (str) – Default = ‘first_order’. ‘all’ is also possible.
  • **kwargs – If this contains ‘samples’, use these instead of the SC samples in the database.
Returns:

  • Mean – PCE mean.
  • Var – PCE variance.
  • S_u – PCE Sobol indices, either the first order indices or all indices.
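
For an orthonormal polynomial basis, the mean, variance and first-order Sobol indices follow directly from the expansion coefficients. The sketch below assumes a simple dictionary from polynomial-degree multi indices to coefficients, which is a simplification of the per-level structure used by SC2PCE; pce_sobol_sketch is a hypothetical name.

```python
def pce_sobol_sketch(pce_coefs):
    """First-order Sobol indices from orthonormal PCE coefficients.

    pce_coefs maps a polynomial-degree multi index (tuple) to its coefficient.
    """
    dim = len(next(iter(pce_coefs)))
    zero = (0,) * dim
    mean = pce_coefs.get(zero, 0.0)
    # Variance: sum of squared coefficients of all non-constant terms.
    var = sum(c ** 2 for k, c in pce_coefs.items() if k != zero)
    first_order = {}
    for i in range(dim):
        # Terms whose only non-zero degree is in dimension i.
        partial = sum(
            c ** 2 for k, c in pce_coefs.items()
            if k[i] > 0 and all(kj == 0 for j, kj in enumerate(k) if j != i)
        )
        first_order[i] = partial / var
    return mean, var, first_order


coefs = {(0, 0): 1.0, (1, 0): 2.0, (0, 1): 1.0, (1, 1): 2.0}
mean, var, sobol = pce_sobol_sketch(coefs)
print(mean, var, sobol)
```

The mixed term (1, 1) contributes to the variance but to neither first-order index, so the first-order indices sum to less than one here.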

get_pce_stats(l_norm, pce_coefs, comb_coef)

Compute the mean and the variance based on the PCE coefficients

Parameters:
  • l_norm (array) – Array of quadrature order multi indices.
  • pce_coefs (tuple) – Tuple of PCE coefficients computed by the SC2PCE subroutine.
  • comb_coef (tuple) – Tuple of combination coefficients computed by compute_comb_coef.
Returns:Mean and variance based on the PCE coefficients.

get_sample_array(qoi)
Parameters:qoi (str) – Name of the quantity of interest.
Returns:Array of all samples of qoi.
get_sobol_indices(qoi, typ='first_order')

Computes Sobol indices using Stochastic Collocation. Method: Tang (2009), GLOBAL SENSITIVITY ANALYSIS FOR STOCHASTIC COLLOCATION EXPANSION.

Parameters:
  • qoi (str) – Name of the Quantity of Interest for which to compute the indices.
  • typ (str) – Default = ‘first_order’. ‘all’ is also possible.
Returns:Either the first order or all Sobol indices of qoi.

get_uncertainty_amplification(qoi)

Computes a measure that signifies the ratio of output to input uncertainty. It is computed as the (mean) Coefficient of Variation (CV) of the output divided by the (mean) CV of the input.

Parameters:qoi (str) – Name of the Quantity of Interest.
Returns:blowup – The ratio output CV / input CV.
Return type:float
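
A sample-based sketch of this ratio, assuming the CV is the standard deviation divided by the absolute mean and that the input CV is averaged over the input parameters. The class derives these quantities from the SC moments rather than from raw samples, so this only illustrates the definition; amplification_sketch is a hypothetical name.

```python
import numpy as np


def amplification_sketch(input_samples, output_samples):
    """Ratio of the output CV to the mean CV of the inputs (illustrative)."""
    inputs = np.asarray(input_samples, dtype=float)    # (n_runs, n_inputs)
    outputs = np.asarray(output_samples, dtype=float)  # (n_runs,)
    cv_in = np.mean(inputs.std(axis=0) / np.abs(inputs.mean(axis=0)))
    cv_out = outputs.std() / np.abs(outputs.mean())
    return cv_out / cv_in


x = np.array([[1.0], [2.0], [3.0]])
ratio_linear = amplification_sketch(x, 2.0 * x[:, 0])  # scaling keeps the CV
ratio_square = amplification_sketch(x, x[:, 0] ** 2)   # squaring amplifies it
print(ratio_linear, ratio_square)
```
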
load_state(filename)

Loads the complete state of the analysis object from a pickle file, stored using save_state.

Parameters:filename (str) – Name of the file to load.
Returns:None.
merge_accepted_and_admissible(level=0, **kwargs)

In the case of the dimension-adaptive sampler, there are two sets of quadrature multi indices: the accepted indices that are actually used in the analysis, and the admissible indices, some of which might move to the accepted set in subsequent iterations. This subroutine merges the two sets by moving all admissible indices to the set of accepted indices.

Do this at the end, when no more refinements will be executed. The samples related to the admissible indices are already computed, although not used in the analysis. By executing this subroutine at the very end, all computed samples are used during the final postprocessing stage. Execute campaign.apply_analysis to let the new set of indices take effect.

If further refinements are executed after all via sampler.look_ahead, the number of new admissible samples to be computed can be very high, especially in high dimensions. It is possible to undo the merge via analysis.undo_merge before new refinements are made. Execute campaign.apply_analysis again to let the old set of indices take effect.

plot_grid()

Plots the collocation points for 2 and 3 dimensional problems

plot_stat_convergence()

Plots the convergence of the statistical mean and std dev over the different refinements in a dimension-adaptive setting. Specifically the inf norm of the difference between the stats of iteration i and iteration i-1 is plotted.

Returns:
Return type:None.
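
The plotted quantity can be sketched with numpy as the inf norm of the change in the statistics between successive refinements. The array below is made-up illustrative data, with one row per refinement iteration.

```python
import numpy as np

# Illustrative mean history: one row per refinement, one column per QoI entry.
mean_history = np.array([
    [1.00, 2.00],
    [1.10, 2.30],
    [1.12, 2.31],
])
# Inf norm of the change between iteration i and iteration i-1.
diffs = np.linalg.norm(np.diff(mean_history, axis=0), ord=np.inf, axis=1)
print(diffs)
```
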
quadrature(qoi, samples=None)

Computes a (Smolyak) quadrature

Parameters:
  • qoi (str) – Name of the qoi.
  • samples – Default: compute the mean by setting samples = self.samples. To compute the variance, set samples = (self.samples - mean)**2.
Returns:The quadrature of qoi.
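
The mean and variance convention above can be illustrated with a one-dimensional Gauss-Legendre rule for a single U(0, 1) input. This is a full 1D rule, not a Smolyak sparse grid, and the test function x**2 is an assumption chosen because its moments are known exactly.

```python
import numpy as np

# 3-point Gauss-Legendre rule mapped from [-1, 1] to a U(0, 1) input.
nodes, weights = np.polynomial.legendre.leggauss(3)
x = 0.5 * (nodes + 1.0)
w = 0.5 * weights            # weights now sum to 1 (the U(0, 1) density)

samples = x ** 2             # code output at the collocation points
mean = np.sum(w * samples)                 # quadrature of the samples
var = np.sum(w * (samples - mean) ** 2)    # quadrature of (samples - mean)**2
print(mean, var)
```
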
save_state(filename)

Saves the complete state of the analysis object to a pickle file, except the sampler object (self.samples).

Parameters:filename (str) – Name of the file to write the state to.
Returns:None.
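
A generic sketch of this save and load pattern with pickle, assuming the state is the object's attribute dictionary minus a sampler attribute. The class AnalysisSketch and its attributes are hypothetical stand-ins, not the library's code.

```python
import os
import pickle
import tempfile


class AnalysisSketch:
    """Hypothetical stand-in for an analysis object."""

    def __init__(self):
        self.sampler = object()          # placeholder for a sampler object
        self.level = 3
        self.adaptation_errors = [0.5, 0.1]

    def save_state(self, filename):
        # Pickle every attribute except the sampler.
        state = {k: v for k, v in self.__dict__.items() if k != "sampler"}
        with open(filename, "wb") as fp:
            pickle.dump(state, fp)

    def load_state(self, filename):
        # Overwrite the current attributes with the stored ones.
        with open(filename, "rb") as fp:
            self.__dict__.update(pickle.load(fp))


path = os.path.join(tempfile.mkdtemp(), "analysis_state.pickle")
original = AnalysisSketch()
original.save_state(path)

restored = AnalysisSketch()
restored.level = 0
restored.load_state(path)
print(restored.level, restored.adaptation_errors)
```
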
sc_expansion(samples, x)

Non-recursive implementation of the SC expansion (default setting of the surrogate). Performs interpolation for both full and sparse grids.

Parameters:
  • samples (array) – Array of code samples.
  • x (float (N,)) – Location in stochastic space at which to evaluate the surrogate.
Returns:surr – The interpolated value of qoi at x.
Return type:float (N_qoi,)

surrogate(qoi, x, L=None)

Use sc_expansion UQP as a surrogate

Parameters:
  • qoi (str) – Name of the qoi.
  • x (array) – Location at which to evaluate the surrogate.
  • L (int) – Level of the (sparse) grid, default = self.L.
Returns:The interpolated value of qoi at x (float, (N_qoi,)).

undo_merge()

This reverses the effect of the merge_accepted_and_admissible subroutine. Execute if further refinements are required after all.

class easyvvuq.analysis.sc_analysis.SCAnalysisResults(raw_data=None, samples=None, qois=None, inputs=None)

Bases: easyvvuq.analysis.results.AnalysisResults

easyvvuq.analysis.sc_analysis.lagrange_poly(x, x_i, j)

Lagrange polynomials used for interpolation

l_j(x) = product((x - x_m) / (x_j - x_m)) with 0 <= m <= k and m != j
Parameters:
  • x (float) – Location at which to compute the polynomial.
  • x_i (list or array of float) – Nodes of the Lagrange polynomials.
  • j (int) – Index of the node at which l_j(x_j) = 1.
Returns:

l_j(x) calculated as shown above.

Return type:

float
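
A direct transcription of the product formula, written as a standalone sketch. The name lagrange_poly_sketch is chosen to avoid implying that this is the library's own code.

```python
def lagrange_poly_sketch(x, x_i, j):
    """Evaluate the j-th Lagrange basis polynomial on nodes x_i at x."""
    l_j = 1.0
    for m, x_m in enumerate(x_i):
        if m != j:
            l_j *= (x - x_m) / (x_i[j] - x_m)
    return l_j


nodes = [0.0, 0.5, 1.0]
# At a node, exactly one basis polynomial equals 1 and the others vanish.
at_node = [lagrange_poly_sketch(0.5, nodes, j) for j in range(3)]
# Away from the nodes, the basis polynomials still sum to 1.
partition = sum(lagrange_poly_sketch(0.3, nodes, j) for j in range(3))
print(at_node, partition)
```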

easyvvuq.analysis.sc_analysis.powerset(iterable)

powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)

Taken from: https://docs.python.org/3/library/itertools.html#recipes

Parameters:iterable (iterable) – Input sequence
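
For reference, the powerset recipe from the itertools documentation reads as follows. The library's copy may differ in detail.

```python
from itertools import chain, combinations


def powerset(iterable):
    """powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"""
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))


subsets = list(powerset([1, 2, 3]))
print(subsets)
```
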
easyvvuq.analysis.sc_analysis.setdiff2d(X, Y)

Computes the difference of two 2D arrays X and Y

Parameters:
  • X (2D numpy array)
  • Y (2D numpy array)
Returns:The set difference X \ Y as a 2D array.
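
One way to compute a row-wise set difference with numpy is sketched below, treating each row as one element of the set. This is an illustrative approach under that assumption, not necessarily how the library function is written; setdiff2d_sketch is a hypothetical name.

```python
import numpy as np


def setdiff2d_sketch(X, Y):
    """Rows of X that do not appear in Y, returned as a 2D array."""
    X, Y = np.asarray(X), np.asarray(Y)
    y_rows = {tuple(row) for row in Y}
    keep = [row for row in X if tuple(row) not in y_rows]
    # reshape keeps the result 2D even when no rows remain.
    return np.array(keep).reshape(-1, X.shape[1])


X = np.array([[1, 1], [1, 2], [2, 1]])
Y = np.array([[1, 2]])
diff = setdiff2d_sketch(X, Y)
print(diff)
```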

Module contents