Application of Latin Hypercube Sampling Based Kriging Surrogate Models in Reliability Assessment

Reliability assessment is one of the necessary and critical parts in structural design under uncertainties. The methods for structural reliability assessment aim at evaluating the probability of limit state by considering the fluctuation of acting loads, variation of structural component or system, and complexity of operating environment. Latin Hypercube sampling (LHS) method as advanced Monte Carlo simulation (MCS) has higher efficiency in sampling. It will be chosen and applied in this paper in order to obtain an effective database for building Kriging surrogate models. In this paper, we propose an effective method to have reliability assessment by Latin Hypercube sampling based Kriging surrogate models. This method keeps the certain level of accuracy in prediction of the response of a structural finite element model or other explicit mathematical functions.


Introduction
Uncertainty is an inevitable issue in the process of manufacture, infrastructure, and engineering design. Quantifying and propagating the uncertainty in the simulation or design process as a key component of risk analysis, robustness evaluation or reliability based optimization attracts attention of researchers and designer [1]. The traditional deterministic model is not effective for structural analysis because of avoiding the effects of uncertainties in the parameters. To consider parameter fluctuation in the real operation environment, the Monte Carlo simulation (MCS) is chosen to perform the stochastic simulation. It is one of most popular discrete algorithmic for uncertainty analyses and used with increasing frequency. MCS (sampling based approach) is useful for several reasons. First, the sampling based approach covers the full range of each uncertain variable in a complicated system. Second, modification of the model is not required, and direct estimates of distribution functions are provided. In addition, in the process of sampling, a variety of sensitivity analysis procedures are available. Last but not the least, analysis procedures can be developed and allow the propagation of results through systems of linked models [2].
However, MCS can reach a certain level of accuracy only if a very large number of iterations are performed. It is obvious that MCS methods become computational prohibitive when simulation model is complicated. To be more efficient than the random sampling method, several improved MCS methods with different sampling techniques have been developed. Importance sampling (weighted sampling), is expected to reduce error to zero if the probability density function is properly selected [3]. L. M. Berliner [4] applied importance sampling Monte Carlo method in sequential problems of Bayesian updating. P. Beaurepaire [5] attempted performing importance sampling technique in reliability based optimization of structure. In the literatures, the first-order sensitivity method, as a variance reduction technique, is also utilized to accelerate MCS estimation convergence [6]. The variance reduction techniques are especially important when MCS is applied to estimate small failure probability [7].
Latin Hypercube Sampling (LHS) method operates by subdividing the sample space into smaller regions and sampling within these regions. The produced samples more effectively fill the sample space and therefore reduce the variance of computed statistical estimators [8]. Stein M [9] had research in the large sample properties of simulations using Latin hypercube sampling. Owen [10] and Huntington [11] tested the limitation of Latin Hypercube sampling. Improved LHS have been developed, Stocki R [12] projected samples onto a known subspace to minimize integrated mean square error and maximize entropy. Iman [13] had efforts to reduce spurious correlations, Florian [14] rearranged the matrix of samples based on a transformation of the rank number matrix. Further, methods for constructing orthogonal LHS are proposed to possess enhanced space filling properties [15], utilized iterative optimization methods are applied in LHS in order to reduce spurious correlations [16].
A surrogate model can be thought of as a regression to a set of data, where the data is a set of input-output pairing obtained by evaluating a black-box model of the complex system [17]. Surrogate models may be classified into two general categories based on their proposed: local and global models. An example of optimization using local is Response Surface Method expressed (RSM) as polynomial surrogate models. The kriging models are interpolation models based on the assumption that there is a spatial correction between the values of the function to be approximated [18]. With these models a known or sampled value of the limit state function to be approximated is exactly predicted. The Kriging models do not assume an underlying global functional form as assumed in the polynomial regression models and can approximate arbitrary functions with high accuracy in global as well as local approximations [19].
Kriging model was originally developed in spatial statistics by Krige. Matheron [20] had contribution in mathematical formulation of Kriging model. In the pioneering work of Sacks et al [21], Kriging model was chosen to predict deterministic functions in physical processes. A univariate kriging model combining a regression term with a zero-mean stationary Gaussian disturbance process was developed by Handcock [22]. Van Beer [23] started with the application of Kriging model in random simulation. The track record of application of Kriging models in random simulation holds great promise. This paper presents an effective method to have reliability assessment by Latin Hypercube sampling based Kriging surrogate models. For that purpose, Latin Hypercube Sampling is utilized to build a reliable database for approximating the response of a structural finite element model or other explicit mathematical functions. Once the database is defined, we compare the relative performance of approximation methods to fit the probabilistic response of original models: namely, response surface method (first order and second order polynomial regressions) and Kriging models. Then, reliability assessment is predicted by the surrogate models, which heavily reduced computational cost and also kept the certain level of accuracy.

Latin Hypercube Sampling Method
A compromise method of advanced MCS is Latin hypercube sampling (LHS) approach. This approach divides the range of each variable into disjoint intervals of equal probability, and one value is randomly selected from each interval [24]. It improves MCS stability and also maintains the tractability of random sampling. Consider a statistic system described by the relation: Where the random vector X possesses independent components defined over the sample space S describing n input random variables with marginal cumulative distribution functions P . F is the operator commonly represents computer simulation such as finite element model with propagation of uncertainties in the system. Traditional Monte Carlo methods rely on the so-called Simple Random Sampling (SRS) in which realization of X , Correlated random variables are not considered because of Principal Component Analysis and Nataf or Rosenblatt transformations can be applied to produce a set of uncorrelated random variables from correlated relationships between variables.
LHS divides the range of each vector components 1 2 , , , n X X X ⋯ into disjoint subsets of equal probability. Samples of each vector components are drawn from the respective subset according to, assembled by uniformly randomly grouping the terms of the generated vector components. In Latin Hypercube Sampling approach, the range of all random input variables is divided into n intervals with equal probability, which are restricted within the respective interval avoid the disadvantage of clustering together, as demonstrated in Fig. 1.
The results of Example 1 and Example 2 of MCS and LHS in different number of sampling are presented in Fig. 2 and Fig. 3. To begin generating the LHS, an interval of each feature is chosen at random. The intersection of these intervals in the multi-dimensional feature space is a small hypercube, from which a sample is taken at random. Next, type of interval is selected at random for each feature. A sample is produced at random from that small hypercube. This continues until N samples have been generated. Each interval of each parameter is sampled exactly once in the process. In contrast to random sampling, the entire range of each feature is always represented in a LHS. Unbiased estimates of the sample means of the outcomes are  support the same conclusion. However, in Example 2, when the probability density of output result is very concentrated in a small range, the advantage of LHS is not evident then MCS as in Fig. 3.
To compare MCS with LHS more precisely, here we defined relative error can be calculated as in Eq. (8)

Surrogate Models
A surrogate model can be thought of as a regression to a set of data, where the data is a set of input-output pairing obtained by evaluating a black-box model of the complex system [26]. Surrogate models may be classified into two general categories based on their proposed: local and global models. An example of optimization using local is Response Surface Method expressed (RSM) as polynomial surrogate models. RSM sequentially fits local first and second order regression models to a small region of the overall search space, as in Eq. (9) and Eq. (10).
In the other hand, a global surrogate model is a function that approximates the system across the design space. Kriging models fit a spatial correlation function to a data set consisting of input-output pairs obtained by evaluating the underlying function.
Its prediction at a given point of the space of basic random variables can be obtained, To gain some insight into the behavior of the Kriging surrogate models, we created Kriging models by database of MCS and LHS, which has different amount of sampling. Zero-order, first-order and second-order Kriging models are applied to predict the results when the input variables are changed in Example 1 and Example 2. In Example 1, the ranges of input variables ( 1 X , 2 X and α ) are (4, 7), (0.1, Here, the relative errors are calculated by comparing the difference between the results predicting by Kriging models with the exact results from the analytical function in Example 1 and Example 2.
Where i y is the result predicted by Kriging models, i y is the exact result from the analytical function, they have same input ( 1i X , 2i Example 2. Table 1 and Table 3 present time cost of Kriging models fitting and predicting in Example 1 and Example 2. Table 2 and Table 4 provide the values of relative errors, which demonstrate the difference between the results predicted by Kriging models and the exact results when the input variables are certain. We find that the number of sampling in MCS or LHS which is the original database for Kriging models is the most important factor to time cost. The sampling in MCS or LHS will be transferred to Kriging models as the original database to define parameters in Kriging models. According to the increase of numbers of sampling, the computational cost in Kriging models will sharply grow. This conclusion is proved in both Table 1 and Table 3.
In Table 2, the relative errors from zero-order, first-order and second-order Kriging model are very small. In Example 1, Kriging models have satisfied accuracy as surrogate models to predict. In addition, when the number of sampling in LHS and MCS is same, original database provided by LHS is always better than that of MCS in Example 1. Along to the increase in size of original database from LHS and MCS, results predicted by Kriging models are more precise. However, the computational cost also sharply increases as in Table 1. Therefore, for creating Kriging surrogate model, there is a trade-off between accuracy and computational cost.  The results predicted by Kriging model in Example 2 do not have the same level of accuracy as in Example1. In Table  4, we can find the relative errors are unable to be ignored. To make Kriging surrogate model to be more appropriate in complicated situation, we still have a lot of work to do. In order to track the property of Kriging models, tests of zeroorder, first -order and second-order Kriging models which are based on the database of LHS and MCS are performed and recorded in Fig. 5. Firstly, the results predicted by Kriging models which are based on database of LHS are more approximated to the exact results of analytical function in Example 2 than that based on database of MCS. Therefore, for creating Kriging surrogate models, LHS method is a more effective and competitive method than MCS method. Secondly, at the beginning of the prediction, Kriging models have large fluctuation and the predicted results are not precise when compared with the exact results, this part should be taken into consideration and chosen to be removed or filtered. Lastly, compared the results predicted by zeroorder, first -order and second-order Kriging models which are based on the database of LHS with different number of sampling, choice of best Kriging model depends on the certain situation. It is difficult to define which Kriging model is the best.

Fig. 5.
Kriging models prediction in Example 2 (Fig 5.a, Fig 5.c and Fig 5.e are the results predicted by zero-order, first-order and second-order Kriging models respectively which are based on database of LHS; Fig 5.b, Fig 5.d and Fig 5.f are the results predicted by zero-order, first-order and second-order Kriging models respectively which are based on database of MCS).

Example in Finite Element Model
When the limit state function is implicit and each deterministic sampling is computational expensive, we propose creating Kriging surrogate models to have reliability assessment, LHS is chosen as an effective method to provide original database for Kriging models. Modal frequency is an important area of structural dynamics which has deservedly received much attention. Usually, researchers and designers identify the basic modal frequencies of a specific structural system and avoid the periodic loading coincide with them in order to prevent the damage or failure of resonance. The dynamic equation can be written as, { } x ɺɺ are the first and second derivatives of the displacement with respect to time. Note that the force applied to the system is now a function of time. While mass and stiffness of a structure are measured and relatively easily derived, the mechanism whereby energy is lost through damping is less easily modeled. The viscous damping model represented by matrix [ ] C is commonly but by no means exclusively used, being proportional to velocity. If there is no damping, the equation of motion is For free (unforced) vibrations the following relationship is obeyed The result is Where [ ] φ is the matrix of mass normalized eigen-vector.
In this paper, our finite element model of wing structure, as presented in Fig. 1, is constructed by ANSYS Parameter Design Language. The parameters in the original deterministic model are corresponding with geometrical properties and material properties. Where S is the parameter representing the ratio of area between the two airfoil sections, it is 0.25 as in initial. L and D as presented in the Fig. 6, are 6.25 m and 1.42 m respectively. For material property, Young's module is 7e10 Pa, Poisson ratio is 0.33, and physical density is 2700 kg/m 3 .  The results of modal frequencies of wing structure in the deterministic finite element model are as presented in Table  5. Latin Hypercube sampling method is performed in the deterministic finite element model to calculate the modal frequencies. The parameters corresponding with geometry (S, D, L) and material property (E, P, R) are as input variables in the process of Latin Hypercube sampling, while the modal frequencies of specific wing structure are output variable in each sampling iteration. We chose uniform distribution as probability distribution for input variables. The ranges of S, D, L are (0, 0.5), (0, 3) and (4, 10) respectively. The ranges of E, P, R are (le10, 1e11), (0, 0.5) and (1000, 8000) respectively. The units for input variables keep unchanging as in the deterministic finite element model. We had 10000 samplings for each input variables by LHS method. Fig.7 provides the records of five modal frequencies in the process of stochastic simulation. To be more obvious, the accumulative probabilities of five modal frequencies of wing structure are presented in Fig.8 as numerical statistics. The evaluation of the stochastic simulation in finite element model of wing by Latin Hypercube sampling method is presented in Table 2. The mean value, standard deviation, skewness, and also the minimum and maximum of five modal frequencies for wing structure by 10000 LHS are obtained and concluded in Table 2.    . 9. Tests of Kriging models as surrogate models for finite element model (Fig 9.a and Fig 9. In order to identify one Kriging model as a surrogate model to replace time-cost computation of original model, we should find answers for three questions: firstly, which Kriging model is the most appropriate to this specific case; secondly, how many LHS should be provided as original database for Kriging model; lastly, we set up how many points in Kriging model for prediction. These three questions can be solved in Fig. 9. Fig 9.a presents the results of probabiltiy density of fist modal frequency predicted by zeroorder, first-order and second-order Kriging models based on 1000 LHS and compared with the results of 1000 LHS and 10000 LHS. It is obvious that second-order Kriging model is more approximated to the original finite element model. Fig  9.b which presents the results predicted by Kriging models based on 2000 LHS supports the same point. Then, secondorder Kriging model is chosen as surrogate model to have prediction. In Fig 9.c, we find that the results predicted by second- The convenience of Kriging surrogate model is significant in reliability assessment. Mechanical resonance may cause violent swaying motions and even catastrophic failure in improperly constructed structures including bridges, buildings and airplanes, a phenomenon known as resonance disaster. It is the tendency of a mechanical system to respond at greater amplitude when the frequency of its oscillations matches the system's modal frequency of vibration. Designers struggle to avoid this physical phenomenon happening in the operation situation. However, the traditional deterministic model ignores the effects of uncertainties in the real complicated operation environment. To propagate the uncertainty in the deterministic model of complicated system, the calculation expense is a heavy burden. It is necessary to create an appropriate surrogate model to have prediction and provide reliability assessment.  In the example of finite element model of wing structure, the parameters corresponding with geometry (S, D, L) and material property (E, P, R) are supposed to be uncertain and fluctuate in a specific range in order to simulate the uncertainties in the real situation. To be general, the type of probability distribution is chosen to be Gaussian distribution, as ( , ) for the parameters respectively. The standard deviation is settled by 10% of the mean value for each parameter to simulate the fluctuation. Second-order Kriging model based on 1000 LHS is applied as surrogate model. Then, the relation between the probability property (mean value, variance, etc.) of five modal frequencies and mean value of input variables (the mean value of the input variables in this paper Synchronous increase, they can be set as any other tendency for reliability assessment).
In Fig 10.a and Fig 10.b, we can find that both mean value and variance for all five modal frequency will reduce when the mean value of input variables increase. B is the result of mean value divided by the standard deviation for every modal frequency. In contrary with mean value and variance of modal frequency, B has positive gradient as in Fig 10.c. The difference between two neighbor modal frequencies is an important parameter in structural design. Fig 10.d presents difference of mean value between the second and first modal frequency, between the third and the second modal frequency, and between the fifth and the forth modal frequency has the same tendency, they all have negative gradient. In contrary, the difference of mean value between the fourth and the third modal frequency has positive gradient. Therefore, Fig.10 provides essential information for structural designers.

Conclusion
1. LHS method is more effective than MCS method for providing comprehensive original database to Kriging models. 2. When the relation between the input variables and output results can be expressed as polynomial, Kriging models have satisfied accuracy as surrogate models to predict. Along with the increase in the number of sampling in LHS, the computational cost of creating Kriging models sharply grows, in the same time, the results predicted by Kriging models can be more precise. Therefore, there is a trade-off between computational expense and accuracy. 3. Comparison between zero-order, first-order and secondorder Kriging model is not evident, which one is the most appropriate model depends on the specific relation between the input variables and output results. 4. Kriging models as surrogate models are very promising in reliability assessment. Its advantages of time-saving and high level of accuracy are competitive in uncertainty analysis.