  • original innovation
  • Open access
  • Published:

A novel hybrid model for bridge dynamic early warning using LSTM-EM-GMM


Early warning for existing bridges is currently dominated by deterministic methods. However, these methods struggle to represent uncertain factors such as wind load and temperature load, which directly affects the timeliness and accuracy of bridge early warning. This study develops a method for bridge dynamic early warning with high versatility and accuracy that combines a long short-term memory (LSTM) network, expectation maximization (EM) and a Gaussian mixture model (GMM). Firstly, the LSTM model predicts the measured monitoring data (such as deflection, strain, and cable force) in real time. Next, the number of clusters for the EM-GMM model is determined using the Calinski-Harabasz (CH) index, which accounts for the internal cohesion of the clusters and the separation between them so that the clustering results are accurate and reliable. Then, the EM-GMM model clusters the random influence error together with the predicted value, which yields a probabilistic prediction result for each random influence error. On this basis, the dynamic early warning interval at the 95% confidence level is constructed, which facilitates early warning and decision-making for potential structural abnormalities. Finally, the accuracy and practicability of the method are verified through an engineering application and a comparison with existing specifications. The results demonstrate that a probabilistic early warning method that considers the uncertain factors in a complex service environment can accurately achieve dynamic early warning of bridges.

1 Introduction

1.1 Literature review

Adverse conditions such as external environmental erosion and vehicle overloading accelerate the deterioration of bridge structures. The bearing capacity and the mechanical performance of bridges decrease continuously over time. These problems significantly affect the service life of the bridge and the safety of traffic (Xin et al. 2022; Li et al. 2023; Tang et al. 2022). Currently, many bridge health monitoring systems still adopt static threshold values for warning, which are usually conservative and not specific to the individual structure. Furthermore, they cannot provide real-time assessment of the health condition. Therefore, it is necessary to study dynamic warning methods that assess the bridge health condition and provide warnings in real time.

In recent years, numerous methods have been developed for early warning of bridge structures. These methods fall into two categories. The first category is based on the finite element model of the bridge. For instance, Fan et al. (2021) utilized the generalized Pareto distribution model and the finite element model to obtain the early warning threshold and proposed an anomaly warning method for cable-stayed bridges based on deflection measurement. Li et al. (2023) proposed a cable-stayed anomaly diagnosis method based on the sum of vehicle cable forces and verified the rationality of the established early warning index with a finite element model. To accurately capture the deformation behavior of the bridge main girder under the coupling of temperature and train, Zhao et al. (2019) determined the early warning threshold for the deformation of the main girder based on the mutual updating of the monitoring data and the finite element model. To balance the effects of the calibration coefficients and the data validity coefficients, Wu et al. (2020) proposed an expression for the warning threshold of the structural response characteristics of the bridge based on the theoretical values of the finite element model. However, this category mainly relies on finite element calculations to obtain early warning thresholds that are pre-set in the structural health monitoring system (SHMs). When the SHMs raises an alarm, there is a high probability that the structure has already deteriorated to a certain extent. Such methods cannot provide real-time early warning, because the service status of the bridge has already changed before the system can sense it.

The second category uses monitoring data directly for early warning, and machine learning algorithms play a major role here. For example, Buckley et al. (2021) used a dynamic harmonic regression time series model to obtain the correlation between the variation trend of the strain response and the temperature excitation. The model was used to predict the force of a prestressed concrete bridge and thereby achieve early warning of the bridge structure. Li et al. (2021) proposed a bridge construction safety risk warning method based on a rough set, the sparrow search algorithm, and a least squares support vector machine. Using the correlation between longitudinal displacements and temperature signals at the end of bridge main girders, Ni et al. (2020) established a damage warning method for bridge expansion joints based on the combination of a Bayesian regression model and reliability theory. Asad et al. (2023) used artificial neural networks and Bayesian optimization algorithms for early warning of long-term horizontal displacements. Considering the significant vibration of cable-stayed bridges under strong wind conditions, Ye et al. (2023) proposed a data-driven method based on the Random Forest algorithm for early warning of the vibration amplitude of girders and towers. Although machine learning models were used in these studies, they remain deterministic warning methods. They do not express the uncertainty in the monitoring data well and cannot assess the magnitude of the error between the actual and predicted values, so timely and effective early warning is still challenging in practice.

Bridges are affected by various uncertainty factors during service. These factors are not only difficult to predict but also lead to changes in the internal force of the bridge (Zou et al. 2016). For instance, random variations of wind loads change the vibration response of the bridge. Temperature changes alter the properties of bridge materials (e.g., strength, stiffness, and brittleness of concrete and steel) and cause thermal expansion and contraction of members, which induce stresses and deformations within the structure (Morgese et al. 2023, 2024; Tong et al. 2023). Sensor failures or inaccurate calibration lead to deviations or errors in the monitoring data. These problems may reduce the accuracy of the assessment of structural health conditions. Without considering the uncertainty factors, it is hard to make timely and effective predictions for the structure.

In addition to supervised learning algorithms, unsupervised learning methods can reveal the intrinsic nature and patterns of data without training on a large amount of sample data (Sarmadi et al. 2021a, b). It has become a trend to mine structural data with fewer boundary conditions. A typical unsupervised learning method is cluster analysis, which organizes unlabeled patterns (generally represented as observation vectors or points in multidimensional space) into clusters based on similar attributes (Li and Ikotum 2017, 2022). Cluster analysis has been widely used in dam deformation warning and structural damage detection for its simplicity, high noise robustness, and strong interpretability. For example, to study the spatio-temporal diversity of dam structural deformation behavior, Lei et al. (2022) proposed a comprehensive diagnosis method using cluster analysis and spatio-temporal data fusion. Silvad et al. (2008) used principal component analysis, an autoregressive moving average model and a fuzzy clustering method to cluster the vibration data of a structure and thereby classify undamaged and damaged structures. To categorize substructures with anomalies or damage, Diez et al. (2016) proposed a clustering-based method for diagnosing structural anomalies on bridges. Entezami et al. (2023) proposed an innovative multi-task unsupervised learning method for early assessment of damage in large-scale bridge structures under long-term monitoring. Entezami et al. (2023) also proposed a novel unsupervised learning method based on double-hybrid learning for damage assessment in bridge structures under different environmental variation patterns. To remove various environmental effects from the modal frequencies of bridge structures, Daneshvar et al. (2023) proposed a locally unsupervised hybrid learning method suitable for different measurement periods and data dimensions. With big data, clustering methods provide powerful tools in statistics, data mining and the analysis of anomalies and structural early warning problems.

The literature review above shows that static threshold methods cannot accurately match the actual service status of the bridge, and that deterministic warning methods cannot consider the influence of external factors and rely on massive historical data. Neither approach achieves effective early warning. Therefore, it is crucial to propose a probabilistic warning method for bridge structures that takes into account the influence of uncertain factors. In this paper, supervised and unsupervised learning methods are combined to obtain more comprehensive early warning information. This study proposes a dynamic early warning method using long short-term memory (LSTM) and expectation maximization combined with a Gaussian mixture model (EM-GMM). Specifically, LSTM is first utilized to obtain the data bias (i.e., the random influence error, which is the same as the residual) caused by uncertainty factors in the bridge monitoring data. Then, EM-GMM is used to calculate the joint probability density of the random influence error and the predicted data, which gives the probability density of the predicted monitoring values. Finally, the probabilistic warning interval is calculated for the predicted monitoring values at the 95% confidence level. Based on real-time monitoring data, the proposed method can obtain the dynamic warning interval corresponding to the current state of the bridge, which helps to achieve accurate early warning and to ensure safe operation and maintenance of bridges. Field tests demonstrated its practicality and accuracy.

1.2 Organization of the paper

The rest of this paper is organized as follows. In Theoretical background section, the theoretical background of this paper is presented. In Proposed method section, the flowchart and evaluation indexes of the method are elaborated, and the details of the proposed method are introduced. In Field validation section, the proposed method is validated with actual monitoring data, and its performance is evaluated in comparison with the established specifications. The main conclusions are drawn in Conclusion section.

2 Theoretical background

2.1 Long short-term memory

Long short-term memory (LSTM) is an improved recurrent neural network (RNN), specially designed to solve the long-term dependence problem of traditional RNNs. LSTM preserves historical information from previous moments in memory cells and selectively memorizes or forgets that information through its gates. Furthermore, LSTM alleviates the gradient explosion and gradient vanishing problems that arise during RNN training. Compared with the traditional RNN, LSTM changes the propagation mechanism of the hidden layer neurons, which makes the internal structure more complex and more expressive.

The extracted bridge health monitoring data should be cleaned before the LSTM model calculation. Data cleaning includes null value checking and invalid value checking. Here, a missing value in the time series is filled with the average of the values immediately before and after it, and invalid values are deleted. In addition, the sample monitoring data need to be normalized, which eliminates the impact of dimensional and order-of-magnitude differences between the sample data (García et al. 2015; Panda and Jana 2015). Normalizing the data to [0, 1] also helps the predictive model converge more reliably and compute faster. The normalized value G is calculated by:

$$G=\frac{g-g_{\min}}{g_{\max}-g_{\min}} \tag{1}$$

where G is the normalization result, and gmax and gmin are the maximum and minimum values of the sample data g, respectively.
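The cleaning and normalization steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `fill_gaps` and `min_max_normalize` are hypothetical, missing samples are assumed to be marked as `None` or NaN, and a gap is assumed to be filled with the mean of the nearest valid samples on either side.

```python
import math

def _is_missing(v):
    """A sample counts as missing if it is None or NaN (assumed convention)."""
    return v is None or (isinstance(v, float) and math.isnan(v))

def fill_gaps(series):
    """Fill each missing sample with the mean of its nearest valid neighbours.

    Assumes the series contains at least one valid sample.
    """
    filled = list(series)
    for i, v in enumerate(series):
        if _is_missing(v):
            before = next((series[j] for j in range(i - 1, -1, -1)
                           if not _is_missing(series[j])), None)
            after = next((series[j] for j in range(i + 1, len(series))
                          if not _is_missing(series[j])), None)
            neighbours = [x for x in (before, after) if x is not None]
            filled[i] = sum(neighbours) / len(neighbours)
    return filled

def min_max_normalize(values):
    """Scale values to [0, 1] with G = (g - g_min) / (g_max - g_min)."""
    g_min, g_max = min(values), max(values)
    span = g_max - g_min
    if span == 0:
        raise ValueError("constant series cannot be min-max normalized")
    return [(g - g_min) / span for g in values], g_min, g_max

# Illustrative deflection samples in mm (made-up numbers, one gap)
deflection = [-20.0, None, -24.0, -26.0]
cleaned = fill_gaps(deflection)
normalized, g_min, g_max = min_max_normalize(cleaned)
```

Returning `g_min` and `g_max` alongside the scaled values makes it possible to map predictions back to physical units afterwards.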

In addition, the LSTM computational node contains three types of gates (namely the forget, input, and output gates) and a memory cell (Wang et al. 2022; Xin et al. 2023; Li et al. 2018), as shown in Fig. 1.

Fig. 1 The architecture of one LSTM cell

The first layer is the forget gate, which determines which information in the cell state is retained or discarded, see Eq. (2). The second layer is the input gate, which determines what information in the current input vector should be stored in the cell state, see Eq. (3). Next, ht−1 and xt are integrated into a new candidate vector \({\hat{C}_t}\), whose elements lie in the range [−1, +1], and the information to be stored is determined by multiplying \({\hat{C}_t}\) element-wise by the input gate output, see Eqs. (4) and (5), respectively. The third layer is the output gate, which determines the output content of each cell, see Eq. (6).

$${f_t}=\delta \left( {{W_f}\left[ {{h_{t - 1}},{x_t}} \right]+{b_f}} \right) \tag{2}$$
$${i_t}=\delta \left( {{W_i}\left[ {{h_{t - 1}},{x_t}} \right]+{b_i}} \right) \tag{3}$$
$${\hat{C}_t}=\tanh \left( {{W_C}\left[ {{h_{t - 1}},{x_t}} \right]+{b_C}} \right) \tag{4}$$
$${C_t}={f_t} \odot {C_{t - 1}}+{i_t} \odot {\hat{C}_t} \tag{5}$$
$${o_t}=\delta \left( {{W_o}\left[ {{h_{t - 1}},{x_t}} \right]+{b_o}} \right) \tag{6}$$

where δ is the Sigmoid activation function; tanh is the hyperbolic tangent activation function; Wf, Wi and Wo are the weight parameters of the forget gate, input gate and output gate to be optimized during training, respectively; WC and bC are the weight and bias parameters of the candidate vector; xt is the input value; ht−1 is the hidden state output by the previous cell; bf, bi and bo are the bias parameters of the forget gate, input gate and output gate, respectively; ⊙ represents element-by-element multiplication. The hidden state ht is both the hidden state passed to the next cell and the output of the current cell. Therefore, the output value of the cell and the predicted value of the output layer are calculated by Eqs. (7) and (8), respectively.

$${h_t}={o_t} \odot \tanh \left( {{C_t}} \right) \tag{7}$$
$${y_t}=f\left( {{W_y}{h_t}+{b_y}} \right) \tag{8}$$

where \(f( \cdot )\) is the activation function of the output layer; Wy and by are the weight and bias parameters of the output layer; \({y_t}\) is the predicted value of the output time series signal.
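To make the gate equations concrete, the following sketch evaluates one forward step of a single LSTM cell with plain Python lists. It is an illustration under stated assumptions rather than the model used in this paper: the names `lstm_step`, `affine` and the `params` dictionary layout are hypothetical, and each weight matrix is assumed to act on the concatenation [ht−1, xt].

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, v, b):
    """Compute W v + b for a weight matrix W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + b_i for row, b_i in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, params):
    """One forward step of an LSTM cell: returns (h_t, C_t)."""
    z = list(h_prev) + list(x_t)  # concatenation [h_{t-1}, x_t]
    f = [sigmoid(a) for a in affine(params["W_f"], z, params["b_f"])]        # forget gate
    i = [sigmoid(a) for a in affine(params["W_i"], z, params["b_i"])]        # input gate
    c_hat = [math.tanh(a) for a in affine(params["W_C"], z, params["b_C"])]  # candidate vector
    c = [f_k * c_k + i_k * ch_k for f_k, c_k, i_k, ch_k in zip(f, c_prev, i, c_hat)]
    o = [sigmoid(a) for a in affine(params["W_o"], z, params["b_o"])]        # output gate
    h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]                     # hidden state
    return h, c

# Toy cell with 2 hidden units and 1 input feature; all parameters set to zero,
# so every gate equals 0.5 and the candidate vector equals 0.
hidden, features = 2, 1
zero_W = [[0.0] * (hidden + features) for _ in range(hidden)]
zero_b = [0.0] * hidden
params = {"W_f": zero_W, "b_f": zero_b, "W_i": zero_W, "b_i": zero_b,
          "W_C": zero_W, "b_C": zero_b, "W_o": zero_W, "b_o": zero_b}
h_t, c_t = lstm_step([0.3], [0.0, 0.0], [1.0, -2.0], params)
```

With zero parameters the forget gate halves the previous cell state, which gives an easy hand check of the update rule before real trained weights are substituted.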

2.2 Expectation maximization with gaussian mixture model

2.2.1 Expectation maximization

The EM algorithm is often used for parameter optimization of GMM. Assume that the set of bridge deflection prediction error samples is X = {X1, X2, …, Xn}. The EM algorithm has the following main steps.

Firstly, initial values are assigned to the parameters αk, µk and Σk of the different Gaussian distribution functions so that αk satisfies the constraint \({\sum\nolimits_{{k=1}}^{K} \alpha _k}=1\).

Secondly, Eq. (9) is utilized to calculate the probability that data point Xi belongs to each Gaussian distribution function.

$$p(i,k)=\frac{{{\alpha _k}{\varphi _k}\left( {{X_i}|{\mu _k},{\Sigma _k}} \right)}}{{\sum\nolimits_{j=1}^{K} {{\alpha _j}{\varphi _j}\left( {{X_i}|{\mu _j},{\Sigma _j}} \right)} }},\;(\forall i,k) \tag{9}$$

Then, the αk, µk, and Σk parameters are recalculated for each Gaussian distribution by Eqs. (10), (11) and (12).

$${\alpha _k}=\frac{1}{n}\sum\limits_{{i=1}}^{n} {p(i,k)} \tag{10}$$
$${\mu _k}=\frac{{\sum\limits_{{i=1}}^{n} p (i,k){X_i}}}{{\sum\limits_{{i=1}}^{n} p (i,k)}} \tag{11}$$
$${\Sigma _k}=\frac{{\sum\limits_{{i=1}}^{n} p (i,k)\left( {{X_i} - {\mu _k}} \right){{\left( {{X_i} - {\mu _k}} \right)}^T}}}{{\sum\limits_{{i=1}}^{n} p (i,k)}} \tag{12}$$

Finally, the above steps are repeated until the convergence condition is satisfied, i.e., until the parameters or the likelihood function converge. The key parameters of the Gaussian mixture model are thereby obtained.
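The iteration above can be sketched for the one-dimensional case as follows. This is a simplified illustration, not the paper's implementation: the function name `em_gmm_1d`, the variance floor, and the log-likelihood stopping tolerance are assumptions, and the multivariate covariance matrix is reduced to a scalar variance.

```python
import math

def gaussian_pdf(x, mu, var):
    """Density of a 1-D Gaussian with mean mu and variance var."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def em_gmm_1d(data, alphas, mus, variances, max_iter=200, tol=1e-8):
    """Fit a 1-D Gaussian mixture by EM, starting from the given initial values."""
    alphas, mus, variances = list(alphas), list(mus), list(variances)
    n, K = len(data), len(mus)
    prev_ll = -math.inf
    for _ in range(max_iter):
        # E-step: responsibility p(i, k) of component k for sample i
        resp, ll = [], 0.0
        for x in data:
            weighted = [alphas[k] * gaussian_pdf(x, mus[k], variances[k]) for k in range(K)]
            total = sum(weighted)
            ll += math.log(total)
            resp.append([w / total for w in weighted])
        # M-step: re-estimate weight, mean and variance of each component
        for k in range(K):
            n_k = sum(r[k] for r in resp)
            alphas[k] = n_k / n
            mus[k] = sum(r[k] * x for r, x in zip(resp, data)) / n_k
            var_k = sum(r[k] * (x - mus[k]) ** 2 for r, x in zip(resp, data)) / n_k
            variances[k] = max(var_k, 1e-9)  # floor avoids a collapsed component
        # Stop once the log-likelihood no longer changes
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return alphas, mus, variances

# Two well-separated groups of made-up error samples
samples = [-1.1, -1.0, -0.9, 2.9, 3.0, 3.1]
weights, means, variances = em_gmm_1d(samples, [0.5, 0.5], [-0.5, 2.0], [1.0, 1.0])
```

EM converges to a local optimum only, so in practice the result depends on the initial values; running several initializations and keeping the fit with the highest likelihood is a common precaution.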

2.2.2 Gaussian mixture model

Gaussian mixture modeling is a semiparametric density estimation method that combines the advantages of parametric and nonparametric estimation and is not limited to a specific form of the probability density function. GMM can smoothly approximate density distributions of arbitrary shapes (Gu et al. 2023; Cao et al. 2021). Recently, GMM has often been used in speech recognition and wind power error calculation with good results (Hu et al. 2023; Nassif et al. 2021). GMM models are described in detail in the literature (Yang et al. 2016; Liu et al. 2023; Lu et al. 2019):

$$P(X \mid \theta)=\sum_{k=1}^{K} \alpha_{k} \varphi_{k}\left(X \mid \mu_{k}, \Sigma_{k}\right) \tag{13}$$
$$\varphi_{k}\left(X \mid \mu_{k}, \Sigma_{k}\right)=\frac{1}{(2 \pi)^{\frac{d}{2}}\left|\Sigma_{k}\right|^{\frac{1}{2}}} \exp \left\{-\frac{1}{2}\left(X-\mu_{k}\right)^{T} \Sigma_{k}^{-1}\left(X-\mu_{k}\right)\right\} \tag{14}$$

where K is the total number of Gaussian components; αk is the weight coefficient of the kth Gaussian distribution function, with αk ≥ 0 and \({\sum\nolimits_{{k=1}}^{K} \alpha _k}=1\); φk(X|µk, Σk) is the kth Gaussian distribution function; d is the dimension of X; µk and Σk are the mean vector and covariance matrix of the kth Gaussian model. The parameter set \(\theta =\{ {\alpha _k},{\mu _k},{\Sigma _k}\} _{{k=1}}^{K}\) is estimated by maximum likelihood. Each Gaussian model represents a cluster.

It is essential to accurately determine the value of K for GMM. This study utilizes the Calinski-Harabasz (CH) index to determine the number of clusters K for the Gaussian mixture distribution. The index considers several aspects of the clustering result at once. It measures the closeness within a class by the sum of squared distances between the points of the class and the class center (intra-class distance), and it measures the separation of the data set by the sum of squared distances between the center of each class and the center of the data set (inter-class distance). A higher value of the CH index indicates a better choice of K, that is, smaller intra-class covariances and larger inter-class covariances. The CH index can be calculated from Eq. (15).

$$S(K)=\frac{{tr({B_K})}}{{tr({W_K})}} \times \frac{{(m - K)}}{{(K - 1)}} \tag{15}$$

where m is the total number of samples; BK is the inter-cluster covariance matrix, \({B_K}=\sum\nolimits_{{q=1}}^{K} {{n_q}({c_q} - {c_e}){{({c_q} - {c_e})}^T}}\); WK is the intra-cluster covariance matrix, \({W_K}=\sum\nolimits_{{q=1}}^{K} {\sum\nolimits_{{x \in {C_q}}}^{{}} {(x - {c_q}){{(x - {c_q})}^T}} }\); tr denotes the trace of a matrix (the sum of the elements on its main diagonal); cq denotes the centroid of class q; ce denotes the centroid of the sample dataset; nq denotes the number of samples in class q; Cq denotes the sample dataset of class q.
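As a worked illustration of Eq. (15), the sketch below computes the CH index for a labelled set of points, using the fact that tr(BK) and tr(WK) reduce to sums of squared Euclidean distances. The function name `calinski_harabasz` and the toy points are hypothetical; in the proposed method the labels would come from the EM-GMM clustering for each candidate K.

```python
def calinski_harabasz(points, labels):
    """CH index S(K) = [tr(B_K) / tr(W_K)] * [(m - K) / (K - 1)]."""
    m, dim = len(points), len(points[0])
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    K = len(clusters)
    if K < 2 or K >= m:
        raise ValueError("CH index needs 2 <= K < number of samples")

    def centroid(pts):
        return [sum(p[d] for p in pts) / len(pts) for d in range(dim)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    c_e = centroid(points)            # centroid of the whole data set
    tr_b = tr_w = 0.0
    for pts in clusters.values():
        c_q = centroid(pts)           # centroid of class q
        tr_b += len(pts) * sq_dist(c_q, c_e)         # inter-cluster part
        tr_w += sum(sq_dist(p, c_q) for p in pts)    # intra-cluster part
    return (tr_b / tr_w) * (m - K) / (K - 1)

# Four toy points forming two obvious groups along the first axis
pts = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good = calinski_harabasz(pts, [0, 0, 1, 1])   # groups split along the first axis
poor = calinski_harabasz(pts, [0, 1, 0, 1])   # groups split along the second axis
```

For these points the natural grouping scores 50 and the poor grouping scores 0.08, which matches the rule that a higher CH value indicates a better partition.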

3 Proposed method

Without considering the influence of uncertainty factors (e.g., wind loads, temperature loads, sensor failures or calibration errors), SHMs cannot provide accurate early warning. Therefore, this study proposes a new method for dynamic early warning of bridge safety that considers the errors induced by uncertainty factors. An LSTM-EM-GMM hybrid model is proposed for real-time dynamic early warning of bridge structures. The evaluation indexes and the implementation process of the proposed method are presented in Evaluation index section and Implementation process section, respectively.

3.1 Evaluation index

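Generally, probabilistic predictions are represented by prediction intervals (PIs). In this paper, the prediction interval coverage probability (PICP), average coverage error (ACE), prediction interval normalized average width (PINAW) and coverage width-based criterion (CWC) are used to evaluate the quality of the prediction intervals produced by the proposed method. The specific calculation formulas are as follows (Xin et al. 2022; Khosravi et al. 2013):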

$$PICP=\frac{1}{{n^{\prime}}}\sum\limits_{{t=n - n^{\prime}+1}}^{n} {{\vartheta _t}} \tag{16}$$
$${\vartheta _t}=\left\{ {\begin{array}{*{20}{c}} {1,}&{{\text{if }}x(t) \in [{L_t},{U_t}]} \\ {0,}&{{\text{otherwise}}} \end{array}} \right. \tag{17}$$

where n′ is the number of samples over which the PIs are evaluated, x(t) is the measured value at the tth cycle, and [Lt, Ut] denotes the PI constructed at the tth cycle. The larger the PICP value, the more targets fall within the constructed PIs, and vice versa.

For a given prediction interval nominal confidence (PINC) level, the smaller the deviation between PINC and PICP, the better the quality of the constructed PIs. The deviation is defined as:

$$ACE=PICP - PINC \tag{18}$$

where ACE ≥ 0 means that the constructed PIs are reliable, and an ACE closer to 0 represents better quality. Therefore, the PIs are optimal when ACE = 0.

The PINAW is used to evaluate the width of the constructed PIs, which is shown as follows:

$$PINAW=\frac{1}{{n^{\prime}({x_{Max}} - {x_{Min}})}}\sum\limits_{{t=n - n^{\prime}+1}}^{n} {\left( {{U_t} - {L_t}} \right)} \tag{19}$$

where xMax and xMin are the maximum and minimum values in the test dataset, respectively. As for the different PIs with the same PICP, the narrower width indicates that the constructed PIs are more informative and competitive.

Unlike the above three metrics, CWC can balance both interval coverage probability and interval width. The expression of CWC is shown below:

$$CWC=PINAW \cdot \{ 1+\phi (ACE) \cdot \exp [ - \eta \cdot (ACE)]\} \tag{20}$$

where \(\eta\) denotes a hyperparameter, which usually takes the value of 50 (Wang et al. 2017). ACE is evaluated against PINC, the pre-set confidence level of the prediction interval, generally 95%. \(\phi (ACE)\) is a binary variable taking the value 0 or 1, which indicates whether the coverage requirement is met: it equals 0 when ACE ≥ 0, so that no penalty is applied, and 1 otherwise. A smaller value of CWC indicates higher quality of the constructed PIs.
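The four indexes can be computed together, as in the sketch below. It is a minimal illustration: the function name `interval_metrics` and the sample numbers are hypothetical, and the switch φ(ACE) is assumed to follow the common convention of 0 when ACE ≥ 0 and 1 otherwise.

```python
import math

def interval_metrics(actual, lower, upper, pinc=0.95, eta=50.0):
    """Return (PICP, ACE, PINAW, CWC) for prediction intervals [lower, upper]."""
    n = len(actual)
    covered = sum(1 for x, lo, hi in zip(actual, lower, upper) if lo <= x <= hi)
    picp = covered / n                                  # coverage probability
    ace = picp - pinc                                   # average coverage error
    data_range = max(actual) - min(actual)              # x_Max - x_Min
    pinaw = sum(hi - lo for lo, hi in zip(lower, upper)) / (n * data_range)
    phi = 0.0 if ace >= 0 else 1.0                      # assumed penalty switch
    cwc = pinaw * (1.0 + phi * math.exp(-eta * ace))    # width penalised for under-coverage
    return picp, ace, pinaw, cwc

# Made-up example: five targets, the last one outside its interval
actual = [0.0, 1.0, 2.0, 3.0, 4.0]
lower = [-1.0, 0.0, 1.0, 2.0, 5.0]
upper = [1.0, 2.0, 3.0, 4.0, 7.0]
picp, ace, pinaw, cwc = interval_metrics(actual, lower, upper)
```

In this example the coverage is 0.8, below the 0.95 target, so the exponential penalty inflates CWC far above PINAW. With the values reported in the field validation (PICP = 0.9833 at a 95% confidence level), ACE = 0.0333 ≥ 0, the penalty term vanishes, and CWC equals PINAW.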

3.2 Implementation process

The specific implementation process of the proposed method is as follows:

i) The bridge health monitoring data are divided into a training set (80%) and a validation set (20%). The prediction results for the validation set are then obtained using the LSTM model. The random influence error is obtained by subtracting the measured signals from the prediction results; it represents the uncertainty factors faced by bridge structures in complex service environments. The details of LSTM are provided in Long short-term memory section.

ii) To accurately assess the degree of deviation between the predicted and measured values, the EM-GMM model is used to cluster the random influence error together with the predicted values. The number of clusters K of the EM-GMM model is determined using the CH index, which helps to ensure the accuracy of the clustering results and to obtain the optimal joint probability density information. The EM-GMM model is described in Expectation maximization with Gaussian mixture model section.

iii) The probabilistic prediction results of LSTM-EM-GMM are utilized to calculate the dynamic warning interval at a 95% confidence level. The validity and reliability of the dynamic warning intervals are evaluated by four metrics: prediction interval coverage probability (PICP), average coverage error (ACE), prediction interval normalized average width (PINAW), and coverage width-based criterion (CWC). These four evaluation indexes are provided in Evaluation index section.

This paper has investigated the dynamic early warning of bridge safety. Through the above process, the bridge warning interval can be updated in real-time or regularly. It can ensure that the bridge warning intervals are always consistent with the actual bridge service condition, which will monitor the bridge condition more accurately. The overall framework of the LSTM-EM-GMM hybrid model is shown in Fig. 2.
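How a fitted error distribution can be turned into a warning interval is sketched below. The construction is an assumption made for illustration only, since the exact procedure is not spelled out here: the error is treated as a one-dimensional Gaussian mixture (ignoring its joint dependence on the predicted value), the interval bounds are the 2.5% and 97.5% quantiles of that mixture found by bisection, and the error is defined as prediction minus measurement. The function names are hypothetical.

```python
import math

def mixture_cdf(e, alphas, mus, sigmas):
    """CDF of a 1-D Gaussian mixture evaluated at e."""
    return sum(a * 0.5 * (1.0 + math.erf((e - m) / (s * math.sqrt(2.0))))
               for a, m, s in zip(alphas, mus, sigmas))

def mixture_quantile(p, alphas, mus, sigmas, tol=1e-10):
    """Invert the mixture CDF by bisection."""
    lo = min(m - 10.0 * s for m, s in zip(mus, sigmas))
    hi = max(m + 10.0 * s for m, s in zip(mus, sigmas))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid, alphas, mus, sigmas) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def warning_interval(prediction, alphas, mus, sigmas, confidence=0.95):
    """Interval for the measured value, given error = prediction - measurement."""
    tail = (1.0 - confidence) / 2.0
    e_low = mixture_quantile(tail, alphas, mus, sigmas)
    e_high = mixture_quantile(1.0 - tail, alphas, mus, sigmas)
    # measurement = prediction - error, so the error bounds swap roles
    return prediction - e_high, prediction - e_low

# Made-up two-component error model (mm) applied to one predicted deflection
lower, upper = warning_interval(-25.0, [0.6, 0.4], [2.0, -1.0], [0.8, 1.2])
```

Because the interval is recomputed from the current prediction and the latest fitted mixture, it moves with the data, which is what distinguishes it from a static threshold.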

Fig. 2 Flow of bridge safety dynamic warning method

4 Field validation

4.1 Engineering background

In this paper, a cable-stayed bridge with a total length of 1215.878 m was employed. The main bridge is a five-span cable-stayed bridge with unequal-height towers. The longitudinal arrangement of the bridge is (34.5 + 180.5 + 480 + 215.5 + 94.5) m, and the full width of the steel main girder is 23.6 m. The main girders of the bridge are in the form of a steel box superimposed. The bridge tower is a reinforced concrete structure and adopts the form of a portal-shaped bridge tower. The site layout of the bridge is shown in Fig. 3.

Fig. 3 The layout of the bridge site

The long-term health monitoring of the bridge includes three parts: load and environmental monitoring, static and dynamic response monitoring of the structure, and local response monitoring of the structure. It mainly contains environmental temperature and humidity, wind speed and direction, rainfall, water level monitoring, vibration, deformation, stress, cable force, steel structure fatigue, and other monitoring content. To verify the effectiveness of the proposed method, the deflection of the main girder from the Global Navigation Satellite System (GNSS) is analyzed. The arrangement of the GNSS monitoring location of the bridge is shown in Fig. 4.

Fig. 4 Layout of GNSS deformation monitoring points

4.2 Dynamic early warning

4.2.1 Acquisition of random influence error

The GNSS vertical deflection monitoring data were obtained from the bridge SHMs. The specific time selected was from 22:10 to 22:15 on February 14, 2023, with a sampling frequency of 1 Hz. The monitoring data is shown in Fig. 5.

Fig. 5 Measured data of mid-span deflection deformation

The collected deflection data were divided into a training set (80%) and a validation set (20%). The training set was used to train the LSTM and tune its hyperparameters. In this paper, the numbers of input layer nodes, hidden layer units, hidden layer nodes, and output layer nodes of the LSTM model are determined jointly from the established literature (Wang et al. 2022; Xin et al. 2023; Li et al. 2018) and experience; they are 29, 2, 15, and 1, respectively. The LSTM model also contains a fully connected layer, which receives the response data, and a regression layer. In addition, the performance of the optimization module is generally enhanced by selecting appropriate hyperparameters. In the LSTM model, the optimizer is Adam, the maximum number of epochs is 1000, the initial learning rate is 0.005, and the learning rate drop factor is 0.1. After training the LSTM, the predicted values of the training set and of the validation set were obtained, as shown in Fig. 6a and b, respectively.

Fig. 6 Comparison between measured and predicted values of deflection data. a Training set. b Validation set

It can be seen from Fig. 6a that the measured values align well with the predicted values. There is a certain degree of error, but it has only a small effect on the predicted results, because the measured and predicted values differ little in their statistical parameters (mean, variance, L1 norm and L2 norm). Specifically, the mean, variance, L1 norm and L2 norm of the measured values are −23.044 mm, 6.791 mm², 5.5307 and 372.124, respectively. Those of the predicted values are −22.970 mm, 6.152 mm², 5.5129 and 368.3423, respectively. These differences are small, which shows that the measured and predicted values are consistent in terms of overall trends and distributional characteristics. The random influence error is obtained as the difference between the measured and predicted values of the training set and validation set samples. It represents the uncertain factors in the external service environment, as shown in Fig. 7a and b.

Fig. 7 Random influence error of deflection data. a Training set. b Validation set

4.2.2 Analysis of the EM-GMM model

During the service life of a bridge, its safe operation and the related decisions are subject to a range of random loads and events (live loads, temperature, wind, missing sensor data, drift, etc.). These random loads and events change the structural response of the bridge, which in turn affects the prediction results. Hence, a correlation is assumed between the error value and the predicted value, and the joint probability density between the two variables can be established. The EM-GMM model was used to cluster the predicted values and the random influence error. For Gaussian mixture distribution clustering, the reliability and accuracy of the results can be improved by setting an appropriate K value. The number of clusters K of the Gaussian mixture distribution is determined with the CH index described in Expectation maximization with Gaussian mixture model section. The calculation results of the CH index are shown in Fig. 8.

Fig. 8 Optimal K value of CH index

As can be seen from Fig. 8, the CH index reaches its maximum at K = 2. Therefore, K is taken as 2. The CH index integrates the ratio of inter-cluster distance to intra-cluster distance, which reflects the separation and tightness of the clusters. The above calculation therefore supports the choice of K = 2 clusters for EM-GMM.

As can be seen from Fig. 9a, two clusters were obtained by GMM clustering of the deflection prediction data with the random influence error data. The centroid of cluster 1 is (-26.23 mm, 2.758 mm), which indicates a deflection prediction of -26.23 mm and a random influence error of 2.758 mm. Similarly, the centroid of cluster 2 is (-25.37 mm, -1.116 mm). Analysis of the two clusters shows that the data of cluster 1 lie closer to their centroid than those of cluster 2 and that their random influence error is smaller, so cluster 1 better describes the real change of bridge deflection. The data in cluster 2 lie farther from their centroid, which may be caused by increased external disturbances (e.g., multipath bias, temperature, etc.). EM-GMM classifies the sample data according to the probability of each sample under the different Gaussian distributions; nevertheless, it remains a density estimation method. Figure 9b shows that the regions of higher frequency for the predicted values and the random influence error in the joint probability density chart are consistent with the clustering results of EM-GMM. Thus, the probabilistic prediction associated with each random influence error can be obtained by combining LSTM with the EM-GMM method. The proposed method not only gives good access to the dynamic characteristics and probability density information of the deflection time series data, but also considers the randomness and probability of the data.

Fig. 9 Deflection prediction data and random influence error analysis. a GMM clustering results. b Joint probability density

4.2.3 Calculation of dynamic warning intervals

For analytical convenience, the 95% confidence level is taken to calculate the probabilistic prediction results from the proposed method. Subsequently, the corresponding probabilistic prediction interval can be obtained. The result is shown in Fig. 10.

Fig. 10 Location of measured deflection monitoring data at 95% confidence interval

Figure 10 shows that the measured deflection values are well enclosed by the 95% confidence intervals. This indicates that the proposed method has potential for dynamic early warning in bridge health monitoring. The dynamic warning intervals are evaluated by the PICP, ACE, PINAW and CWC metrics described in Evaluation index section. The results are shown in Table 1.

Table 1 Results of evaluation index

As can be seen from Table 1, the PICP value shows that the measured values are well contained in the early warning interval at the 95% confidence level, with an inclusion rate as high as 0.9833. The closer the PICP is to 1, the better the warning interval covers the actual monitoring data, which supports the high warning accuracy of the proposed method. The ACE value is 0.0333, which is consistent with the difference between the PICP and the nominal 95% confidence level; the closer it is to 0, the closer the actual coverage is to the nominal level. Meanwhile, the PINAW and CWC values are both 0.73392. Therefore, the dynamic warning intervals of the proposed method have excellent quality, high information content and strong competitiveness.
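The four indices can be computed as in the sketch below. It uses formulations that are common in the interval-prediction literature, assumed here because the definitions in the Implementation process section are not reproduced in this part of the text: PICP is the fraction of covered samples, ACE is PICP minus the nominal level, PINAW is the mean interval width normalized by the range of the measured data, and CWC equals PINAW multiplied by a penalty that is active only when PICP falls below the nominal level. The penalty coefficient `eta`, the toy data, and the function name `interval_metrics` are hypothetical.

```python
import math


def interval_metrics(y, lower, upper, pinc=0.95, eta=50.0):
    """Return (PICP, ACE, PINAW, CWC) for measured values y and interval bounds."""
    n = len(y)
    covered = sum(1 for yi, lo, hi in zip(y, lower, upper) if lo <= yi <= hi)
    picp = covered / n                       # coverage probability
    ace = picp - pinc                        # coverage error vs nominal level
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / n
    pinaw = mean_width / (max(y) - min(y))   # normalized average width
    gamma = 0.0 if picp >= pinc else 1.0     # penalty only for under-coverage
    cwc = pinaw * (1.0 + gamma * math.exp(-eta * (picp - pinc)))
    return picp, ace, pinaw, cwc


# Toy data: 60 samples with one point pushed outside the interval, chosen so
# that the toy coverage is 59/60. The values are illustrative, not bridge data.
pred = [-26.0 + 0.5 * math.sin(i / 5.0) for i in range(60)]
lower = [p - 1.0 for p in pred]
upper = [p + 1.0 for p in pred]
measured = list(pred)
measured[54] = pred[54] + 1.5

picp, ace, pinaw, cwc = interval_metrics(measured, lower, upper)
```

Under this formulation the penalty term vanishes whenever PICP is at least the nominal level, so CWC reduces to PINAW. That is consistent with the two indices sharing the same value in Table 1, although the source does not state which CWC variant was used.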

To further verify the accuracy of the dynamic warning intervals obtained by the proposed method, deflection monitoring data for 22:10–22:15 on February 15, 2023 were obtained from the bridge SHM system. The re-acquired measured deflection data for 22:14–22:15 were placed in the above warning interval. The comparison is shown in Fig. 11.

Fig. 11 Location of the re-acquired measured values in the dynamic warning interval

As can be seen from Fig. 11, some data near the 55th sample fall outside the warning interval, while the rest of the data are contained within it. Possible causes include changes in the service environment of the bridge, structural damage, and false alarms or failures of sensors. Timely and effective dynamic early warning therefore helps the management and maintenance units to inspect and maintain the bridge structure and its sensors, which helps guarantee the safety of the bridge structure. The field test indicates that the dynamic early warning interval calculated by the proposed method can realize dynamic real-time early warning of bridge structures.

4.3 Comparison with traditional methods

To further demonstrate the superiority of the proposed method, a finite element model of the bridge was established (see Fig. 12) for comparison with the early warning thresholds required by existing specifications. The load combinations and warning threshold settings of four industry specifications published in China are analyzed: the Code for Design on Railway Bridge and Culvert (TB 10002-2017) by the State Railway Administration, and the General Specifications for Design of Highway Bridges and Culverts (JTG D60-2015), the Specifications for Design of Highway Cable-stayed Bridge (JTG/T 3365-01), and the Technical Specifications for Structural Monitoring of Highway Bridges (JT/T 1037–2022) by the Ministry of Transport. These four specifications are referred to below as Specification 1, Specification 2, Specification 3, and Specification 4, respectively. Based on the requirements of Specifications 1 and 2, the design values under different load conditions are calculated. Finally, the validity and utility of the method are compared against Specifications 3 and 4.

Fig. 12 Finite element model of bridge

Because the bridge is a typical time-varying structure, the loads it sustains in service consist mainly of permanent and variable load effects. Permanent load effects mainly include the dead load of the bridge, the initial tension of the stay cables, shrinkage and creep of concrete, and so on. Variable loads mainly include temperature effects, wind load, moving load, etc., which have obvious time-varying characteristics (Li et al. 2023; Wang et al. 2022). Therefore, combinations of design loads such as temperature and humidity changes, wind loads and live loads should be fully considered in the serviceability limit states. It is worth mentioning that the load combination results of the finite element model do not consider the displacement under dead load. Accordingly, this paper calculated the response to each load based on Specifications 1 and 2 and then combined the results. The design values of different load combinations are shown in Table 2.

Table 2 Vertical deformation design values under different load combinations

As shown in Table 2, category 2 gives the maximum warning threshold interval of vertical deformation, which is [-479.73, 217.60] mm. The maximum warning threshold values of vertical deformation are then calculated according to Specifications 3 and 4 respectively, giving the maximum warning threshold levels of the different specifications. The calculation results are shown in Table 3.

Table 3 Vertical deformation exceedance thresholds of Specifications 3 and 4

The maximum warning thresholds of Specifications 3 and 4 were compared with the dynamic early warning interval of the proposed method, and the position of the measured data within the threshold intervals of Specifications 3 and 4 was obtained, as shown in Fig. 13.

Fig. 13 Comparison of dynamic warning interval with the warning interval of Specifications 3 and 4

Figure 13 shows that the early warning interval calculated according to Specification 3 is much wider than those of Specification 4 and the proposed method. By the time the actual condition of the bridge reaches the threshold boundary of Specification 3, the risk level of the bridge may already be high, which is not conducive to grasping the real-time service status of the bridge. The layered pre-warning intervals of Specification 4 are more effective than those of Specification 3, but they are still much wider than the dynamic warning intervals of the proposed method. Additionally, the local magnified view in Fig. 13 shows two data points exceeding the dynamic warning interval, for which neither Specification 3 nor Specification 4 issues a warning. This comparison demonstrates the effectiveness and warning capability of the proposed method.
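The difference between a fixed threshold and a dynamic interval can be illustrated with a small exceedance check. In this sketch the static bound reuses the category 2 interval quoted from Table 2 as a stand-in, because the Table 3 thresholds of Specifications 3 and 4 are not reproduced in this text. The measured values, the dynamic bounds, and the function name `exceedances` are hypothetical.

```python
def exceedances(values, lower, upper):
    """Indices of samples lying outside their [lower, upper] interval."""
    return [
        i
        for i, (v, lo, hi) in enumerate(zip(values, lower, upper))
        if v < lo or v > hi
    ]


# Static interval in mm, quoted from the text (Table 2, category 2).
STATIC_LOWER, STATIC_UPPER = -479.73, 217.60

# Hypothetical measured deflections (mm) and a hypothetical dynamic interval.
measured = [-26.1, -25.9, -26.3, -31.5, -26.0]
dyn_lower = [-28.0, -28.1, -28.2, -28.0, -27.9]
dyn_upper = [-24.0, -24.1, -24.2, -24.0, -23.9]

dynamic_alarms = exceedances(measured, dyn_lower, dyn_upper)
static_alarms = exceedances(
    measured,
    [STATIC_LOWER] * len(measured),
    [STATIC_UPPER] * len(measured),
)
```

With these toy numbers the fourth sample leaves the dynamic interval while remaining far inside the static one, which mirrors the behaviour described for the two out-of-interval points in Fig. 13.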

5 Conclusion

To grasp the service status of bridges in real time, this paper proposes a novel method for dynamic early warning of bridge safety using a hybrid LSTM-EM-GMM model. The effectiveness and superiority of the proposed method were verified through field tests and a comparison with established specifications. The conclusions can be summarized as follows:

  (1) The dynamic early warning interval of the proposed method was evaluated by PICP, ACE, PINAW, and CWC, whose calculated results were 0.9833, 0.0333, 0.73392, and 0.73392, respectively. These results demonstrate that the proposed method offers good probabilistic prediction and high-quality warning intervals. The SSE and CH index are used to determine the cluster number K of the Gaussian mixture distribution, which supports the accuracy of the clustering results. Furthermore, the joint probability density distribution of the predicted value and the random influence error was calculated by EM-GMM, and the probabilistic prediction data of deflection under each random influence error was obtained. This helps assess the degree of error between predicted and measured values in advance, thereby guiding early warning and decision-making.

  (2) The dynamic warning interval of the proposed method is more consistent with the actual service conditions of the bridge. The early warning intervals determined by the specifications were much wider than that of the proposed method. By the time the early warning threshold of the specifications is exceeded, the bridge structure may already have suffered serious performance deterioration. In contrast, the proposed method can provide earlier warning, so that measures can be taken to ensure the safe and stable operation of the bridge structure.

  (3) Because of the instability of bridge health monitoring data and the limitations of the model, traditional point prediction yields only a single estimated value with varying degrees of error, which makes it impossible to judge the reliability of the prediction or to evaluate the values near it. Dynamic interval prediction, by contrast, builds on point prediction and provides auxiliary information such as the confidence level and the width of the interval, which better assists the decision maker in grasping the development trend of the bridge state.

The proposed method can provide a reference for the service status assessment of bridge structures. However, further research is needed on how to grade the dynamic warning interval so that each level corresponds to a safety level of the bridge’s service status.

Availability of data and materials

The data used to support the findings of this study are available from the corresponding author upon request.




The support of the National Natural Science Foundation of China (Grant Nos. 52278292, 52108475, 52108435), Chongqing Outstanding Youth Science Foundation (Grant No. CSTB2023NSCQ-JQX0029), Chongqing Transportation Science and Technology Project (Grant No. 2022-01), China Postdoctoral Science Foundation (Grant No. 2023M730431), Special Funding of Chongqing Postdoctoral Research Project (Grant No. 2022CQBSHTB2053), Science and Technology Project of Guizhou Department of Transportation (Grant No. 2023-122-001), and Chongqing Jiaotong University Postgraduate Research and Innovation Project (CYB23246) is gratefully acknowledged.


This work was funded by the National Natural Science Foundation of China (Grant Nos. 52278292, 52108475, 52108435), Chongqing Outstanding Youth Science Foundation (Grant No. CSTB2023NSCQ-JQX0029), Chongqing Transportation Science and Technology Project (Grant No. 2022-01), China Postdoctoral Science Foundation (Grant No. 2023M730431), Special Funding of Chongqing Postdoctoral Research Project (Grant No. 2022CQBSHTB2053), Science and Technology Project of Guizhou Department of Transportation (Grant No. 2023-122-001), Chongqing Natural Science Foundation of China (CSTB2022TIAD-KPX0205), and Chongqing Jiaotong University Postgraduate Research and Innovation Project (CYB23246).

Author information

Authors and Affiliations



Shuangjiang Li: Conceptualization, Methodology, Software, Validation, Writing-original draft, Writing-review & editing. Jingzhou Xin: Validation, Formal analysis, Supervision, Funding acquisition. Yan Jiang: Funding acquisition, Formal analysis, Writing-review & editing. Changxi Yang: Validation, Formal analysis, Visualization, Writing-review & editing. Xiaochen Wang: Validation, Formal analysis, Visualization, Software. Bingchuan Ran: Formal analysis, Writing-review & editing, Supervision.

Corresponding author

Correspondence to Jingzhou Xin.

Ethics declarations

Competing interests

All authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit


About this article


Cite this article

Li, S., Xin, J., Jiang, Y. et al. A novel hybrid model for bridge dynamic early warning using LSTM-EM-GMM. ABEN 5, 8 (2024).
