
The application of deep learning in bridge health monitoring: a literature review


Along with the advancement in sensing and communication technologies, the explosion in the measurement data collected by structural health monitoring (SHM) systems installed in bridges brings both opportunities and challenges to the engineering community for the SHM of bridges. Deep learning (DL), based on deep neural networks and equipped with high-end computer resources, provides a promising way of using big measurement data to address the problem and has achieved remarkable success in recent years. This paper reviews the recent application of DL in SHM, particularly damage detection, and provides readers with an overall understanding of the tasks faced by the SHM of bridges. The general studies of DL in vibration-based SHM and vision-based SHM are reviewed first. The applications of DL to some real bridges are then discussed. Finally, a summary of limitations and prospects in the DL application for bridge health monitoring is given.

1 Introduction

Deterioration inevitably accumulates over the service life of bridges subjected to harsh environments, and the failure of bridges results in considerable losses of both human life and property. Monitoring the condition of bridges and detecting their damage are essential to ensure their serviceability and safety. Traditionally, visual inspection conducted by experienced inspectors is the main method adopted for this task (Xu and Xia 2012). Nevertheless, visual inspection is labor-intensive, time-consuming, and subjective, and it can hardly reflect changes in the actual structural condition in time (Sun et al 2020). Therefore, structural health monitoring (SHM) systems have been developed and installed on some bridges with the aim of detecting structural damage or degradation in a timely manner (Housner et al 1997).

SHM relies on sensing and communication technologies, and recent advances in both make it possible to acquire monitoring data at unprecedented speed and volume. Analyzing the accumulated monitoring data has naturally become the priority of SHM research. The methods developed for analyzing the monitoring data fall into two categories: model-based methods and data-driven methods (Sun et al 2020). The former update the finite-element model (FEM) of the undamaged bridge in terms of some key parameters against the measurement data, and differences between the model predictions and the measurements indicate the existence of damage (Xiao et al 2015; Zhu et al 2015). However, this is a hard task because of the simplifying assumptions made when modeling bridge structures and the uncertainties of material and geometric properties (Sun et al 2020). Data-driven methods regard the task as a statistical pattern recognition problem (Farrar and Worden 2012) and have been widely applied, but their complexity and computation requirements are generally of polynomial order with respect to data size (Sun et al 2020). Moreover, computer vision (CV) technologies are also used to detect local damage, such as cracks, spalling, delamination, and rust, and to extract global information, such as displacement, acceleration, and loads, from images or videos captured by cameras. CV technologies usually face disturbances caused by light, distortion, weather, and occlusion in the outdoor environment.

The appearance of machine learning (ML) provides a possible solution to the problems mentioned above. As a branch of artificial intelligence, ML aims to develop trainable algorithms that learn from data, based on which predictions can be made (Pan et al 2017; Pan et al 2018). The artificial neural network (ANN) is a classical ML method and has been applied to civil engineering since 1989 (Adeli and Yeh 1989). However, using ML requires knowledge and experience in designing features for a specific SHM task, which may not be practical as the monitored systems become more complex (Zhao et al 2019). In recent years, along with the significant improvement of network architecture and computing capacity, deep learning (DL), which aims to automatically extract features from raw data via stacked blocks of deep neural network (DNN) layers (Cha et al 2018; Mosalam et al 2019), has drawn researchers’ attention and has been successfully applied in various areas including CV, natural language processing, and audio recognition. Each layer in a DL model learns a new feature from the data, and thus DL is an end-to-end system that does not need human intervention in the design of features, which makes DL-based SHM widely applicable with minimum knowledge about the specific features (Azimi et al 2020). Some DL models, such as the fully connected neural network (FCN), long short-term memory network (LSTM), convolutional neural network (CNN), autoencoder (AE), deep belief network (DBN), deep Boltzmann machine, and generative adversarial network (GAN), have already shown their reliability in analyzing vibration data. The efforts to improve the robustness and generalization of CV techniques using DL have also obtained desirable achievements, which accelerates the development of vision-based SHM.

Although there are several reviews about recent advances in SHM (Ahmed et al 2020; Azimi et al 2020; Jeong et al 2020; Sun et al 2020; Bao and Li 2021; Dong and Catbas 2021; Pal et al 2021; Sofi et al 2022; Zhang et al 2022a), this article focuses on the application of DL in bridge health monitoring in the last 4 years and tries to provide promising directions after summarizing current challenges and trends. The remainder of this paper is organized as follows: Sections 2 and 3 summarize the general studies of DL in vibration- and vision-based SHM, respectively. In Section 4, the SHM systems with DL successfully applied in practice are listed and discussed. Finally, this paper ends with a summary of current limitations and some directions that need to be noted and pursued.

2 Vibration-based structural health monitoring

Vibration-based SHM has been investigated since the 1990s owing to its advantages, including the global coverage of structural topology and the natural availability of vibration signals (Li et al 2015). All of the methods in this area are founded on the premise that any damage occurring in a structure will result in changes in its vibration signals (Xu 2018), which makes it viable to identify and locate damage by monitoring structural vibration signals.

For data-driven methods in this area, damage identification can be treated as a statistical pattern recognition problem (Farrar and Worden 2012). Some characteristic indexes are selected and extracted from the measured data, based on which the state of the structure is classified into the scenario with the closest values. These methods have been widely applied, but some shortcomings should not be ignored. For instance, because of the computational burden of big data, the traditional methods usually consider small datasets that are assumed to be sampled from a particular distribution (Sun et al 2020). Another drawback is that their complexity and memory requirements are generally of polynomial order with respect to data size (James et al 2011).

As an end-to-end system, DL provides a powerful solution to these challenges and has gained much attention from the engineering community because of its excellent capability in extracting features from raw data. This section reviews the application of DL in data-driven SHM methods from the perspectives of data preprocessing, damage classification, and data novelty detection. The studies that take spatial information into account, which is essential but was not emphasized before, are summarized at the end of this section.

2.1 Data preprocessing

2.1.1 Anomaly identification

At present, the various sensors in an SHM system are the main source of vibration data. However, sensor failure, transmission interruption, and so forth inevitably corrupt the data, which seriously affects the data analysis results. To detect structural damage and assess structural condition correctly, identifying and eliminating anomalies in the monitored signals is the first task.

Anomaly identification for vibration signals can be treated as time series classification, for which the effectiveness of the one-dimensional (1D) CNN has been demonstrated by some researchers (Jian et al 2021; Zhang and Lei 2021). Visualizing the time signals as images is an effective way to leverage the two-dimensional (2D) CNN for this task. Drawing the signal curves directly is the most frequently adopted method, while other feature engineering approaches, such as spectrogram analysis and the probability density function, have also been employed to enhance the network’s robustness (Jian et al 2021; Shajihan et al 2022). For example, Tang et al (2019) visualized time series data in the time and frequency domains and stacked them as a single dual-channel image before it was input into a 2D CNN to classify data anomalies (see Fig. 1).

Fig. 1 The anomaly detection method proposed by Tang et al (2019)
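The dual-channel idea can be illustrated with a minimal sketch. It is not the implementation of Tang et al (2019): the helper names `rasterize_curve` and `dual_channel_image`, the 32 × 32 image size, and the 100 Hz test signal are all assumptions made here for illustration. Each channel is a binary rendering of a curve, one for the time history and one for its amplitude spectrum.

```python
import numpy as np

def rasterize_curve(values, height=32, width=32):
    """Draw a 1D curve as a binary height x width image (1 = curve pixel)."""
    values = np.asarray(values, dtype=float)
    # Resample the curve so that there is one value per image column.
    cols = np.linspace(0, len(values) - 1, width)
    resampled = np.interp(cols, np.arange(len(values)), values)
    # Scale amplitudes to row indices; a flat signal maps to the middle row.
    span = resampled.max() - resampled.min()
    if span == 0:
        rows = np.full(width, height // 2)
    else:
        rows = np.round((resampled - resampled.min()) / span * (height - 1)).astype(int)
    image = np.zeros((height, width), dtype=np.uint8)
    image[height - 1 - rows, np.arange(width)] = 1  # flip so larger values sit higher
    return image

def dual_channel_image(signal, height=32, width=32):
    """Stack time-domain and frequency-domain renderings as a 2-channel image."""
    signal = np.asarray(signal, dtype=float)
    spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
    return np.stack([rasterize_curve(signal, height, width),
                     rasterize_curve(spectrum, height, width)])

t = np.arange(1024) / 100.0            # hypothetical 100 Hz sampling
segment = np.sin(2 * np.pi * 2.0 * t)  # hypothetical 2 Hz response
image = dual_channel_image(segment)    # array of shape (2, 32, 32)
```

An array of this shape could then be passed to any 2D CNN that accepts two input channels.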

It is noteworthy that imbalanced time series are common in practical engineering, as normal data usually far outnumber abnormal data. It is difficult for a CNN to reach high classification accuracy in class-imbalanced situations since it is based on the class-balance hypothesis (Yin and Gai 2015). To overcome the problem, Liu et al (2022) developed a GAN- and CNN-based data anomaly detection framework, which includes three modules: (i) a three-channel input based on visualization, fast Fourier transform (FFT), and Gramian angular field of the time series signals; (ii) a GAN trained to extract features from normal samples; and (iii) a CNN employed to distinguish the types of anomalies. Adopting the focal loss function is another way to soften the classification bias induced by class imbalance (Du et al 2022).
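To show why the focal loss helps with imbalanced classes, the following sketch evaluates its standard form, FL = -α(1 - p_t)^γ log(p_t), for a single sample. The values γ = 2 and α = 1 and the example probabilities are assumptions chosen for illustration, not the settings of Du et al (2022).

```python
import math

def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal loss for one sample: -alpha * (1 - p_t)**gamma * log(p_t)."""
    p_t = min(max(probs[target], 1e-12), 1.0)  # clip to avoid log(0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)

# A confidently classified (easy) sample and a misclassified (hard) sample.
easy = focal_loss([0.95, 0.03, 0.02], target=0)
hard = focal_loss([0.30, 0.60, 0.10], target=0)
# With gamma = 0 the focal loss reduces to the ordinary cross-entropy.
cross_entropy_easy = focal_loss([0.95, 0.03, 0.02], target=0, gamma=0.0)
```

The easy sample, which typically belongs to the abundant normal class, is down-weighted relative to cross-entropy, so the hard samples dominate the gradient.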

Although satisfactory performance has been achieved in classifying anomaly patterns, differentiating sensor faults from structural damage had not been considered until Li et al (2021a) proposed an isolation strategy. In the proposed strategy, a fully connected stateful LSTM network, which was a modification of the LSTM with added fully connected layers, was used to predict the acceleration signals of the selected sensor, and the residual between the predicted and measured values was regarded as an anomaly index. An anomaly occurring on all sensors of a substructure indicates the existence of structural damage; otherwise, a fault in one of the sensors is indicated.
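The isolation logic described above can be sketched as a simple decision rule. The root-mean-square form of the anomaly index, the threshold value, and the sensor names below are assumptions for illustration; the predictions themselves would come from the trained network, which is not reproduced here.

```python
def anomaly_index(measured, predicted):
    """Root-mean-square residual between measured and predicted responses."""
    residuals = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return (sum(residuals) / len(residuals)) ** 0.5

def isolate(indices_by_sensor, threshold):
    """Classify a substructure from the anomaly indices of its sensors."""
    flagged = sorted(s for s, idx in indices_by_sensor.items() if idx > threshold)
    if not flagged:
        return "normal", flagged
    if len(flagged) == len(indices_by_sensor):
        # Every sensor of the substructure is anomalous: attribute it to the structure.
        return "structural damage", flagged
    # Only some sensors are anomalous: attribute it to the sensors themselves.
    return "sensor fault", flagged

# Hypothetical anomaly indices for three accelerometers on one substructure.
state, sensors = isolate({"A1": 0.9, "A2": 0.1, "A3": 0.2}, threshold=0.5)
```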

2.1.2 Missing data recovery

Sensor failure and communication errors can lead to data loss, and anomalous signals are often treated as missing data as well; too much missing data has an unfavorable influence on SHM results. Correlation analysis between different sensors, built by the partial least squares method (Lu et al 2017), nonparametric copulas (Chen et al 2019), and so on, is a frequently used method to recover the missing data. The potential of DL in mining the relationship between inputs and making predictions also makes it widely adopted for this task.

The strain data measured before the occurrence of data loss was converted into a grayscale image and then used to train a CNN so that the net could recover the strain responses of the failed sensors according to the remaining sensors’ data (Byung et al 2020). Fan et al (2019) proposed a novel CNN architecture with bottleneck architecture and skip connection to construct the nonlinear relationships between the incomplete signal and the complete true signal and proved its outstanding capability for data recovery, even when the signals have severe data loss ratios up to 90%. Liu et al (2020) verified the accuracy of LSTM in recovering temperature data and demonstrated that incorporating more intact sensor data and selecting the sensor data highly correlating with the missing data as the input would further improve the recovery accuracy. Li et al (2021b) proposed a “divide and conquer” strategy for this mission. The core concept of the strategy was the prediction of the subsequences of the measured data, which were decomposed by empirical mode decomposition (EMD), rather than directly predicting the time series, as the decomposition could assist in the modeling of the irregular periodic changes of the measured signals using LSTM.
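The observation that inputs highly correlated with the missing channel improve recovery suggests a simple pre-selection step, sketched below. The function name `select_correlated_sensors`, the use of the absolute Pearson correlation, and the synthetic four-sensor data are assumptions for illustration rather than the procedure of any cited study.

```python
import numpy as np

def select_correlated_sensors(data, target, k=2):
    """Return indices of the k sensors most correlated with column `target`.

    data: array of shape (n_samples, n_sensors) from a period without data loss.
    """
    corr = np.abs(np.corrcoef(data, rowvar=False)[target])
    corr[target] = -np.inf  # never select the target sensor itself
    return np.argsort(corr)[::-1][:k].tolist()

rng = np.random.default_rng(0)
base = rng.normal(size=500)
data = np.column_stack([
    base,                                # sensor 0: the channel to recover
    base + 0.1 * rng.normal(size=500),   # sensor 1: strongly correlated
    rng.normal(size=500),                # sensor 2: unrelated
    -base + 0.5 * rng.normal(size=500),  # sensor 3: anti-correlated
])
inputs = select_correlated_sensors(data, target=0, k=2)
```

The selected columns would then serve as the input of the recovery model, while the unrelated sensor is left out.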

In addition to the spatial correlations among sensors, Jeong et al (2019) considered temporal correlations and developed a bidirectional recurrent neural network that uses spatiotemporal correlations to recover missing data. Jiang et al (2021a) also used a GAN to directly compute the missing data from the remaining observed data, with the spatial-temporal relationship considered.

2.2 Damage scenario classification

Simulating damage scenarios composed of different locations and discrete levels in FEM or experimental structures is the method most frequently adopted in recent studies to generate training samples, based on which various DL models have been trained to classify signals from a bridge in an unknown state. Ibrahim et al (2020) investigated the impact of noise on the performance of several ML algorithms in classifying structural damage severity according to acceleration data and demonstrated that CNN could resist noise better than the K-nearest neighbor, support vector machine, and traditional high-pass filter noise cancellation methods.

Integrating vibration signals from multi-sensors into a multi-channel time sequence and segmenting it via a sliding window is one of the approaches to leverage 2D CNN in this area (Khodabandehlou et al 2019; Lee et al 2020a). Teng et al (2020) combined acceleration signals collected by 13 accelerometers and inputted them into 2D CNN to conduct structural damage identification (see Fig. 2). The CNN trained using finite element analysis data reached 94% accuracy for damages in the numerical model and 90% for damages in the real steel frame.

Fig. 2 The time series integration scheme adopted by Teng et al (2020)
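The sliding-window segmentation of a multi-channel record can be sketched as follows. The window length, stride, and the synthetic 20 × 3 record are assumptions for illustration; each resulting window is a small 2D array (time × sensor) that a 2D CNN can treat as an image.

```python
import numpy as np

def sliding_windows(records, window, stride):
    """Cut a (n_samples, n_sensors) record into (n_windows, window, n_sensors)."""
    records = np.asarray(records)
    starts = range(0, records.shape[0] - window + 1, stride)
    return np.stack([records[s:s + window] for s in starts])

# Hypothetical record: 20 time steps from 3 accelerometers.
records = np.arange(20 * 3).reshape(20, 3)
windows = sliding_windows(records, window=8, stride=4)  # shape (4, 8, 3)
```

A stride smaller than the window length produces overlapping samples, which enlarges the training set at the cost of correlated samples.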

Encoding time series into images through various algorithms, such as wavelet transform (WT) (Mangalathu and Jeon 2020), continuous wavelet transform (Chen et al 2021), FFT (He et al 2021a), and Fourier amplitude spectra (Duan et al 2019), is another way to employ 2D CNN. Mantawy and Mantawy (2022) encoded time-series data, including accelerations, drift ratios, and both, into images using three approaches: Gramian angular summation field, Gramian angular difference field, and Markov transition field (MTF) (see Fig. 3). The comparison showed that the CNN trained on MTF-encoded images reached 100% accuracy during the training phase and more than 94% during the testing phase.

Fig. 3 Time series encoding scheme adopted by Mantawy and Mantawy (2022)
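Two of the three encodings have compact closed forms, which the sketch below follows: the series is rescaled to [-1, 1], mapped to angles φ = arccos(x), and the Gramian angular summation and difference fields are cos(φ_i + φ_j) and sin(φ_i - φ_j). The function name and the four-point example are assumptions for illustration; the Markov transition field is not covered here.

```python
import numpy as np

def gramian_angular_fields(series):
    """Return (GASF, GADF) images of a 1D series."""
    series = np.asarray(series, dtype=float)
    lo, hi = series.min(), series.max()
    # Rescale to [-1, 1]; this assumes the series is not constant (hi > lo).
    scaled = np.clip(2.0 * (series - lo) / (hi - lo) - 1.0, -1.0, 1.0)
    phi = np.arccos(scaled)  # polar-angle encoding
    gasf = np.cos(phi[:, None] + phi[None, :])
    gadf = np.sin(phi[:, None] - phi[None, :])
    return gasf, gadf

x = np.array([0.0, 1.0, 3.0, 2.0])  # hypothetical short record
gasf, gadf = gramian_angular_fields(x)
```

A series of length n gives n × n images, so long records are usually windowed or downsampled before encoding.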

Considering that the overall dynamic vibration of a structure may be insensitive to local damage, He et al (2021b) employed wavelet packet transform to extract more sensitive damage features from structural acceleration signals and used recurrence analysis to obtain the periodicity, non-stationarity, and chaos of the signals, the result of which can be visualized by a recurrence graph. Then, the recurrence graph was fed into a CNN to classify the structural damage conditions. Compared with traditional methods, the proposed method showed excellent accuracy in identifying the location and degree of minor damage.
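A recurrence graph can be computed from a single record in a few lines. The sketch below uses a time-delay embedding followed by pairwise Euclidean distances; the embedding dimension, delay, threshold, and the sinusoidal test signal are assumptions for illustration and do not reproduce the settings of He et al (2021b).

```python
import numpy as np

def recurrence_matrix(series, dim=3, delay=1, threshold=None):
    """Pairwise distances between delay-embedded states of a 1D series.

    With a threshold, a binary recurrence plot is returned instead.
    """
    series = np.asarray(series, dtype=float)
    n = len(series) - (dim - 1) * delay
    # Time-delay embedding: row k is (x[k], x[k + delay], ..., x[k + (dim-1)*delay]).
    states = np.stack([series[i * delay:i * delay + n] for i in range(dim)], axis=1)
    distances = np.linalg.norm(states[:, None, :] - states[None, :, :], axis=-1)
    if threshold is None:
        return distances
    return (distances <= threshold).astype(np.uint8)

t = np.arange(100)
signal = np.sin(2 * np.pi * t / 20)  # hypothetical response with a period of 20 samples
distances = recurrence_matrix(signal)
plot = recurrence_matrix(signal, threshold=0.1)
```

For a periodic signal the plot shows diagonal lines spaced one period apart; changes in this texture are what the CNN is expected to pick up.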

Most of the methods mentioned above rely on the stationary assumption, which fails in practice since non-stationary ambient excitations are inevitable. Li et al (2021c) proposed a new recurrence plot, named un-threshold assembled recurrence distance matrix, to reveal intrinsic dynamic characteristics of the structure (see Fig. 4). Different from traditional single-label model that regards each combination of damage location and level as one objective class, they developed a multi-label CNN to decouple the identification process of damage location and levels. Every sub-branch of the net was trained using an independent dataset to evaluate the damage level at each location before the damage location was identified by fusing information from all of the sub-branches.

Fig. 4 Flowchart of structural damage identification using the multi-label CNN model (Li et al 2021c)

Inspired by the excellent performance of 2D CNN as shown above, 1D CNN was also employed to detect tiny local structural stiffness and mass changes according to the acceleration records from a single sensor and achieved perfect performance (Zhang et al 2019a; Sharma and Sen 2020). For example, Teng et al (2021) trained seven 1D CNNs using the acceleration signals collected by corresponding sensors and fused all of their classification results at the decision level to obtain the integrated detection results. Compared with data-level fusion, in which all acceleration signals were integrated into a multi-channel time sequence, the decision-level fusion improved the classification accuracy by 10% and 16–30% for the numerical and experimental models, respectively.
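Decision-level fusion can take several forms. The sketch below averages the class probabilities produced by the per-sensor classifiers and selects the most probable class; this averaging rule and the example probabilities are assumptions for illustration, and the fusion rule actually used by Teng et al (2021) may differ.

```python
def fuse_decisions(prob_by_sensor):
    """Average class probabilities across sensors and pick the best class."""
    n_classes = len(prob_by_sensor[0])
    mean_probs = [sum(p[c] for p in prob_by_sensor) / len(prob_by_sensor)
                  for c in range(n_classes)]
    label = max(range(n_classes), key=mean_probs.__getitem__)
    return label, mean_probs

# Hypothetical softmax outputs of three single-sensor 1D CNNs for three damage classes.
label, mean_probs = fuse_decisions([
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.7, 0.2],
])
```

Here the first sensor alone would vote for class 0, but the fused decision is class 1.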

Besides CNN, other DL models, such as the deep residual network (Alazzawi and Wang 2022), ANN (Hormozabad and Soto 2021), recurrent neural network (RNN) (Jena and Parhi 2020), and stacked autoencoder (SAE) (Silva et al 2021), can also serve this purpose. A sequence of windowed samples extracted from acceleration responses was used to train an LSTM for damage scenario classification by Sony et al (2022) (see Fig. 5). The experimental results demonstrated that the method outperformed 1D CNN on the Z24 bridge. Xiao et al (2021) optimized a deep autoencoder (DAE) using gray relational analysis to extract high-level features from raw signals, according to which a Softmax classifier was trained for the classification. Considering the difficulties in optimizing the weights of deep neural networks, Pathirage et al (2018) developed a framework with two components for this task. The first component was used to reduce the dimensionality of the vibration signals while preserving the necessary information, and the second was used to learn the relationship between the features and the damage. Rastin et al (2021a) presented a two-stage method for this task, in which a deep convolutional GAN trained using the intact-state data was first used to detect the existence of damage, which could be quantified by the discriminator’s output. The detected damage was then localized via a conditional GAN trained with labeled data from damaged states.

Fig. 5 The sequence of windowed samples extracted for the training of LSTM (Sony et al 2022)

Other structural modal information, such as natural frequencies and mode shapes (Pathirage et al 2019; Wang et al 2021), is also sensitive to damage. Yang and Huang (2021) introduced the flexibility curvature index, which does not need the information of intact structures, as the input of a CNN to realize damage identification. Nguyen et al (2020) trained a CNN using the images from the damage index of the gapped smoothing method to classify the damage location in a numerical beam.

To solve the problem of poor anti-noise ability faced by traditional methods, Guo et al (2020) developed a damage identification method based on DBN. After three restricted Boltzmann machines were pre-trained using the damage index, modal curvature difference, a Softmax classifier and a neural network were employed to identify the damage location and degree, respectively. The experimental results showed that DBN had strong anti-noise ability, compared with backpropagation neural networks.

Fig. 6 Flowchart of the novelty detection strategy proposed by Mousavi and Gandomi (2021b)

Compared with the number of degrees of freedom of a structure, the number of sensors in an SHM system is always finite and often insufficient. The continuous deflection of a bridge measured by a fiber-optic gyroscope, which could cover the whole structure, was thus used, and a 1D CNN was employed to analyze it for damage classification by Li et al (2020) and Li and Sun (2020). Distributed optical fiber sensors based on Brillouin optical time-domain analysis technology are well suited to measuring strain distributions along the whole surface of structures, but their low signal-to-noise ratio has limited their application in crack detection. Song et al (2020) thus employed an SAE to extract features from the raw data and trained a Softmax classifier to decide whether micro-cracks exist.

2.3 Novelty detection and quantification

Despite the success achieved by DL models in damage scenario classification, the lack of training data restricts their application in practice. Preparing sufficient training data is laborious and uneconomical in the laboratory and impossible in real engineering. Generally, only normal vibration data can be obtained from new structures, since damage cannot be applied to structures under commercial operation, which encourages the application of unsupervised learning. In this kind of method, DL models are trained to reconstruct the vibration signals. Because only signals from intact structures are used for training, the reconstructed signals deviate from the measurements when damage exists, which means that the reconstruction error is a sensitive feature indicating the existence of damage.
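The reconstruction-error principle can be demonstrated without a deep network. In the sketch below, principal component analysis stands in for an autoencoder as a linear encoder and decoder, which is a deliberate simplification and not the model of any cited study; the synthetic eight-channel data, the 99th-percentile threshold, and the offset used to mimic damage are likewise assumptions.

```python
import numpy as np

class LinearReconstructor:
    """PCA used as a linear stand-in for an autoencoder (an assumption)."""

    def __init__(self, n_components):
        self.n_components = n_components

    def fit(self, healthy):
        self.mean_ = healthy.mean(axis=0)
        _, _, vt = np.linalg.svd(healthy - self.mean_, full_matrices=False)
        self.basis_ = vt[:self.n_components]  # rows span the "healthy" subspace
        return self

    def error(self, samples):
        centered = samples - self.mean_
        reconstructed = centered @ self.basis_.T @ self.basis_
        return np.linalg.norm(centered - reconstructed, axis=1)

rng = np.random.default_rng(1)
mixing = rng.normal(size=(2, 8))  # hypothetical map from 2 latent modes to 8 sensors
healthy = rng.normal(size=(400, 2)) @ mixing + 0.01 * rng.normal(size=(400, 8))
model = LinearReconstructor(n_components=2).fit(healthy)
threshold = np.percentile(model.error(healthy), 99)  # set from healthy data only

damaged = rng.normal(size=(50, 2)) @ mixing
damaged[:, 0] += 1.0  # hypothetical change that breaks the healthy sensor pattern
flags = model.error(damaged) > threshold
```

Only healthy data are needed to fit the model and set the threshold, which is the practical appeal of this family of methods.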

After seasonal patterns were removed by variational mode decomposition (VMD) algorithm, Mousavi and Gandomi (2021a) used the natural frequency and corresponding Johansen cointegration residuals of a structure to train RNN, and the prediction errors for new measurements were regarded as the index of damages (see Figs. 6 and 7). They also trained a bidirectional LSTM with healthy structure signals denoised by VMD and their Mahalanobis distances calculated by minimum covariance determinant for this mission (Mousavi and Gandomi 2021b). The method required only a couple of low structural natural frequencies. Therefore, it is recommended for cases when the measurements from the environmental and operational variations are not available.

Fig. 7 Prediction results and errors in the numerical example conducted by Mousavi and Gandomi (2021b)
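The Mahalanobis distance used as a novelty feature above can be computed as sketched below. For brevity the sketch uses the ordinary sample mean and covariance rather than the minimum covariance determinant estimator mentioned in the text, and the two natural frequencies and their scatter are hypothetical values.

```python
import numpy as np

def mahalanobis_distances(reference, samples):
    """Mahalanobis distance of each sample from the reference (healthy) cloud."""
    mean = reference.mean(axis=0)
    cov = np.cov(reference, rowvar=False)  # plain sample covariance, not MCD
    diff = samples - mean
    solved = np.linalg.solve(cov, diff.T).T  # cov^-1 @ diff without an explicit inverse
    return np.sqrt(np.einsum("ij,ij->i", diff, solved))

rng = np.random.default_rng(2)
# Hypothetical first two natural frequencies (Hz) identified in the healthy state.
healthy = rng.normal(loc=[2.0, 5.5], scale=[0.02, 0.05], size=(300, 2))
shifted = np.array([[1.9, 5.3]])  # hypothetical frequencies after a stiffness loss
index = mahalanobis_distances(healthy, shifted)
```

A robust covariance estimate would matter mainly when the healthy record itself contains outliers.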

Apart from the reconstruction errors, Lee et al (2021) trained a one-class CNN to detect novelty in acceleration data that had been transformed into images through WT. It was found that the minimum damage the method could detect was a 15% reduction of the stiffness. Based on the essential features extracted from the acceleration history by a variational autoencoder, Ma et al (2020) adopted the Euclidean distance between the features of the first segment and those of the other segments as the damage index, whose curve could be used to observe whether there was a sudden change caused by damage.

To quantify damage, Rastin et al (2021b) trained a convolutional autoencoder using the multi-channel signals acquired from a healthy structure to extract sensitive features and calculated the distance between the features and the reference vectors, but a threshold for the distance needs to be specified according to engineers’ experience. Silva et al (2019) trained an AE to eliminate the influence of environmental factors, and then the structure damage was quantified by calculating the residual between its inputs and outputs.

2.4 The function of spatial information

The methods mentioned above use either spatial relations (e.g., using CNN) or temporal relations (e.g., using LSTM) alone, whereas combining the two may improve the damage identification accuracy significantly. CNN and the gated recurrent unit (GRU) were combined by Yang et al (2020) to model both spatial and temporal relations for damage detection, and the resulting improvement was demonstrated. The CNN was utilized to model the spatial relations and the short-term temporal dependency among sensors, while its output features were fed into the GRU to learn the long-term temporal dependency. Fu et al (2021) fused the features extracted by CNN and LSTM through an FCN for bridge damage scenario classification (see Fig. 8). The combined model, named CNN-LSTM, reached 94% accuracy for damage localization and an average relative identification error of only 8.0% for damage severity identification, both better than CNN. Dang et al (2021) combined underlying features extracted by an autoregression model, discrete wavelet transform, and EMD from measured acceleration signals, and input them into the proposed hybrid DL framework, named 1D CNN-LSTM, for damage identification. Through three case studies, they demonstrated that the framework achieved accuracy as high as 2D CNN but with lower time and memory complexity. Zhang et al (2022b) leveraged LSTM-FCN by assigning the time series of cable forces and their ratios between cable pairs under intact conditions as the input and the identity number of the cable as the corresponding labels to recognize damaged cables.

Fig. 8 The CNN-LSTM-based damage identification proposed by Fu et al (2021)

Graph neural network (GNN) provides another approach to model spatial correlations among sensors. Li et al (2021d) developed a spatiotemporal graph convolutional network to analyze spatiotemporal correlations among cable forces, in which the spatial dependency of the sensors was represented as a directed graph with cable dynamometers as vertices. The learnable adjacency matrix was used to capture the spatial dependency of the locally connected vertices and a 1D CNN was operated along the time axis to capture the temporal dependency. Son et al (2021) mapped cable tension to graph vertices and the connection relationship between sensors to its edges, and trained a GNN framework, the message passing neural network, to localize the damaged cables and estimate their area.
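To make the role of the adjacency matrix concrete, the sketch below applies one generic graph-convolution step, relu(D^-1/2 (A + I) D^-1/2 X W), to sensor features. This is the common symmetric-normalized layer for undirected graphs and is only an illustration; it is neither the learnable directed adjacency of Li et al (2021d) nor the message passing network of Son et al (2021). The four-cable chain, the force values, and the weight are hypothetical.

```python
import numpy as np

def graph_convolution(adjacency, features, weights):
    """One graph-convolution step: relu(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adjacency + np.eye(adjacency.shape[0])  # add self-loops
    deg_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ weights, 0.0)

# Hypothetical chain of four neighbouring cables: 0-1, 1-2, 2-3.
adjacency = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]], dtype=float)
forces = np.array([[3.1], [3.0], [2.2], [3.2]])  # one feature per cable (hypothetical)
weights = np.array([[1.0]])                      # fixed here; learned in practice
hidden = graph_convolution(adjacency, forces, weights)
```

Each output row mixes a cable's own feature with those of its neighbours, which is how spatial dependency enters the model.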

3 Vision-based structural health monitoring

Vibration-based methods rely on dynamic responses measured by contact sensors, such as accelerometers, strain gauges, and fiber optic sensors, which are expensive in installation and maintenance. The appearance of non-contact sensors, including digital and high-speed cameras, unmanned ground vehicles, and mobile sensors, which are more cost-effective and easier to deploy, provides another promising solution for the SHM of bridges and has attracted much attention in recent years. Unlike contact sensors, non-contact sensors yield images or videos that require advanced image processing techniques to interpret. Traditional image processing methods rely on various edges or boundary detection techniques, such as Sobel edge detector, morphological detector, and template matching to extract features from the images. However, these methods often result in ill-posed problems due to disturbances created by environmental conditions including light, distortion, weather, shade, and occlusion in outdoor environment (Yao et al 2014).

CV aided by DL helps researchers and engineers overcome these challenges owing to its reduced sensitivity to external disturbances and excellent capability in feature extraction. Dong and Catbas (2021) presented a general overview of CV-based SHM at the local level (SHM-LL) and global level (SHM-GL). The former includes applications such as crack, rust, and loose bolt detection or quantification, while the latter covers displacement measurement, structural behavior analysis, load monitoring, and damage identification. The relation between SHM-LL and SHM-GL is bidirectional: (i) the process of understanding the input-output structural behavior, which is one of the tasks of SHM-GL, can benefit from the condition assessment from SHM-LL; and (ii) the global condition evaluation and damage detection from SHM-GL can assist the SHM-LL to understand how localized conditions and damage affect the complete system (Dong and Catbas 2021). This section reviews the recent applications of DL in CV-based SHM from the two perspectives.

3.1 SHM-LL

3.1.1 Image classification

Identifying whether defects exist in the image and classifying images according to the defects they contain are effective ways of detecting surface damage. Benefiting from DL’s excellent performance in image classification, many researchers pay attention to its application in SHM and have obtained impressive achievements.

Quqa et al (2022) trained a CNN to classify the images of the welding joints of a long-span steel bridge as damaged or undamaged. Ebenezer et al (2021) developed an ensemble of three CNN models (a custom CNN, Xception, and AlexNet) using the majority voting scheme to improve the classification accuracy for concrete deterioration in bridges, and a validation accuracy of 87.1% was achieved. Transfer learning is an effective way to accelerate the training of DL models and improve their accuracy even with less training data. Several pre-trained nets, including VGG-16 (Perez et al 2019), Inception v3 (Zhu et al 2020), and GoogLeNet, have been used for this purpose (Holm et al 2020; Chen 2021; Savino and Tondolo 2021). Savino and Tondolo (2021) fine-tuned eight pre-trained CNNs, including AlexNet, SqueezeNet, ShuffleNet, ResNet-18, GoogLeNet, ResNet-50, MobileNet-v2, and NASNet-mobile, to conduct concrete surface damage classification, and GoogLeNet reached the highest accuracy, 94%. The appearance of the attention mechanism further improves the performance of DL models, and some new methods integrating it have already been proposed. For example, a convolution-based multi-damage recognition neural network that combines a CNN with an attention network and hybrid pooling layers was developed by Shin et al (2020) to classify five damage types, and an accuracy of 98.9% was achieved. Cui et al (2021a) proposed a geometric attention regulation method, in which the bearing location marked by a bounding box worked as an attention mechanism to indicate the important part of the input image. The experiments proved that the method could enhance CNN’s performance effectively.

Most of the existing methods perform well in detecting surface defects from optical images, but there is still a lack of systems that are able to identify subsurface damage, such as concealed cracks (particularly, bottom-up cracks) and debonding between paint and steel surfaces. To address this gap, Ali and Cha (2019) fed thermal images into a deep inception neural network to detect subsurface damage of a steel truss bridge, including corrosion and debonding between paint and steel surface (see Fig. 9).

Fig. 9 The subsurface damage detection method proposed by Ali and Cha (2019)

Despite the advantages CNN shows in the area of image classification, environmental impacts still hinder its application in practice. To further improve the accuracy, Qiao et al (2021) designed a new algorithm, called EMA-DenseNet, by adding the expected maximum attention (EMA) module to a DenseNet. Besides, a new loss function considering the connectivity of pixels was designed to reduce the breaking point of fracture prediction. The experiments showed that the mean pixel accuracy, mean intersection over union, precision, and frames per second of the Net reached 87.42%, 92.59%, 81.97%, and 25.4, respectively.
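The pixel-level metrics quoted above have standard definitions that can be computed from a confusion matrix, as sketched below. The function name and the tiny 2 × 4 label maps are assumptions for illustration, and the sketch assumes that every class occurs at least once in the ground truth.

```python
import numpy as np

def segmentation_metrics(truth, prediction, n_classes=2):
    """Return (mean pixel accuracy, mean IoU) from two integer label maps."""
    truth = np.asarray(truth).ravel()
    prediction = np.asarray(prediction).ravel()
    # confusion[i, j] counts pixels of true class i predicted as class j.
    confusion = np.bincount(truth * n_classes + prediction,
                            minlength=n_classes ** 2).reshape(n_classes, n_classes)
    true_pos = np.diag(confusion).astype(float)
    per_class_accuracy = true_pos / confusion.sum(axis=1)
    iou = true_pos / (confusion.sum(axis=1) + confusion.sum(axis=0) - true_pos)
    return per_class_accuracy.mean(), iou.mean()

# Hypothetical 2 x 4 label maps: 0 = background, 1 = crack.
truth = [[0, 0, 1, 1],
         [0, 0, 1, 1]]
prediction = [[0, 0, 0, 1],
              [0, 0, 1, 1]]
mpa, miou = segmentation_metrics(truth, prediction)
```

Averaging over classes prevents the dominant background class from masking poor performance on thin cracks.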

Another problem CNN faces is that its receptive field is generally so small that many stacked layers are needed to cover the whole image. Compared with CNN, the transformer has great flexibility in modeling global context and introduces less inductive bias, but its self-attention mechanism brings heavy computational cost. To address this issue in classifying defects of reinforced concrete bridges, Wang and Su (2022) proposed a hybrid network by inserting a transformer into the CNN backbone, with a multilayer perceptron following them to generate the final classification results. Experimental results showed F1 scores of 0.949, 0.896, 0.776, 0.844, 0.745, and 0.899 for the six damage types, respectively, which are higher than those of four other networks: EfficientNet B1, RegNetX-800MF, MobileNet V3, and ReXNet.

3.1.2 Object detection

Different from image classification, the techniques developed for object detection provide tools to identify several types of damage contained in the same image. Region-based CNN (R-CNN) and you only look once (YOLO) are the models most commonly adopted for this purpose.

Deng et al (2020a, 2021) applied Faster R-CNN and YOLO v2 to label cracks and handwriting contained in raw images, respectively, and the comparative study showed that YOLO v2 performs better in terms of both accuracy and inference speed. Cui et al (2021b) trained YOLO v3 to identify wind erosion areas on the concrete surface, and an accuracy of 96.32% was achieved. Zhang et al (2020) transferred YOLO v3 with fully pre-trained weights from a geometrically similar dataset to detect four types of concrete damage (i.e., crack, pop-out, spalling, and exposed rebar), and showed that it outperforms the original YOLO v3 and Faster R-CNN with ResNet-101. Mondal et al (2020) compared the performance of four Faster R-CNN models, based on Inception v2, ResNet-50, ResNet-101, and Inception-ResNet-v2, in detecting four different damage types and found that Inception-ResNet-v2 significantly outperforms the other networks in this task.

3.1.3 Semantic segmentation

Semantic segmentation, which labels each pixel of an image with a pre-defined label, enables researchers to mark the location and shape of damage more precisely. Ye et al (2019) demonstrated the superiority of DL-based methods in crack segmentation by comparing the performance of an FCN called Ci-Net with that of traditional edge detection algorithms. Rubio et al (2019) used an FCN to segment delamination and rebar exposure from bridge inspection images, but the method could not accurately detect small areas of damage.

To increase the segmentation accuracy for cracks in images with complicated backgrounds, non-uniform illumination, irregular shapes, and interference, various modifications of standard networks have been explored. A crack-like kernel, which is rectangular rather than square, was introduced to SegNet by Lee et al (2020b) so that it could extract features representing cracks more precisely. Miao et al (2019) inserted a combined squeeze-and-excitation (SE) and ResNet block into a U-Net to improve its performance in segmenting spalls and cracks. Jiang et al (2021b) proposed HDCB-Net, a network with the hybrid dilated convolutional block (HDCB), to expand the receptive field of the convolution kernel and to avoid the gridding effect generated by dilated convolution. Furthermore, a two-stage strategy was proposed to realize fast crack detection: in the first stage, YOLO v4 was employed to filter out images without cracks and generate coarse region proposals, from which the HDCB-Net then detected pixel-level cracks in the second stage.
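The gridding effect mentioned for dilated convolution can be illustrated in one dimension. The sketch below, a simplified illustration rather than the HDCB of the cited work, lists which input offsets can influence a single output position after stacking 3-tap dilated convolutions. The dilation rates [2, 2, 2] and [1, 2, 5] are assumed example values chosen for illustration; the rates actually used in HDCB-Net are not stated in the text.

```python
def covered_offsets(dilation_rates, kernel_size=3):
    """Input offsets (1-D) that reach one output position after stacking
    dilated convolutions with the given rates."""
    half = kernel_size // 2
    offsets = {0}
    for rate in dilation_rates:
        taps = [k * rate for k in range(-half, half + 1)]
        offsets = {o + t for o in offsets for t in taps}
    return sorted(offsets)


def has_gridding(dilation_rates, kernel_size=3):
    """True if some positions inside the receptive field are never sampled."""
    offsets = covered_offsets(dilation_rates, kernel_size)
    return len(offsets) < offsets[-1] - offsets[0] + 1


print(covered_offsets([2, 2, 2]))  # only even offsets are sampled
print(has_gridding([2, 2, 2]), has_gridding([1, 2, 5]))
```

With equal rates, every second input position is skipped, which is the gridding pattern. Mixing rates that share no common factor fills the holes while still enlarging the receptive field (17 positions for the assumed rates 1, 2, and 5).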

Digital images acquired by unmanned aerial vehicles (UAVs) often suffer from motion blur, which may degrade crack detectability. Bae et al (2021) proposed an end-to-end deep super-resolution crack network to resolve this problem. In the first stage, a super-resolution image was generated from each raw image using a CNN with residual groups and upscaling layers; in the second stage, this image was segmented by a DAE composed of CNNs. The validation test on concrete bridges demonstrated a 24% improvement in detection accuracy compared with the crack detection results obtained from raw digital images.

The significant imbalance between background and crack pixels tends to produce networks that classify background pixels well but identify cracks poorly. To improve the robustness of a fully convolutional encoder-decoder neural network against this imbalance, Sajedi and Liang (2019) investigated three optimization strategies, namely UW (uniform weights)-MAP (maximum a-posteriori probabilities), MFW (median frequency weight)-MAP, and UW-ML (maximum likelihood), and found that the UW-ML strategy achieved the best results among them. To overcome the same challenge, Han et al (2020) designed a crack segmentation network combining U-Net with a ternary classifier, which significantly reduced the false positive rate. Deng et al (2020b) adopted the weight-balanced intersection over union (IoU) loss function rather than cross-entropy loss or focal loss in training the link atrous spatial pyramid pooling (ASPP) network, in which a modified ASPP module was introduced to LinkNet for segmenting tiny damages.
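To show how class weights can counter this imbalance, the following minimal sketch computes median frequency weights. It assumes the common definition in which the frequency of a class is its pixel count divided by the total pixel count of the images that contain it, and the weight is the median frequency divided by the class frequency. Whether the cited study uses exactly this definition is an assumption, and the toy masks are illustrative.

```python
from statistics import median


def median_frequency_weights(masks, num_classes):
    """Return one loss weight per class using median frequency balancing."""
    class_pixels = [0] * num_classes   # pixels of each class over the dataset
    image_pixels = [0] * num_classes   # pixels of the images containing the class
    for mask in masks:
        flat = [label for row in mask for label in row]
        for c in range(num_classes):
            count = flat.count(c)
            if count:
                class_pixels[c] += count
                image_pixels[c] += len(flat)
    freqs = [class_pixels[c] / image_pixels[c] if image_pixels[c] else 0.0
             for c in range(num_classes)]
    med = median([f for f in freqs if f > 0])
    return [med / f if f > 0 else 0.0 for f in freqs]


# Two toy 2 x 4 masks (assumed data): 0 = background, 1 = crack.
masks = [
    [[0, 0, 0, 1], [0, 0, 0, 0]],
    [[0, 0, 0, 0], [0, 0, 0, 0]],
]
weights = median_frequency_weights(masks, num_classes=2)
print(weights)
```

The rare crack class receives a weight of 4.25 against roughly 0.57 for the background, so errors on crack pixels contribute more to a weighted loss.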

3.1.4 Damage quantification

After damage is detected, quantifying its severity becomes another important task for evaluating a structure’s condition correctly. For cracks, width and length are the most typical parameters. Segmentation results display cracks clearly and have thus become the basis of most crack quantification methods. After the binary maps of cracks were obtained by a dual-scale CNN, Ni et al (2019) proposed a crack width estimation method based on the Zernike moment operator, but it does not perform well for cracks narrower than 2 pixels or under adverse conditions (e.g., dark lighting), and it is also time-consuming. Yang et al (2021) employed a CNN combined with U-Net to extract crack pixels and their midline. The non-uniform width along the crack was extracted according to the proposed crack-width direction identification method, and pixel calibration experiments were then conducted to establish the nonlinear mapping model among pixel size, shooting distance, and focal length, based on which the actual width of the cracks could be obtained. The verification experiments showed that the recognition precision reached 0.01 mm.

Counting the proportion of pixels that belong to defects among all pixels is a workable way to quantify damage such as corrosion. Wang et al (2020) proposed a standardized structural health evaluation method and used it to quantify the damage in photos of a steel box girder: the photos were synthesized into panoramas by image stitching technology, and a U-Net was employed to segment the defects in them. For bolt-loosening quantification, a Hough line transform-based image processing algorithm was designed to estimate the bolt angles from the bolt images cropped by R-CNN (Huynh et al 2019). Huynh (2021) designed an autonomous vision-based bolt-looseness detection method with a Faster R-CNN-based bolt detector, an automatic distortion corrector, an adaptive bolt-angle estimator, and a bolt-looseness classifier. The method was then applied to a realistic joint of the Dragon Bridge in Danang, Vietnam.

3.2 SHM-GL

3.2.1 Vibration monitoring

Apart from detecting visible damage, vision-based methods are also an efficient way to provide vibration signals for identifying invisible damage. Deng et al (2020c) developed an intelligent non-contact remote sensing method in which a uniaxial automatic cruise acquisition device was designed to collect image sequences from the bridge surface; these were fed into a three-dimensional (3D) CNN to identify the envelope spectrum of the holographic deformation. Then, the deflection curvature difference was used to identify changes in damage location and degree. Their experiments demonstrated that the holographic deformation is more sensitive in damage identification than measurements from a limited number of measuring points.

Furthermore, cable force estimation for urban bridges from drone-captured video has been realized by Zhang et al (2021). First, a pre-trained FCN was adopted to identify bridge cables and extract their displacement. Then, EMD was employed to extract cable vibration signals and eliminate the effect of drone motion. Finally, the natural frequencies of the cables were obtained by performing Fourier analysis on the extracted cable vibration and were then used for cable force estimation.
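The last step, from vibration signal to cable force, can be sketched as follows. This is a minimal illustration, not the procedure of the cited study: it assumes the taut-string relation T = 4 m L^2 (f_n / n)^2, which neglects bending stiffness and sag, and it uses a naive discrete Fourier transform so that no external library is needed. The signal, sampling rate, cable mass, and cable length are made-up values.

```python
import cmath
import math


def dominant_frequency(signal, sampling_rate):
    """Frequency (Hz) of the largest DFT peak, ignoring the zero-frequency bin."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    best_k, best_mag = 1, 0.0
    for k in range(1, n // 2):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k * sampling_rate / n


def cable_force_taut_string(freq_hz, mode, mass_per_length, length):
    """Taut-string estimate of the cable force in newtons (assumed model)."""
    return 4.0 * mass_per_length * length ** 2 * (freq_hz / mode) ** 2


# Assumed example: 4 s of displacement sampled at 50 Hz, vibrating at 2.5 Hz.
sampling_rate = 50.0
signal = [math.sin(2 * math.pi * 2.5 * i / sampling_rate) for i in range(200)]

frequency = dominant_frequency(signal, sampling_rate)
force = cable_force_taut_string(frequency, mode=2, mass_per_length=50.0, length=100.0)
print(frequency, force)
```

Here the identified frequency is treated as the second mode of a 100 m cable with 50 kg/m, which gives 3.125 MN under the taut-string assumption. In practice the mode order has to be identified from the spacing of several spectral peaks.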

In traditional vision-based vibration measurement methods, template matching and corner detection algorithms are usually used to track and locate the target, but they are sensitive to image quality, which is often poor due to insufficient illumination or fog. Xu et al (2021) thus proposed a distraction-free displacement measurement approach by integrating a DL-based Siamese tracker with correlation-based template matching. The DL-based Siamese tracker applied deep feature representations and learned similarity measures for image matching and also considered adaptive template updates over time. The method was then implemented on a short-span footbridge and a long-span road bridge, where its potential to handle challenging scenarios, including illumination changes, background variations, and shade effects, was demonstrated. Shao et al (2021) combined the MagicPoint network and the SuperGlue network to achieve target-free full-field 3D vibration displacement measurement and validated the combination’s accuracy against traditional sensors, while the combination is more cost effective. Furthermore, they (Shao et al 2022) employed a phase-based video motion magnification algorithm to achieve higher accuracy for tiny vibrations at the submillimeter level.

3.2.2 Component identification

After various damages are detected, the rating of a structure needs to be provided by a comprehensive assessment in which the importance of different components is considered (Zhu et al 2010). This requires spatially relating identified damages to structural elements. However, inspection images, especially those captured by aerial inspection platforms, usually contain complex scenes, wherein structural elements mix with a cluttered background. Extracting structural elements from complex images and sorting them is thus meaningful for SHM.

With a small dataset labeled by inspectors, Karim et al (2021) transferred a Mask R-CNN to segment multi-class bridge components from the videos captured by a UAV. False negatives were recovered by temporal coherence analysis, and a semi-supervised self-training method was developed to engage experienced inspectors in refining the network. The model’s performance reached 91.8% precision, 93.6% recall, and 92.7% F1-score.

Point clouds in 3D space can also provide sufficient information for this purpose. Kim et al (2020) extracted a high-resolution set of point clouds from a full-scale bridge by subspace partition and employed PointNet to classify the points in each subspace. Kim and Kim (2020) compared the performance of three DL models, PointNet, PointCNN, and dynamic graph CNN (DGCNN), in classifying a point cloud of bridge components and found that the mean intersection over union of DGCNN was 86.85%, which is higher than that of the others (see Fig. 10).

Fig. 10 Identification results of point clouds in the research of Kim and Kim (2020)

3.2.3 External load

Moving vehicles are one of the main sources of live loads on bridges, and gathering their information is essential for SHM. Bridge weigh-in-motion, which exploits bridge components, e.g., decks, girders, and vertical stiffeners, as weighing scales, is the most frequently adopted solution for this purpose, and DL brings efficient solutions for some of its drawbacks.

Zhang et al (2019b) proposed a novel methodology for this task, in which a Faster R-CNN transferred from ImageNet was employed to detect different types of vehicles frame by frame. A multiple-object tracking algorithm tracked vehicles across frames and generated an information sequence containing each vehicle’s coordinates, type, lane number, and frame number. Then, an image calibration method based on moving standard vehicles was developed to calculate the vehicle length and speed. With these parameters, the spatiotemporal information could be obtained from the vehicle location and the hypothesis of constant speed (see Fig. 11).

Fig. 11 The framework for obtaining the spatiotemporal information of vehicles by Zhang et al (2019b)
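The constant-speed hypothesis used to obtain the spatiotemporal information can be sketched in a few lines. The example below is an illustrative assumption, not the calibration procedure of Zhang et al (2019b): it takes tracked positions that have already been converted to metres along the lane, fits a straight line by least squares, and extrapolates the vehicle position. The frame rate and positions are made-up values.

```python
def fit_constant_speed(times, positions):
    """Least-squares fit of position = x0 + speed * t; returns (x0, speed)."""
    n = len(times)
    t_mean = sum(times) / n
    x_mean = sum(positions) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (x - x_mean) for t, x in zip(times, positions))
    speed = sxy / sxx
    return x_mean - speed * t_mean, speed


def position_at(t, x0, speed):
    """Longitudinal position (m) of the vehicle at time t (s)."""
    return x0 + speed * t


# Assumed tracking output: frame indices at 25 frames per second and
# calibrated longitudinal positions in metres.
frame_rate = 25.0
frames = [0, 5, 10, 15]
positions = [2.0, 6.0, 10.0, 14.0]
times = [f / frame_rate for f in frames]

x0, speed = fit_constant_speed(times, positions)
print(x0, speed, position_at(1.0, x0, speed))
```

With these values the fitted speed is 20 m/s, and the vehicle is predicted to be 22 m along the lane one second after the first frame, including frames in which it is not detected.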

However, the weight of vehicles cannot be obtained using the method proposed by Zhang et al (2019b). Jian et al (2019) combined CV with influence line theory to acquire the time-spatial distribution of vehicle loads on bridges. YOLO v3 was used to identify vehicle positions, types, and axle numbers. Then, vehicle weight was calculated by combining the strain influence line calibrated by field tests with the strain time history. However, since only three scenarios of vehicle distribution were taken into consideration, the method may face obstacles in complicated traffic scenarios. To overcome this problem, a least squares-based identification method that utilizes the redundant strain data measured by a network of strain sensors was proposed to distinguish complicated traffic modes and reduce the recognition errors by solving the overdetermined inverse influence equations (Pathirage et al 2019).
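The idea of solving overdetermined influence equations by least squares can be illustrated with a small example. The sketch assumes a linear model in which each measured strain is the sum of the vehicle weights multiplied by the influence-line ordinates at the vehicle positions. The matrix values and weights are invented for illustration, and the solver is a plain normal-equations implementation rather than the method of the cited work.

```python
def solve_least_squares(A, b):
    """Minimise ||A w - b|| by solving the normal equations (A^T A) w = A^T b."""
    n = len(A[0])
    ata = [[sum(row[i] * row[j] for row in A) for j in range(n)] for i in range(n)]
    atb = [sum(row[i] * bi for row, bi in zip(A, b)) for i in range(n)]
    aug = [ata[i] + [atb[i]] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


# Assumed influence-line ordinates (microstrain per kN): four strain readings
# (rows) for two vehicles (columns), so the system is overdetermined.
influence = [
    [1.0, 0.2],
    [0.8, 0.5],
    [0.3, 0.9],
    [0.1, 1.0],
]
true_weights = [100.0, 200.0]  # kN, used only to synthesise the readings
strains = [sum(a * w for a, w in zip(row, true_weights)) for row in influence]

estimated = solve_least_squares(influence, strains)
print(estimated)
```

Because the synthetic readings are noise free, the estimate recovers the assumed weights. With noisy field data, the redundant rows average out part of the measurement error, which is the motivation for using more sensors than unknowns.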

An approach for obtaining spatiotemporal information of vehicles on bridges based on 3D bounding box reconstruction was also proposed by Zhu et al (2021), in which CNN and YOLO were used to detect vehicles and obtain their 2D bounding boxes. A 3D bounding box reconstruction method based on the relationship between 2D and 3D bounding boxes was then developed to obtain the size and position of vehicles, and the spatiotemporal information of the vehicles could finally be obtained by using a multiple-object tracking algorithm.

4 Application of DL in real bridges

The capability of DL encourages the exploration of various approaches that can overcome the challenges in traditional SHM, but most of them have been verified only in simulations or laboratories. More details, such as the platform used to collect images and programs with a user interface, need to be taken into consideration to promote the application of these methods in practice (Xu 2018). This section summarizes some efforts devoted to dealing with these details, as well as DL-based systems that have been applied to actual structures.

A framework for autonomous bridge inspection using a UAV was proposed and applied to the Pahtajokk Bridge by Mirzazade et al (2021). Planning the most efficient flight path that could cover the damaged field with the minimum number of images was the first step. Then, three CNN models, SegNet, Inception v3, and U-Net, were trained to conduct bridge component detection, damage area recognition, and crack segmentation, respectively. The third step was to generate a dense point cloud for the damaged areas via intelligent hierarchical dense structure from motion and align it to the overall point cloud for the construction of the digital model of the bridge. Finally, damages were quantified based on the global coordinates of the detected damages.

Kruachottikul et al (2021) described a DL-based visual defect inspection system for reinforced concrete bridges, which consisted of four components. The first part was a mobile phone used to take photos. The second part identified images with defects via a modified ResNet-50, and the third part classified the defects using another modified ResNet-50. Finally, damage severity was quantified by an ANN in the last part. The system’s accuracies for defect detection, classification, and severity prediction were 90.4%, 81%, and 78%, respectively, and the system has been accepted by Thailand’s Department of Highways for practical use.

Jang et al (2021) developed a ring-type climbing robot system composed of multiple cameras, a climbing robot, and a control computer. The raw images captured under close-up scanning conditions were processed through feature control-based image stitching, DL-based semantic segmentation, and Euclidean distance transform-based crack quantification algorithms, based on which a digital crack map of the target bridge pier could be established. The tests conducted on the Jang-Duck bridge in South Korea revealed that the method successfully evaluated cracks of the bridge pier with a precision of 90.92% and a recall of 97.47%.

Considering that some parts of bridges, such as the bottom of decks, are difficult for inspectors to approach, He et al (2022) proposed a smart unmanned surface vessel (USV) system for damage detection (see Fig. 12). A novel anchor-free network, CenWholeNet, which focuses on center points and holistic information, was proposed, and a parallel attention module was introduced into the model. For the platform, a USV system that works without global positioning system (GPS) navigation and supports real-time transmission of lidar and video information was designed.

Fig. 12 The system developed by He et al (2022)

Vehicle-assisted monitoring is a promising alternative for rapid and low-cost bridge health monitoring compared with instrumentation installed on bridges. Sarwar and Cantero (2021) developed an indirect bridge monitoring system, in which a DAE was trained by the vertical acceleration responses of a fleet of vehicles passing over a healthy bridge. Then, the Kullback-Leibler divergence between the measured and the reconstructed signals was used for damage detection and severity quantification.
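A damage index of this kind can be sketched with a histogram-based Kullback-Leibler divergence. The following is a minimal illustration under stated assumptions: the signals are compared through their amplitude histograms, a small constant is added to avoid division by zero, and the sample values are invented. The exact formulation used by Sarwar and Cantero (2021) is not given in the text.

```python
import math


def histogram(values, low, high, bins):
    """Count values in equal-width bins over [low, high]; outliers are clamped."""
    counts = [0] * bins
    width = (high - low) / bins
    for v in values:
        idx = int((v - low) / width)
        idx = min(max(idx, 0), bins - 1)
        counts[idx] += 1
    return counts


def kl_divergence(p_counts, q_counts, eps=1e-9):
    """KL(P || Q) between two histograms, smoothed by eps (assumed choice)."""
    p_total = sum(p_counts) + eps * len(p_counts)
    q_total = sum(q_counts) + eps * len(q_counts)
    kl = 0.0
    for p, q in zip(p_counts, q_counts):
        p_prob = (p + eps) / p_total
        q_prob = (q + eps) / q_total
        kl += p_prob * math.log(p_prob / q_prob)
    return kl


# Assumed acceleration samples: the autoencoder reproduces a healthy-state
# signal well but fails to reproduce a signal measured on a damaged bridge.
measured = [-1.0, -0.5, 0.0, 0.5, 1.0]
good_reconstruction = [-1.0, -0.5, 0.0, 0.5, 1.0]
poor_reconstruction = [0.0, 0.0, 0.0, 0.1, -0.1]

p = histogram(measured, -1.5, 1.5, 5)
index_healthy = kl_divergence(p, histogram(good_reconstruction, -1.5, 1.5, 5))
index_damaged = kl_divergence(p, histogram(poor_reconstruction, -1.5, 1.5, 5))
print(index_healthy, index_damaged)
```

The index is zero when the reconstruction matches the measurement and grows as the two distributions diverge, so a threshold on it can flag damage, and its magnitude can serve as a rough severity indicator.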

Mobile devices such as smartphones can serve not only as a sensing platform but also as a computing platform for on-site damage detection. However, due to the limited computing resources of mobile devices, the size of the DNN needs to be reduced. Ye et al (2022) developed a pruned crack recognition network by reducing the DNN size via pruning and designed a DL-based crack detection program for smartphones. To conduct crack detection on Internet of Things (IoT) devices in real time, Kim et al (2021) proposed OleNet by fine-tuning the hyperparameters of LeNet-5. Compared with other pretrained DL models, including VGG16, Inception, and ResNet, OleNet achieved the highest accuracy, 99.8%, with the least computation. Shrestha and Dang (2020) developed a program integrated with a CNN to realize accurate and real-time bridge vibration classification from the multi-channel time-series signals acquired by the built-in accelerometers of smartphones.
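As an illustration of how pruning reduces network size, the sketch below applies magnitude pruning to a small weight matrix, zeroing the fraction of weights with the smallest absolute values. Magnitude pruning is one common criterion and is assumed here for illustration; the criterion used by Ye et al (2022) is not stated in the text, and the weight values are invented.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction set to 0."""
    flat = sorted(abs(w) for row in weights for w in row)
    n_prune = int(len(flat) * sparsity)
    if n_prune == 0:
        return [row[:] for row in weights]
    threshold = flat[n_prune - 1]
    pruned, removed = [], 0
    for row in weights:
        new_row = []
        for w in row:
            # Stop once the requested number of weights has been removed,
            # so ties at the threshold do not over-prune.
            if abs(w) <= threshold and removed < n_prune:
                new_row.append(0.0)
                removed += 1
            else:
                new_row.append(w)
        pruned.append(new_row)
    return pruned


# Assumed weights of one small layer.
layer = [[0.5, -0.1, 0.03],
         [-0.8, 0.02, 0.4]]
pruned_layer = magnitude_prune(layer, sparsity=0.5)
print(pruned_layer)
```

Half of the weights become zero, and the remaining ones are unchanged. The zeros can then be stored sparsely or whole channels can be removed, and the network is usually fine-tuned afterwards to recover accuracy.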

5 Conclusions

In this paper, the applications of DL models in SHM, particularly damage detection of bridges, have been summarized systematically. The applications, both in laboratories and on real bridges, demonstrate the excellent capability of DL models in addressing obstacles in the traditional SHM methods of bridges. Each of the DL models promotes the realization of a more intelligent SHM. However, drawbacks exist in every method. Some of the challenges are listed as follows:

  1. Most of the current studies consider only one type of monitoring data in damage detection. If this type of monitoring data is abnormal, the damage detection will fail no matter how good the damage detection method is.

  2. Although several attempts have been made to realize the targets by unsupervised learning, most of the applications still rely on pre-defined damage scenarios and training data, which demand considerable engineering experience and labor.

  3. The conditions of laboratories, where the majority of methods were validated, are idealized. The robustness of DL models needs to be further enhanced to combat environmental interference in practice, such as the vibration induced by external loads and motion blur when UAVs are employed.

  4. The weak connection between the two levels of vision-based SHM results in difficulties in comprehensive condition assessment, for which visible defects and invisible damages need to be considered at the same time.

Considering the limitations listed above and recent achievements in DL, the following directions are promising and worth further investigation:

  1. Fusing multiple types of information collected by SHM systems: With advances in multiple types of sensors, an SHM system can provide multiple types of structural information. Fusing and leveraging this information in structural condition assessment via DL methods is a promising way to enhance the methods’ practicality.

  2. Building larger training databases collected from the real world: Training DL models with data containing actual interference is an efficient path to improve their robustness, and the availability of advanced sensors and UAVs nowadays makes it possible to build larger databases consisting of real samples.

  3. Utilization of mobile and IoT devices: Mobile devices, such as smartphones, can serve not only as a sensing platform with various built-in sensors, including magnetometer, gyroscope, accelerometer, and GPS, but also as a computing platform. Leveraging them by deploying lightweight DL models makes on-site damage detection possible. In addition, IoT devices, which have emerged with innovations in data transmission and cloud-based computation, provide an efficient way to obtain and integrate different types of structural data, which will promote cost-minimized and automatic SHM.

  4. Digital twin: To make a reliable assessment reflecting the true condition of structural elements, an ensemble of multi-scale DL models is needed to interpret and integrate the data from both the local level and global level of SHM. A digital twin, which tries to replicate a physical entity in the digital world (Lin et al 2021), provides a powerful platform for this task, in which various damages can be reconstructed and evaluated at the same time. Integrating SHM and digital twins may be a promising way to realize smart civil structures and even smart cities.

Availability of data and materials

Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.

Abbreviations

Artificial Neural Network
Atrous Spatial Pyramid Pooling
Convolutional Neural Network
Computer Vision
Deep Autoencoder
Deep Belief Network
Dynamic Graph Convolutional Neural Network
Deep Learning
Deep Neural Network
Expected Maximum Attention
Empirical Mode Decomposition
Fully Connected Neural Network
Finite Element Model
Fast Fourier Transform
Generative Adversarial Network
Graph Neural Network
Global Positioning Systems
Gated Recurrent Unit
Hybrid Dilated Convolutional Block
Internet of Things
Long Short-Term Memory Network
Maximum A-Posteriori Probabilities
Machine Learning
Markov Transition Field
Recurrent Neural Network
Stacked Autoencoder
Structural Health Monitoring
Structural Health Monitoring at Global Level
Structural Health Monitoring at Local Level
Unmanned Aerial Vehicle
Unmanned Surface Vessel
Uniform Weights
Variational Mode Decomposition
You Only Look Once

References

  • Adeli H, Yeh C (1989) Perceptron learning in engineering design. Comput Aided Civ Inf 4(4):247–256

  • Ahmed H, La HM, Gucunski N (2020) Review of non-destructive civil infrastructure evaluation for bridges: state-of-the-art robotic platforms, sensors and algorithms. Sensors 20(14):3954

  • Alazzawi O, Wang D (2022) A novel structural damage identification method based on the acceleration responses under ambient vibration and an optimized deep residual algorithm. Struct Health Monit 21(6):2587–2617

  • Ali R, Cha Y (2019) Subsurface damage detection of a steel bridge using deep learning and uncooled micro-bolometer. Constr Build Mater 226:376–387

  • Azimi M, Eslamlou AD, Pekcan G (2020) Data-driven structural health monitoring and damage detection through deep learning: state-of-the-art review. Sensors 20(10):2778

  • Bae H, Jang K, An YK (2021) Deep super resolution crack network (SrcNet) for improving computer vision-based automated crack detectability in in situ bridges. Struct Health Monit 20(4):1428–1442

  • Bao Y, Li H (2021) Machine learning paradigm for structural health monitoring. Struct Health Monit 20(4):1353–1372

  • Byung K, Branko G, Yousok K, Hyo S (2020) Convolutional neural network–based data recovery method for structural health monitoring. Struct Health Monit 19(6):1821–1838

  • Cha Y, Choi W, Suh G, Mahmoudkhani S, Büyüköztürk O (2018) Autonomous structural visual inspection using region-based deep learning for detecting multiple damage types. Comput Aided Civ Inf 33(9):731–747

  • Chen R (2021) Migration learning-based bridge structure damage detection algorithm. Sci Program Neth 2021:1102521

  • Chen Z, Li H, Bao Y (2019) Analyzing and modeling inter-sensor relationships for strain monitoring data and missing data imputation: a copula and functional data-analytic approach. Struct Health Monit 18(4):1168–1188

  • Chen Z, Wang Y, Wu J, Deng C, Hu K (2021) Sensor data-driven structural damage detection based on deep convolutional neural networks and continuous wavelet transform. Appl Intell 51(8):5598–5609

  • Cui M, Wu G, Chen Z, Dang J, Zhou M, Feng D (2021a) Geometric attention regularization enhancing convolutional neural networks for bridge rubber bearing damage assessment. J Perform Constr Facil 35(5):04021061

  • Cui X, Wang Q, Dai J, Zhang R, Li S (2021b) Intelligent recognition of erosion damage to concrete based on improved YOLO-v3. Mater Lett 302:130363

  • Dang HV, Tran-Ngoc H, Nguyen TV, Bui-Tien T, De Roeck G, Nguyen HX (2021) Data-driven structural health monitoring using feature fusion and hybrid deep learning. IEEE T Autom Sci Eng 18(4):2087–2103

  • Deng G, Zhou Z, Chu X, Shao S (2020c) Identification of behavioral features of bridge structure based on static image sequences. Adv Civ Eng 2020:2815017

  • Deng J, Lu Y, Lee VCS (2020a) Concrete crack detection with handwriting script interferences using faster region-based convolutional neural network. Comput Aided Civ Inf 35(4):373–388

  • Deng J, Lu Y, Lee VCS (2021) Imaging-based crack detection on concrete surfaces using you only look once network. Struct Health Monit 20(2):484–499

  • Deng W, Mou Y, Kashiwa T, Escalera S, Nagai K, Nakayama K, Matsuo Y, Prendinger H (2020b) Vision based pixel-level bridge structural damage detection using a link ASPP network. Autom Constr 110:102973

  • Dong C, Catbas FN (2021) A review of computer vision-based structural health monitoring at local and global levels. Struct Health Monit 20(2):692–743

  • Du Y, Li L, Hou R, Wang X, Tian W, Xia Y (2022) Convolutional neural network-based data anomaly detection considering class imbalance with limited data. Smart Struct Syst 29(1):63–75

  • Duan Y, Chen Q, Zhang H, Yun C, Wu S, Zhu Q (2019) CNN-based damage identification method of tied-arch bridge using spatial-spectral information. Smart Struct Syst 23(5):507–520

  • Ebenezer AS, Kanmani SD, Sheela V, Ramalakshmi K, Chandran V, Sumithra MG, Elakkiya B, Murugesan B (2021) Identification of civil infrastructure damage using ensemble transfer learning model. Adv Civ Eng 2021:5589688

  • Fan G, Li J, Hao H (2019) Lost data recovery for structural health monitoring based on convolutional neural networks. Struct Control Hlth 26(10):e2433

  • Farrar CR, Worden K (2012) Structural health monitoring: a machine learning perspective. Wiley, New York

  • Fu L, Tang Q, Gao P, Xin J, Zhou J (2021) Damage identification of long-span bridges using the hybrid of convolutional neural network and long short-term memory network. Algorithms 14(6):180

  • Guo Q, Feng L, Zhang R, Yin H (2020) Study of damage identification for bridges based on deep belief network. Adv Struct Eng 23(8):1562–1572

  • Han JH, Kim IS, Lee CH, Moon YS (2020) Crack detection method for tunnel lining surfaces using ternary classifier. KSII T Internet Inf 14(9):3797–3822

  • He H, Zheng J, Liao L, Chen Y (2021b) Damage identification based on convolutional neural network and recurrence graph for beam bridge. Struct Health Monit 20(4):1392–1408

  • He Y, Chen H, Liu D, Zhang L (2021a) A framework of structural damage detection for civil structures using fast fourier transform and deep convolutional neural networks. Appl Sci Basel 11(19):9345

  • He Z, Jiang S, Zhang J, Wu G (2022) Automatic damage detection using anchor-free method and unmanned surface vessel. Autom Constr 133:104017

  • Holm E, Transeth AA, Knudsen OO, Stahl A (2020) Classification of corrosion and coating damages on bridge constructions from images using convolutional neural networks. In: 12th international conference on machine vision (ICMV 2019) 11433, p 1143320

  • Hormozabad SJ, Soto MG (2021) Real-time damage identification of discrete structures via neural networks subjected to dynamic loading. In: Conference on Health Monitoring of Structural and Biological Systems XV, 115932, p 115932O

  • Housner GW, Bergman LA, Caughey TK, Chassiakos AG, Claus RO (1997) Structural control: past, present, and future. J Eng Mech 123(9):897–971

  • Huynh TC (2021) Vision-based autonomous bolt-looseness detection method for splice connections: design, lab-scale evaluation, and field application. Autom Constr 124:103591

  • Huynh TC, Park JH, Jung HJ, Kim JT (2019) Quasi-autonomous bolt-loosening detection method using vision-based deep learning and image processing. Autom Constr 105:102844

  • Ibrahim A, Eltawil A, Na Y, El-Tawil S (2020) A machine learning approach for structural health monitoring using noisy data sets. IEEE T Autom Sci Eng 17(2):900–908

  • James MWB, Alessandro DS, Xu YL, Helmut W, Emin A (2011) Vibration-based monitoring of civil infrastructure: challenges and successes. J Civ Struct Health 1:79–95

  • Jang K, An YK, Kim B, Cho S (2021) Automated crack evaluation of a high-rise bridge pier using a ring-type climbing robot. Comput Aided Civ Inf 36(1):14–29

  • Jena SP, Parhi DR (2020) Fault detection in cracked structures under moving load through a recurrent-neural-networks-based approach. Sci Iran 27(4):1886–1896

  • Jeong E, Seo J, Wacker J (2020) Literature review and technical survey on bridge inspection using unmanned aerial vehicles. J Perform Constr Facil 34(6):04020113

  • Jeong S, Ferguson M, Hou R, Lynch JP, Sohn H, Law KH (2019) Sensor data reconstruction using bidirectional recurrent neural network with application to bridge monitoring. Adv Eng Inform 42:100991

  • Jian X, Xia Y, Lozano-Galant JA, Sun L (2019) Traffic sensing methodology combining influence line theory and computer vision techniques for girder bridges. J Sensors 2019:3409525

  • Jian X, Zhong H, Xia Y, Sun L (2021) Faulty data detection and classification for bridge structural health monitoring via statistical and deep-learning approach. Struct Control Hlth 28(11):e2824

  • Jiang H, Wan C, Yang K, Ding Y, Xue S (2021a) Continuous missing data imputation with incomplete dataset by generative adversarial networks-based unsupervised learning for long-term bridge health monitoring. Struct Health Monit 21(3):1093–1109

  • Jiang W, Liu M, Peng Y, Wu L, Wang Y (2021b) HDCB-net: a neural network with the hybrid dilated convolution for pixel-level crack detection on concrete bridges. IEEE T Ind Inform 17(8):5485–5494

  • Karim MM, Qin R, Chen G, Yin Z (2021) A semi-supervised self-training method to develop assistive intelligence for segmenting multiclass bridge elements from inspection videos. Struct Health Monit 21(3):835–852

  • Khodabandehlou H, Pekcan G, Fadali MS (2019) Vibration-based structural condition assessment using convolution neural networks. Struct Control Hlth 26(2):e2308

  • Kim B, Yuvaraj N, Preethaa KRS, Pandian RA (2021) Surface crack detection using deep learning with shallow CNN architecture for enhanced computation. Neural Comput & Applic 33(15):9289–9305

  • Kim H, Kim C (2020) Deep-learning-based classification of point clouds for bridge inspection. Remote Sens 12(22):3757

  • Kim H, Yoon J, Sim SH (2020) Automated bridge component recognition from point clouds using deep learning. Struct Control Hlth 27(9):e2591

  • Kruachottikul P, Cooharojananone N, Phanomchoeng G, Chavarnakul T, Kovitanggoon K, Trakulwaranont D (2021) Deep learning-based visual defect-inspection system for reinforced concrete bridge substructure: a case of Thailand’s department of highways. J Civ Struct Health 11(4):949–965

  • Lee JS, Hwang SH, Choi IY, Choi Y (2020b) Estimation of crack width based on shape-sensitive kernels and semantic segmentation. Struct Control Hlth 27(4):e2504

  • Lee JS, Kim HM, Kim SI, Lee HM (2021) Evaluation of structural integrity of railway bridge using acceleration data and semi-supervised learning approach. Eng Struct 239:112330

  • Lee K, Byun N, Shin DH (2020a) A damage localization approach for rahmen bridge based on convolutional neural network. KSCE J Civ Eng 24(1):1–9

  • Li D, Ho SC, Song G, Ren L, Li H (2015) A review of damage detection methods for wind turbine blades. Smart Mater Struct 24(3):033001

  • Li D, Liang Z, Ren W, Yang D, Wang S, Xiang S (2021c) Structural damage identification under nonstationary excitations through recurrence plot and multi-label convolutional neural network. Measurement 186:110101

  • Li L, Liu G, Zhang L, Li Q (2021a) FS-LSTM-based sensor fault and structural damage isolation in SHM. IEEE Sensors J 21(3):3250–3259

  • Li L, Zhou H, Liu H, Zhang C, Liu J (2021b) A hybrid method coupling empirical mode decomposition and a long short-term memory network to predict missing measured signal data of SHM systems. Struct Health Monit 20(4):1778–1793

  • Li S, Niu J, Li Z (2021d) Novelty detection of cable-stayed bridges based on cable force correlation exploration using spatiotemporal graph convolutional networks. Struct Health Monit 20(4):2216–2228

  • Li S, Sun L (2020) Detectability of bridge-structural damage based on fiber-optic sensing through deep-convolutional neural networks. J Bridg Eng 25(4):04020012

  • Li S, Zuo X, Li Z, Wang H (2020) Applying deep learning to continuous bridge deflection detected by fiber optic gyroscope for damage detection. Sensors 20(3):911

  • Lin K, Xu YL, Lu X, Guan Z, Li J (2021) Digital twin-based collapse fragility assessment of a long-span cable-stayed bridge under strong earthquakes. Autom Constr 123:103547

  • Liu G, Niu Y, Zhao W, Duan Y, Shu J (2022) Data anomaly detection for structural health monitoring using a combination network of GANomaly and CNN. Smart Struct Syst 29(1):53–62

  • Liu H, Ding Y, Zhao H, Wang M, Geng F (2020) Deep learning-based recovery method for missing structural temperature data using LSTM network. Struct Monit Maint 7(2):109–124

  • Lu W, Teng J, Li C, Cui Y (2017) Reconstruction to sensor measurements based on a correlation model of monitoring data. Appl Sci 7(3):243

  • Ma X, Lin Y, Nie Z, Ma H (2020) Structural damage identification based on unsupervised feature-extraction via variational auto-encoder. Measurement 160:107811

  • Mangalathu S, Jeon JS (2020) Ground motion-dependent rapid damage assessment of structures based on wavelet transform and image analysis techniques. J Struct Eng 146(11):04020230

  • Mantawy IM, Mantawy MO (2022) Convolutional neural network based structural health monitoring for rocking bridge system by encoding time-series into images. Struct Control Hlth 29(3):e2897

  • Miao X, Wang J, Wang Z, Sui Q, Gao Y, Jiang P (2019) Automatic recognition of highway tunnel defects based on an improved u-net model. IEEE Sensors J 19(23):11413–11423

  • Mirzazade A, Popescu C, Blanksvard T, Taljsten B (2021) Workflow for off-site bridge inspection using automatic damage detection-case study of the pahtajokk bridge. Remote Sens 13(14):2665

  • Mondal TG, Jahanshahi MR, Wu RT, Wu Z (2020) Deep learning-based multi-class damage detection for autonomous post-disaster reconnaissance. Struct Control Hlth 27(4):e2507

  • Mosalam K, Muin S, Gao Y (2019) New directions in structural health monitoring. NED Univ J Res 2:77–112

  • Mousavi M, Gandomi AH (2021a) Prediction error of Johansen cointegration residuals for structural health monitoring. Mech Syst Signal Pr 160:107847

  • Mousavi M, Gandomi AH (2021b) Structural health monitoring under environmental and operational variations using MCD prediction error. J Sound Vib 512:116370

  • Nguyen DH, Nguyen QB, Bui-Tien T, De Roeck G, Wahab MA (2020) Damage detection in girder bridges using modal curvatures gapped smoothing method and convolutional neural network: application to Bo Nghi bridge. Theor Appl Fract Mec 109:102728

  • Ni F, Zhang J, Chen Z (2019) Zernike-moment measurement of thin-crack width in images enabled by dual-scale deep learning. Comput Aided Civ Inf 34(5):367–384

  • Pal M, Palevicius P, Landauskas M, Orinaite U, Timofejeva I, Ragulskis M (2021) An overview of challenges associated with automatic detection of concrete cracks in the presence of shadows. Appl Sci Basel 11(23):11396

  • Pan H, Azimi M, Gui G, Yan F, Lin Z (2017) Vibration-based support vector machine for structural health monitoring. In: International Conference on Experimental Vibration Analysis for Civil Engineering Structures, pp 167–178

  • Pan H, Azimi M, Yan F, Lin Z (2018) Time-frequency-based data-driven structural diagnosis and damage detection for cable-stayed bridges. J Bridg Eng 23(6):04018033

  • Pathirage CSN, Li J, Li L, Hao H, Liu W, Ni P (2018) Structural damage identification based on autoencoder neural networks and deep learning. Eng Struct 172:13–28

  • Pathirage CSN, Li J, Li L, Hao H, Liu W, Wang R (2019) Development and application of a deep learning–based sparse autoencoder framework for structural damage identification. Struct Health Monit 18(1):103–122

  • Perez H, Tah JHM, Mosavi A (2019) Deep learning for detecting building defects using convolutional neural networks. Sensors 19(16):3556

  • Qiao W, Ma B, Liu Q, Wu X, Li G (2021) Computer vision-based bridge damage detection using deep convolutional networks with expectation maximum attention module. Sensors 21(3):824

  • Quqa S, Martakis P, Movsessian A, Pai S, Reuland Y, Chatzi E (2022) Two-step approach for fatigue crack detection in steel bridges using convolutional neural networks. J Civ Struct Health 12(1):127–140

  • Rastin Z, Amiri GG, Darvishan E (2021a) Generative adversarial network for damage identification in civil structures. Shock Vib 2021:3987835

  • Rastin Z, Amiri GG, Darvishan E (2021b) Unsupervised structural damage detection technique based on a deep convolutional autoencoder. Shock Vib 2021:6658575

  • Rubio JJ, Kashiwa T, Laiteerapong T, Deng W, Nagai K, Escalera S, Nakayama K, Matsuo Y, Prendinger H (2019) Multi-class structural damage segmentation using fully convolutional networks. Comput Ind 112:103121

  • Sajedi SO, Liang X (2019) A convolutional cost-sensitive crack localization algorithm for automated and reliable RC bridge inspection. In: Risk-based bridge engineering: proceedings of the 10th New York City bridge conference 2019, p 229

  • Sarwar MZ, Cantero D (2021) Deep autoencoder architecture for bridge damage assessment using responses from several vehicles. Eng Struct 246:113064

  • Savino P, Tondolo F (2021) Automated classification of civil structure defects based on convolutional neural network. Front Struct Civ Eng 15(2):305–317

  • Shajihan S, Wang S, Zhai G, Spencer BF (2022) CNN based data anomaly detection using multi-channel imagery for structural health monitoring. Smart Struct Syst 29(1):181–193

  • Shao Y, Li L, Li J, An S, Hao H (2021) Computer vision based target-free 3D vibration displacement measurement of structures. Eng Struct 246:113040

  • Shao Y, Li L, Li J, An S, Hao H (2022) Target-free 3D tiny structural vibration measurement based on deep learning and motion magnification. J Sound Vib 538(10):117244

  • Sharma S, Sen S (2020) One-dimensional convolutional neural network-based damage detection in structural joints. J Civ Struct Health 10(5):1057–1072

  • Shin HK, Ahn YH, Lee SH, Kim HY (2020) Automatic concrete damage recognition using multi-level attention convolutional neural network. Materials 13(23):5549

  • Shrestha A, Dang J (2020) Deep learning-based real-time auto classification of smartphone measured bridge vibration data. Sensors 20(9):2710

  • Silva M, Santos A, Santos R, Figueiredo E, Sales C, Costa JC (2019) Deep principal component analysis: an enhanced approach for structural damage identification. Struct Health Monit 18(5–6):1444–1463

  • Silva MF, Santos A, Santos R, Figueiredo E, Costa JCWA (2021) Damage-sensitive feature extraction with stacked autoencoders for unsupervised damage detection. Struct Control Hlth 28(5):e2714

  • Sofi A, Regita JJ, Rane B, Lau HH (2022) Structural health monitoring using wireless smart sensor network-an overview. Mech Syst Signal Pr 163:108113

  • Son H, Pham VT, Jang Y, Kim SE (2021) Damage localization and severity assessment of a cable-stayed bridge using a message passing neural network. Sensors 21(9):3118

  • Song Q, Chen Y, Oskoui EA, Fang Z, Taylor T (2020) Micro-crack detection method of steel beam surface using stacked autoencoders on massive full-scale sensing strains. Struct Health Monit 19(4):1175–1187

  • Sony S, Gamage S, Sadhu A, Samarabandu J (2022) Vibration-based multiclass damage detection and localization using long short-term memory networks. Structures 35:436–451

  • Sun L, Shang Z, Xia Y, Bhowmick S, Nagarajaiah S (2020) Review of bridge structural health monitoring aided by big data and artificial intelligence: from condition assessment to damage detection. J Struct Eng 146(5):04020073

  • Tang Z, Chen Z, Bao Y, Li H (2019) Convolutional neural network-based data anomaly detection method using multiple information for structural health monitoring. Struct Control Health Monit 26(1):e2296

  • Teng S, Chen G, Liu Z, Cheng L, Sun X (2021) Multi-sensor and decision-level fusion-based structural damage detection using a one-dimensional convolutional neural network. Sensors 21(12):3950

  • Teng Z, Teng S, Zhang J, Chen G, Cui F (2020) Structural damage detection based on real-time vibration signal and convolutional neural network. Appl Sci 10(14):4720

  • Wang D, Zhang Y, Pan Y, Peng B, Liu H, Ma R (2020) An automated inspection method for the steel box girder bottom of long-span bridges based on deep learning. IEEE Access 8:94010–94023

  • Wang R, Chencho ASJ, Li J, Hao H, Liu W (2021) Deep residual network framework for structural health monitoring. Struct Health Monit 20(4):1443–1461

  • Wang W, Su C (2022) Automatic classification of reinforced concrete bridge defects using the hybrid network. Arab J Sci Eng 47(4):5187–5197

  • Xiao H, Wang W, Dong L, Ogai H (2021) A novel bridge damage diagnosis algorithm based on deep learning with gray relational analysis for intelligent bridge monitoring system. IEEJ T Electr 16(5):743–753

  • Xiao X, Xu YL, Zhu Q (2015) Multi-scale modelling and model updating of a cable-stayed bridge, part II: model updating using modal frequencies and influence lines. J Bridg Eng 20(10):04014113

  • Xu YL (2018) Making good use of structural health monitoring systems of long-span cable-supported bridges. J Civ Struct Health 8(3):477–497

  • Xu YL, Xia Y (2012) Structural health monitoring of long-span suspension bridges. Spon Press (Taylor & Francis), UK

  • Xu Y, Zhang J, Brownjohn J (2021) An accurate and distraction-free vision-based structural displacement measurement method integrating Siamese network-based tracker and correlation-based template matching. Measurement 179:109506

  • Yang J, Zhang L, Chen C, Li Y, Li R, Wang G, Jiang S, Zeng Z (2020) A hierarchical deep convolutional neural network and gated recurrent unit framework for structural damage detection. Inf Sci 540:117–130

  • Yang K, Ding Y, Sun P, Jiang H, Wang Z (2021) Computer vision-based crack width identification using F-CNN model and pixel nonlinear calibration. Struct Infrastruct E 2021:1994617

  • Yang S, Huang Y (2021) Damage identification method of prestressed concrete beam bridge based on convolutional neural network. Neural Comput & Applic 33(2):535–545

  • Yao Y, Tung STE, Glisic B (2014) Crack detection and characterization techniques-An overview. Struct Control Hlth 21(12):1387–1413

  • Ye X, Jin T, Chen PY (2019) Structural crack detection using deep learning-based fully convolutional networks. Adv Struct Eng 22(16):3412–3419

  • Ye X, Li Z, Jin T (2022) Smartphone-based structural crack detection using pruned fully convolutional networks and edge computing. Smart Struct Syst 29(1):141–151

  • Yin H, Gai K (2015) An empirical study on preprocessing high-dimensional class-imbalanced data for classification. In: 2015 IEEE 17th international conference on high performance computing and communications, 2015 IEEE 7th international symposium on cyberspace safety and security, and 2015 IEEE 12th international conference on embedded software and systems, pp 1314–1319

  • Zhang B, Zhou L, Zhang J (2019b) A methodology for obtaining spatiotemporal information of the vehicles on bridges based on computer vision. Comput Aided Civ Inf 34(6):471–487

  • Zhang C, Chang C, Jamshidi M (2020) Concrete bridge surface damage detection using a single-stage detector. Comput Aided Civ Inf 35(4):389–409

  • Zhang C, Tian Y, Zhang J (2021) Complex image background segmentation for cable force estimation of urban bridges with drone-captured video and deep learning. Struct Control Hlth 29(4):e2910

  • Zhang L, Shen J, Zhu B (2022a) A review of the research and application of deep learning-based computer vision in structural damage detection. Earthq Eng Vib 21(1):1–21

  • Zhang Y, Lei Y (2021) Data anomaly detection of bridge structures using convolutional neural network based on structural vibration signals. Symmetry-Basel 13(7):1186

  • Zhang Y, Miyamori Y, Mikami S, Saito T (2019a) Vibration-based structural state identification by a 1-dimensional convolutional neural network. Comput Aided Civ Inf 34(9):822–839

  • Zhang Z, Yan J, Li L, Pan H, Dong C (2022b) Condition assessment of stay cables through enhanced time series classification using a deep learning approach. Smart Struct Syst 29(1):105–116

  • Zhao R, Yan R, Chen Z, Mao K, Wang P, Gao R (2019) Deep learning and its applications to machine health monitoring. Mech Syst Signal Pr 115:213–237

  • Zhu J, Li X, Zhang C, Shi T (2021) An accurate approach for obtaining spatiotemporal information of vehicle loads on bridges based on 3D bounding box reconstruction with computer vision. Measurement 181:109657

  • Zhu J, Zhang C, Qi H, Lu Z (2020) Vision-based defects detection for bridges using transfer learning and convolutional neural networks. Struct Infrastruct E 16(7):1037–1049

  • Zhu Q, Xu YL, Xiao X (2015) Multi-scale modelling and model updating of a cable-stayed bridge, part I: modelling and influence line analysis. J Bridg Eng 20(10):04014112

  • Zhu Z, German S, Brilakis I (2010) Detection of large-scale concrete columns for automated bridge inspection. Autom Constr 19(8):1047–1055


The work described in this paper was financially supported by the Changjiang Scholars Program of the Ministry of Education of China (SWJTU-YH1199911012201), to which the authors are most grateful. Any opinions and conclusions presented in this paper are entirely those of the authors.


The Changjiang Scholars Program of the Ministry of Education of China: SWJTU-YH1199911012201.

Author information


Guoqing Zhang reviewed the related literature and drafted the manuscript. Bin Wang and Jun Li proposed the framework of the paper and provided valuable advice. You-Lin Xu revised the manuscript and advised on the summary and prospects. All authors read and approved the final manuscript.

Corresponding author

Correspondence to You-Lin Xu.

Ethics declarations

Competing interests

You-Lin Xu is an honorary adviser and Bin Wang is a managing editor for Advances in Bridge Engineering. They were not involved in the editorial review of, or the decision to publish, this article. All authors declare that there are no competing interests.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit

About this article

Cite this article

Zhang, GQ., Wang, B., Li, J. et al. The application of deep learning in bridge health monitoring: a literature review. ABEN 3, 22 (2022).


  • Deep learning
  • Bridge health monitoring
  • Damage detection
  • Computer vision