Fracture acoustic emission signals identification of stay cables in bridge engineering application using deep transfer learning and wavelet analysis

Stay cables are typically exposed to the environment and to traffic loading, which leads to degradation by corrosion and cyclic loading after years in service. A non-destructive method that detects cable defects as early as possible is needed for adequate maintenance of large-span bridges. This paper investigates a status-driven acoustic emission (AE) monitoring method based on a convolutional neural network (CNN), combining wavelet analysis and deep transfer learning. The CNN is used to construct the relationship between the scalograms of AE signals and the cable status. The trained CNN is suitable for identifying signals monitored in situ and for evaluating the current status of cables during the operation of a bridge. As a pilot study, a binary AE signal classification CNN is implemented to distinguish noise and fracture AE signals in static tests of a stay cable. The accuracy of the method is investigated. In addition, the trained model is examined using AE signals that were not used in the machine learning, to check the accuracy and its possible improvement. The expected recognition results and the potential of status-driven monitoring are addressed in the paper.


Introduction
Stay cables and high-strength steel wires are the most critical load-carrying members in a variety of civil structures. In bridge engineering applications, stay cables are highly vulnerable to long-term effects (Li et al., 2012a) (i.e. fatigue), vibration effects (Li & Ou, 2016) (i.e. wind- and rain-induced vibrations), and environmental actions (Li et al., 2012b) (i.e. corrosion) after years of service. Early detection and real-time monitoring of stay cable fracture are essential to prevent the complete collapse of a bridge and thus ensure its safety.
Various methods are available for detecting damage in bridge cables, including visual inspection, cable stress measurement (Lin et al., 2017; Zarbaf et al., 2017), magnetic flux leakage (MFL) testing (Xu et al., 2012), X-ray testing, and ultrasonic testing (Rizzo & Lanza Di Scalea, 2005). The acoustic emission (AE) technique is a typical non-destructive technique that is currently being developed to provide real-time monitoring of growing structural damage (Pomponi & Vinogradov, 2013; Bianchi et al., 2015; Feng et al., 2018; Cheng et al., 2019). This technique has proven effective in detecting damage and its location. Substantial effort, covering both non-iterative and iterative algorithms, has been devoted to locating the acoustic emission source, and applicable results have been reported for both the same medium and different media (Zhou et al., 2017). The successful application of AE to source identification and localization has inspired researchers to adapt the technique for damage monitoring and early damage warning of stay cables. Several studies have demonstrated the relationship between the characteristics of AE signals and wire fracture of cables under fatigue loading or corrosion.
However, identifying the signals from the sensors used to locate the damage is difficult because of the Kaiser effect (Choi et al., 2005), signal attenuation (Qian et al., 2016), signal reflection (Ebrahimkhanlou & Salamone, 2018) and environmental noise (Ebrahimkhanlou & Salamone, 2018). To identify the original signal, it is necessary to understand the types of noise sources and to eliminate their influence. Traditional AE signal classification methods are based on AE event parameters (e.g. amplitude, rise time, energy, counts), and the damage signals are identified by setting a threshold on these parameters. The performance of the threshold method depends strongly on the selected threshold value: an arbitrarily set threshold can trigger early or miss the true arrival time. The classification ability of such methods is therefore limited, which leads to incorrect damage locations in stay cables.
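The threshold dependence described above can be illustrated with a minimal sketch. The synthetic burst, the sampling rate, the two threshold values and the function name `ae_hit_features` are illustrative assumptions, not the settings of the acquisition system used in this study, and the feature definitions are simplified.

```python
import numpy as np

def ae_hit_features(signal, fs, threshold):
    """Threshold-based features of a single AE hit (simplified, illustrative definitions)."""
    above = np.flatnonzero(np.abs(signal) >= threshold)
    if above.size == 0:
        return None  # the hit is missed entirely at this threshold
    first, last = above[0], above[-1]
    peak = int(np.argmax(np.abs(signal)))
    # Counts: number of upward crossings of the positive threshold.
    over = signal >= threshold
    counts = int(np.count_nonzero(over[1:] & ~over[:-1]))
    return {
        "arrival_time": first / fs,
        "amplitude": float(np.abs(signal[peak])),
        "rise_time": (peak - first) / fs,
        "duration": (last - first) / fs,
        "counts": counts,
        # Energy taken here as the integral of the squared signal over the hit.
        "energy": float(np.sum(signal[first:last + 1] ** 2) / fs),
    }

# Synthetic burst (assumed): 60 kHz tone with a fast rise and an exponential decay.
fs = 1_000_000  # Hz, assumed sampling rate
t = np.arange(0, 2e-3, 1 / fs)
onset = 2e-4
env = np.where(
    t >= onset,
    (1 - np.exp(-(t - onset) / 2e-5)) * np.exp(-(t - onset) / 3e-4),
    0.0,
)
burst = env * np.sin(2 * np.pi * 60e3 * (t - onset))

low = ae_hit_features(burst, fs, threshold=0.05)
high = ae_hit_features(burst, fs, threshold=0.5)
print(low["arrival_time"], high["arrival_time"])
print(low["counts"], high["counts"])
```

For the same waveform, the higher threshold reports a later arrival time and fewer counts, and a threshold above the peak amplitude misses the hit altogether.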
Artificial intelligence (AI) and machine learning (ML) technologies are developing rapidly; in particular, deep learning (DL) is making great progress in computer vision (Krizhevsky et al., 2012; Mayrbaurl & Camo, 2004). Implementing artificial neural networks (ANNs) with ML or DL methods enables computers to perform time-consuming and labor-intensive identification tasks by learning from experience. A number of studies have shown that DL techniques can effectively improve classification accuracy in civil engineering applications (Gao & Mosalam, 2018; Wang et al., 2018; Kerh et al., 2017). As one of the DL techniques, the convolutional neural network (CNN) was developed to solve handwritten-digit recognition tasks around 1990 (LeCun et al., 1989). Its recent wide application is attributed to the rapid development of computer hardware and the boost from the ImageNet Visual Recognition Challenge (Simonyan & Zisserman, 2014). Compared with shallow learning based on BP neural networks, CNN-based deep learning requires no manual feature extraction (Kerh et al., 2017; Rafiei et al., 2017). CNN models are becoming increasingly deep in order to improve performance and achieve high accuracy, robustness, and adaptability in image classification (Krizhevsky et al., 2012; Mayrbaurl & Camo, 2004). However, building a new CNN requires time and effort, especially for hyper-parameter optimization: a hyper-parameter search can take weeks or even months for deep CNNs, and large data sets are required to configure and optimize the hyper-parameters at each epoch of the training process. Hence, building a new CNN from scratch is challenging. Transfer learning (TL) is an ML technique that transfers knowledge from source domains to target domains (Yosinski et al., 2014). In DL, a model developed for one task is reused as the starting point for a model on a second task, which relaxes the large-data requirement for training a deep CNN (Pan & Yang, 2010; Oquab et al., 2014; Bengio, 2012).
Given the high performance of TL CNNs in image classification, using them to classify AE signals is promising. Building on the research above, this paper proposes a status-driven AE monitoring method that combines wavelet scalograms (Klee & Allen, 2018) and TL CNNs to identify the status and remaining fatigue life of stay cables. As shown in Fig. 1, the signal characteristics and the remaining fatigue life are obtained from fatigue tests on cables. The time-domain AE signals corresponding to different damage states are converted to the time-frequency domain by the continuous wavelet transform (CWT) (Pan & Yang, 2010) to produce a scalogram image dataset. A CNN is employed to construct the relationship between AE signals and cable status. The trained CNN is then used to identify the signals obtained through in-situ monitoring and to evaluate the current status of cables in in-service bridges. Finally, the department responsible for bridge maintenance and management decides on the replacement of cables based on the reported status.
In this paper, a six-strand stay cable was axially loaded until failure and monitored using the AE technique. The noise and fracture AE signals were collected and then identified by analyzing the experimental results and the AE signal features. A binary AE signal classification is achieved by using the wavelet transform and TL CNNs based on GoogLeNet. The trained CNN model is validated with AE signals that were not used in the training. The study demonstrates the advantage and potential of the proposed status-driven AE monitoring method for diagnosing and monitoring damage of stay cables in real time.

Image datasets based on wavelet transform
Wavelets are mathematical functions that cut signals into different frequency bands and then study each band with a resolution matched to its scale (Ricker, 1953). The continuous wavelet transform (CWT) (Pan & Yang, 2010) provides an over-complete representation of the acoustic emission signals by letting the translation and scale parameters of the wavelet vary continuously. It is employed here to decompose the signals into frequency bands and to obtain local time information at a correlated resolution. Given a time series f(t), the continuous wavelet transform is expressed by the following integral:
$$W(s, \tau) = \frac{1}{\sqrt{|s|}} \int_{-\infty}^{+\infty} f(t)\, \overline{\psi}\!\left(\frac{t - \tau}{s}\right) dt$$
where f(t) is the monitored signal in the time domain; ψ(t) is the analyzing wavelet, a continuous function in the time-frequency domain; ψ(t) with the over-line represents the complex conjugate of ψ(t); s is the scale factor, whose inverse corresponds to the frequency; and τ represents the time shift or translation (Pan & Yang, 2010). The position of the wavelet in the time domain is given by τ and its position in the frequency domain is given by s. Therefore, by mapping the original series into a function of τ and s, the wavelet transform gives information on time and frequency simultaneously (Sejdic & Djurovic, 2008). In particular, the Morlet wavelet (Pan & Yang, 2010) is employed in this paper and expressed as below:
$$\psi_{\sigma}(t) = c_{\sigma}\, \pi^{-1/4}\, e^{-t^{2}/2} \left(e^{i \sigma t} - \kappa_{\sigma}\right)$$
where κ_σ is the parameter used to satisfy the admissibility criterion and c_σ is the normalization constant, defined as below:
$$\kappa_{\sigma} = e^{-\sigma^{2}/2}, \qquad c_{\sigma} = \left(1 + e^{-\sigma^{2}} - 2 e^{-3\sigma^{2}/4}\right)^{-1/2}$$
Fig. 1 Status-driven AE monitoring methods
The time-frequency characteristics of the acoustic emission signals are visualized by a scalogram after the wavelet transform, where the x-axis represents time, the y-axis represents scale, and the coefficient value is shown by varying the color. The scalogram is the wavelet equivalent of the spectrogram and can be used to identify the instantaneous frequency. The scalograms (Fig. 2-a), which are used for training and validating the CNNs, are created as RGB images to form the datasets. When an image with N × M pixel resolution is input into a computer, it is read as N × M × 3 numbers according to the RGB colors. For example, as shown in Fig. 2, the scalogram image with 224 × 224 pixel resolution (Fig. 2-a) has 150,528 numbers (Fig. 2-b), each from 0 to 255, read by the computer.
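The conversion from a time-domain signal to an RGB scalogram can be sketched as follows. This is a minimal NumPy illustration, not the MATLAB implementation used in the study: the value σ = 6, the synthetic burst, the sampling rate, the frequency grid, the nearest-neighbour resampling and the simple blue-to-red color ramp are all assumptions made for the example.

```python
import numpy as np

def morlet(t, sigma=6.0):
    """Complex Morlet wavelet with admissibility correction (sigma is an assumed value)."""
    kappa = np.exp(-0.5 * sigma**2)
    c = (1.0 + np.exp(-sigma**2) - 2.0 * np.exp(-0.75 * sigma**2)) ** -0.5
    return c * np.pi**-0.25 * np.exp(-0.5 * t**2) * (np.exp(1j * sigma * t) - kappa)

def cwt_morlet(f, fs, freqs, sigma=6.0):
    """Discretised W(s, tau) = |s|^(-1/2) * sum_t f(t) conj(psi((t - tau) / s)) dt."""
    n = len(f)
    dt = 1.0 / fs
    t = (np.arange(n) - n // 2) * dt           # wavelet support centred on zero
    out = np.empty((len(freqs), n), dtype=complex)
    for k, fc in enumerate(freqs):
        s = sigma / (2.0 * np.pi * fc)         # scale whose centre frequency is fc
        psi = morlet(t / s, sigma) / np.sqrt(s)
        # np.correlate conjugates its second argument, as the CWT integral requires.
        out[k] = np.correlate(f, psi, mode="same") * dt
    return out

def to_rgb_image(coeffs, size=224):
    """Map |W| to a size x size x 3 uint8 image with a simple blue-to-red colour ramp."""
    mag = np.abs(coeffs)
    mag = (mag - mag.min()) / (mag.max() - mag.min() + 1e-12)
    rows = np.linspace(0, mag.shape[0] - 1, size).round().astype(int)
    cols = np.linspace(0, mag.shape[1] - 1, size).round().astype(int)
    g = mag[np.ix_(rows, cols)]                # nearest-neighbour resampling
    rgb = np.stack([g, 1.0 - np.abs(2.0 * g - 1.0), 1.0 - g], axis=-1)
    return (255 * rgb).round().astype(np.uint8)

fs = 1_000_000                                  # Hz, assumed sampling rate
t = np.arange(1024) / fs
# Synthetic burst (assumed): 60 kHz tone under a Gaussian envelope centred at 0.4 ms.
signal = np.exp(-((t - 4e-4) / 5e-5) ** 2) * np.sin(2 * np.pi * 60e3 * t)
freqs = np.linspace(10e3, 100e3, 46)            # 10-100 kHz in 2 kHz steps
W = cwt_morlet(signal, fs, freqs)
image = to_rgb_image(W)
print(image.shape, image.size)
```

The resulting array has 224 × 224 × 3 = 150,528 entries between 0 and 255, which is the form in which a scalogram image is passed to the network.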

Architecture overview
Artificial neural networks (ANNs) (Schalkoff, 1997) are computing systems inspired by biological neural networks; they produce an output that depends on the input and the activation. An ANN connects the output of certain neurons to the input of other neurons, forming a directed and weighted graph, and learns by modifying the weights and activation functions. A CNN is an extension of the traditional ANN architecture. Like an ANN, a CNN consists of an input layer, hidden layers and an output layer. The significant difference is that a CNN is suited to pattern recognition within images, whereas a traditional ANN is impractical for image data because of the computational complexity. CNNs treat the input image as a 3D matrix (Fig. 2-b) and arrange the neurons of each layer in three dimensions. A CNN progressively reduces the number of neurons along the width and height, increases the depth, and eventually outputs neurons of size 1 × 1 × X for classification.

Basic building components of CNN
A CNN consists of several convolutional (CONV) layers with an activation function (ReLU) and pooling/subsampling (POOL) layers, optionally followed by fully connected (FC) layers. CNNs (Simonyan & Zisserman, 2014) express a single score function, from the raw image pixels to the classification scores, and also have a loss function (e.g. Softmax) on the last (fully connected) layer.
In the convolution layers, each filter is convolved along the width and height of the previous layer and outputs a 2D activation map of neurons. The activation function (ReLU) is applied in each artificial neuron and performs max(0, x) on the input. The convolution layers significantly reduce the complexity of the operations through a locally connected mode that extracts local features. To limit the unbounded nature of ReLU, local normalization is performed by local response normalization (LRN) layers, which conduct a mathematical operation on an n × n area. If the input neuron is x_i, the output neuron y_i can be described as below:
$$y_{i} = \frac{x_{i}}{\left(1 + \frac{\alpha}{n} \sum_{j} x_{j}^{2}\right)^{\beta}}$$
where n, α and β are hyper-parameters and the sum runs over the local region around x_i. The pooling layer is used to reduce the number of neurons along the width and height while the depth remains unchanged, with the aim of reducing parameters, controlling over-fitting, and keeping valid feature information. The convolution layers and pooling layers are repeated a few times before they are connected to the fully connected layers (see Fig. 3).
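The three operations described above can be sketched on a toy activation volume. This is a minimal NumPy illustration under stated assumptions: the LRN is written in its across-channel form with a window of n neighbouring channels (a within-channel variant would sum over a spatial neighbourhood instead), the pooling is non-overlapping 2 × 2 max pooling, and the function names and the random input are hypothetical.

```python
import numpy as np

def relu(x):
    """Element-wise max(0, x)."""
    return np.maximum(0.0, x)

def lrn_across_channels(x, n=5, alpha=1e-3, beta=0.75):
    """y_i = x_i / (1 + (alpha / n) * sum_j x_j^2)^beta over n neighbouring channels."""
    depth = x.shape[2]
    sq = x ** 2
    y = np.empty_like(x)
    half = n // 2
    for i in range(depth):
        lo, hi = max(0, i - half), min(depth, i + half + 1)
        scale = (1.0 + (alpha / n) * sq[:, :, lo:hi].sum(axis=2)) ** beta
        y[:, :, i] = x[:, :, i] / scale
    return y

def max_pool(x, size=2):
    """Non-overlapping max pooling over height and width; the depth is unchanged."""
    h, w, c = x.shape
    h2, w2 = h // size, w // size
    x = x[: h2 * size, : w2 * size]
    return x.reshape(h2, size, w2, size, c).max(axis=(1, 3))

rng = np.random.default_rng(0)
feature_map = rng.normal(size=(8, 8, 16))      # toy volume: height x width x depth
activated = relu(feature_map)
normalised = lrn_across_channels(activated)
pooled = max_pool(normalised)
print(feature_map.shape, pooled.shape)
```

The pooled volume has half the width and height of the input and the same depth, which is the reduction pattern that a CNN repeats until the output reaches 1 × 1 × X.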
Overfitting needs to be controlled in all machine learning approaches. Dropout layers randomly set input neurons to zero at a given rate. Previous research has shown that using dropout layers in the fully connected layers reduces overfitting effectively.

Construction and reconstruction
The training of CNNs includes a forward pass (construction) and a backward pass (reconstruction). In the forward pass, the input training samples are used to calculate the loss based on the loss function. In the backward pass, the gradient of each parameter is calculated by the backpropagation (BP) method (Schalkoff, 1997). The parameters are updated in the direction of the negative gradient, and the iteration continues until the network converges. In particular, stochastic gradient descent combined with the momentum method (SGDM) is employed in this study (Qian, 1999). SGDM updates the parameters using a portion of the training samples at a time. The number of samples in this portion is called the batch size. Updating the parameters with one batch is called an iteration, and one pass over all training samples is called an epoch (Kerh et al., 2017). The parameter update strategy takes the standard momentum form:
$$D_{t+1} = \mu D_{t} - l\, \nabla L(W_{t}), \qquad W_{t+1} = W_{t} + D_{t+1}$$
where W is the weight (the bias is treated similarly), D is the weight update, L is the loss function, l is the learning rate, μ is the momentum coefficient, and t is the iteration index.
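The meaning of batch, iteration and epoch in SGDM can be illustrated with a minimal sketch on a toy least-squares problem. The data, the learning rate, the momentum coefficient, the batch size and the function name `sgdm_step` are illustrative assumptions; they are not the settings used to train the CNN in this study.

```python
import numpy as np

def sgdm_step(W, D, grad, lr, momentum):
    """One SGDM iteration: D <- momentum * D - lr * grad, then W <- W + D."""
    D = momentum * D - lr * grad
    return W + D, D

# Toy least-squares problem (assumed): recover w_true from noise-free linear data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true

W = np.zeros(3)
D = np.zeros(3)
batch_size, epochs = 20, 50
iterations = 0
for epoch in range(epochs):                      # one epoch = one pass over all samples
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):   # one mini-batch = one iteration
        idx = order[start:start + batch_size]
        residual = X[idx] @ W - y[idx]
        grad = X[idx].T @ residual / len(idx)    # gradient of 0.5 * mean squared error
        W, D = sgdm_step(W, D, grad, lr=0.05, momentum=0.9)
        iterations += 1

print(iterations, W)
```

With 200 samples and a batch size of 20, each epoch consists of 10 iterations, so the 50 epochs above correspond to 500 parameter updates.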

Experimental scheme
A cable consisting of six strands of 36 wires each, with a length of 3 m and a nominal diameter of 77 mm, was used, as shown in Fig. 4. Two types of sensors were selected to cover an operating frequency range of 10-100 kHz (Casey et al., 1985). Four R3I AST sensors with a resonant frequency (Ref V/(m/s)) of 25 kHz and four R6I AST sensors with a resonant frequency (Ref V/(m/s)) of 55 kHz were chosen for comparison. The filter frequency ranges of the R3I AST and R6I AST sensors are 10-40 kHz and 40-100 kHz, respectively. Figure 4-a shows the sensor arrangement on the stay cable; the R3I AST sensors (S1/3/5/7) and the R6I AST sensors (S2/4/6/8) are represented in white and black, respectively. These are PAC integral-preamplifier sensors with an operating temperature range from −35°C to 75°C. Contact between the sensors and the cable was ensured by using couplant and insulation tape. The detection threshold was fixed at 35 dB for all sensors.
Fig. 3 Illustration of a typical convolutional neural network
In addition to the test machine, two LVDT (linear variable differential transformer) devices were attached to the cable to measure its elongation; they are represented as red crosses in Fig. 4-a. The cable was pre-stressed to 10% of the minimum breaking load (MBL) of 4850 kN. The axial tensile load was then applied under displacement control (loading rate 0.175 mm/s) until failure.

Test results and discussion
The cable did not separate completely, as shown in Fig. 6, where two different failure modes can be observed. Because axial stress governs in the wires, cup-and-cone failure, caused by a reduction in cross-section, dominates. In addition, shear failure appears due to the helical structure of the cable. The load-time curve of the cable is illustrated in Fig. 8. The measured breaking force is 5138 kN, which is 5.9% higher than the MBL prescribed by the manufacturer and within the expected range.
Determining the approximate time of the wire breaks is important for the subsequent identification of fracture and noise signals. Verreet (2005) concluded that the failure of a single wire reduces the breaking strength of a cable only locally, and by less than 1% in cables with multiple wires; the average reduction calculated from Fig. 7 is 0.34%. Based on the measured breaking load (5138 kN), the local reduction in this case study is around 17.5 kN. As detailed in the enlarged view of Fig. 8, each small load drop (approximately 17 kN) can therefore be regarded as an indication of a wire break in the cable. A new equilibrium is reached within the cable after each wire break, and the stress redistribution repeats until the remaining wires can no longer carry the external load. To verify the time identification method based on the load-time curve, a device that continuously recorded the sound was also put in place. Drummond et al. (2007) concluded that energy is the most effective parameter for discriminating wire breaks from other AE sources, such as the internal friction between individual cables, because the energy parameter is related to the strain energy of dislocations, fractures, and crack propagation. The cable failed in the region between S3/4 and S5/6. Figure 9 presents the energy-time results (blue line) and the load-time curve (orange line) for one R3I AST sensor (S7) and one R6I AST sensor (S8). For better illustration, the diagrams are limited to the time range from 950 to 1200 s, when the wire breaks occurred. Although the energy values from S7 are relatively higher than those from S8, the R3I AST sensor missed 6 out of 16 wire breaks. It can be concluded that the AE signals generated by wire fractures are mainly distributed in the frequency range of 40-100 kHz. In addition, Sensor 8 captured the fracture-related AE signals effectively despite being the farthest sensor. The same results can be observed in the AE signals recorded by S2, S4 and S6. Therefore, R6I AST sensors are more appropriate than R3I AST sensors for identifying wire breaks within the 3 m cable.
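The figures quoted above and the load-drop criterion can be checked with a short sketch. The breaking loads, the 0.34% reduction and the 6-out-of-16 count come from the text; the synthetic load-time record, its ramp rate, the sampling interval, the detection margin of half the expected drop and the function name `find_load_drops` are assumptions made only for illustration.

```python
import numpy as np

MBL_KN = 4850.0               # minimum breaking load prescribed by the manufacturer
BREAKING_LOAD_KN = 5138.0     # measured breaking force
REDUCTION_PER_WIRE = 0.0034   # 0.34 % average strength loss per broken wire (Fig. 7)

exceedance = BREAKING_LOAD_KN / MBL_KN - 1.0
expected_drop_kn = REDUCTION_PER_WIRE * BREAKING_LOAD_KN

def find_load_drops(time_s, load_kn, min_drop_kn):
    """Return the times at which the load falls by at least min_drop_kn between samples."""
    step = np.diff(load_kn)
    return time_s[1:][step <= -min_drop_kn]

# Synthetic load-time record (assumed): linear ramp with three 17 kN drops.
time_s = np.arange(950.0, 1200.0, 0.5)
load_kn = 4000.0 + 4.0 * (time_s - 950.0)
for t_break in (1000.0, 1080.0, 1150.0):
    load_kn[time_s >= t_break] -= 17.0
breaks = find_load_drops(time_s, load_kn, min_drop_kn=0.5 * expected_drop_kn)

# Share of wire breaks captured by the R3I AST sensor (6 of 16 missed).
detection_rate_r3i = (16 - 6) / 16

print(f"exceedance over MBL: {exceedance:.1%}")
print(f"expected drop per wire break: {expected_drop_kn:.1f} kN")
print("detected break times:", breaks)
print(f"R3I AST detection rate: {detection_rate_r3i:.1%}")
```

A record measured in a real test would contain noise, so the margin on the drop size would need to be tuned to the load cell and sampling rate at hand.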
After the occurrence of signals related to wire breaks has been determined, the typical fracture and noise AE signals obtained from the tests and their time-frequency scalograms are shown in Fig. 10 and Fig. 11. The wave shape depends on the nature of the emission source, which enables the identification and classification of AE sources. A signal from a fracture has a higher amplitude and a shorter duration than a signal caused by noise. Fracture-related signals show significant amplitude decay and a wide frequency bandwidth over a short duration. Although the difference between the two kinds of signal is obvious, there are no quantitative principles for distinguishing fracture from noise signals, which motivates the use of TL CNN methods.

Training and validating based on GoogLeNet
Based on the comparison above, the signals recorded by the R6I AST sensors (S2/4/6/8) are used for further signal processing by CWT. Specifically, the signals from Sensors 2, 4 and 6 are used as training data, while the signals from Sensor 8 are used as test data. Because the data set of AE signals is relatively small for a DL approach, TL CNNs (Krizhevsky et al., 2012) are used to classify the signals as noise or fracture, see Fig. 12.
Fig. 6 Failure mode of the stay cable
Fig. 7 Local reduction of the cable's breaking strength due to single wire break (Verreet, 2005)
Xin et al. Advances in Bridge Engineering (2020) 1:6
The status-driven AE CNNs are implemented in MATLAB (Klee & Allen, 2018) using the GoogLeNet neural network (Szegedy et al., 2015). As shown in Fig. 13, GoogLeNet has an increased depth of 144 layers and uses modules (namely the inception modules in Fig. 14) that connect convolution layers with convolution kernels in order to accelerate feature extraction at different scales. The 1 × 1 convolution kernels (Fig. 14-b) placed in front of the larger convolution kernels and the pooling layer are added to reduce over-fitting. Global average pooling layers before the final fully connected layer reduce the number of parameters.
Each layer in the network architecture can be considered as a filter. The earlier layers identify more common features of images, such as blobs, edges, and colors. The later layers focus on more specific features in order to differentiate categories. To retrain GoogLeNet for binary AE signal identification, the last four layers of the network are removed. The first of these four layers, 'pool5-drop_7x7_s1', is a dropout layer, which randomly sets input elements to zero with a given probability and helps to prevent over-fitting. The three remaining layers, 'loss3-classifier', 'prob', and 'output', contain information on how to combine the features that the network extracts into class probabilities and labels. Four new layers are then added to the layer graph for binary AE signal classification: a dropout layer with a dropout probability of 60%, a fully connected layer, a Softmax layer, and a classification output layer (with the labels "fracture" or "noise").
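The role of the four replacement layers can be sketched in NumPy as a stand-alone classification head. This is not the MATLAB layer graph used in the study: the random 1024-element feature vectors stand in for the pooled GoogLeNet features, the weights are untrained, the inverted-dropout scaling is an assumed convention, and the function names are hypothetical.

```python
import numpy as np

CLASSES = ("fracture", "noise")

def dropout(x, rate, rng, training):
    """Zero each input with probability `rate` during training; identity at inference."""
    if not training:
        return x
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)     # inverted-dropout scaling (assumed convention)

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def classify(features, weights, bias, rng=None, training=False, rate=0.6):
    """Replacement head: dropout -> fully connected -> softmax -> class label."""
    h = dropout(features, rate, rng, training)
    probs = softmax(h @ weights + bias)
    labels = [CLASSES[i] for i in probs.argmax(axis=-1)]
    return probs, labels

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 1024))          # stand-in for pooled GoogLeNet features
weights = 0.01 * rng.normal(size=(1024, 2))    # untrained, randomly initialised
bias = np.zeros(2)
probs, labels = classify(features, weights, bias)
print(probs.shape, labels)
```

During retraining only this head has to learn the two-class mapping from scratch, while the earlier layers start from the features learned on the source task.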
The input scalogram images have a resolution of 224 × 224 pixels. The AE signals obtained from Sensor 2 (1180 signals), Sensor 4 (1673 signals) and Sensor 6 (921 signals), 3774 AE signals in total, are used for machine learning. Of these, 80% are randomly selected for training and the remainder is used for validation. The base learning rate is set to 0.0001 and the dropout rate to 0.6. In the LRN layer, the hyper-parameters n, α and β are set to 5, 0.001 and 0.75, respectively. The SGDM strategy is employed to train the CNN. The loss value is displayed every 10 iterations, validation is performed every 252 iterations, and the maximum number of iterations is 5040. The training results are summarized in Fig. 15. The training loss decreases and becomes stable within the 5040 iterations. Initially, the training accuracy increases very quickly; afterwards it becomes steady, and the final average validation accuracy reaches 99.05%.
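The data split and the training schedule stated above can be reproduced with a short sketch. The signal counts and the iteration settings come from the text; the random seed and the rounding of the 80% share are assumptions, and the mini-batch size is not stated in the source, so the number of epochs is not derived here.

```python
import numpy as np

signals_per_sensor = {"S2": 1180, "S4": 1673, "S6": 921}
total = sum(signals_per_sensor.values())

rng = np.random.default_rng(0)                 # fixed seed only for reproducibility
order = rng.permutation(total)
n_train = round(0.8 * total)                   # rounding convention is an assumption
train_idx, val_idx = order[:n_train], order[n_train:]

max_iterations = 5040
validation_every = 252
loss_display_every = 10
n_validations = max_iterations // validation_every
n_loss_reports = max_iterations // loss_display_every

print(total, len(train_idx), len(val_idx))
print(n_validations, n_loss_reports)
```

Under these settings the validation set is evaluated 20 times during training; if each validation interval happened to coincide with one epoch, the run would span 20 epochs, but that reading depends on the unstated mini-batch size.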

Testing of trained binary CNN
The trained binary CNN is tested with 1890 AE signals from Sensor 8 (an R6I AST sensor). If the binary CNN assigns the correct label ("fracture" or "noise") to a signal, the score of that prediction is set to 1.0; otherwise, it is set to 0.0. The accuracy of the binary CNN is obtained by averaging all 1890 prediction scores. A high accuracy of 99.53% is reached in classifying these AE signals, which were not used in the machine learning. This accuracy is approximately identical to the validation accuracy. It indicates that the recognition results can be used in the current fracture-driven identification of AE monitoring signals and have promising potential for status-driven AE monitoring.
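The accuracy measure described above amounts to the mean of per-signal scores of 1.0 or 0.0. A minimal sketch follows; the five labels are toy values chosen for illustration, not the Sensor 8 data, and the function name `binary_accuracy` is hypothetical.

```python
import numpy as np

def binary_accuracy(predicted, actual):
    """Mean of per-signal scores: 1.0 for a correct label, 0.0 otherwise."""
    predicted = np.asarray(predicted)
    actual = np.asarray(actual)
    scores = (predicted == actual).astype(float)
    return scores.mean()

# Toy labels (assumed), not the Sensor 8 data.
actual = ["fracture", "noise", "noise", "fracture", "noise"]
predicted = ["fracture", "noise", "fracture", "fracture", "noise"]
accuracy = binary_accuracy(predicted, actual)
print(f"{accuracy:.2%}")
```

Because fracture signals are typically far rarer than noise signals, a per-class breakdown of the same scores would be a natural complement to this single average.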

Conclusions
A non-destructive method to assess the service status and evaluate the remaining fatigue behavior of cables in large-span bridges is needed for bridge maintenance and management. As a pilot study, this paper proposes a novel binary AE signal classification framework to identify fracture within stay cables, using TL CNNs based on GoogLeNet. A static tensile loading test of a stay cable was performed in the laboratory. The recorded AE signals were then transformed by CWT into scalogram image data sets for training and validation. The main conclusions from the pilot study are as follows:
(1) Wire fracture inside a stay cable can be detected with energy parameters captured from the recorded acoustic signals using appropriate sensors. For the experiment performed in this research, the R6I-AST type of sensor (100% wire break detection) is more suitable for detecting wire breaks inside a cable than the R3I-AST type of sensor (62.5% wire break detection).
(2) The status-driven acoustic emission (AE) monitoring convolutional neural network (CNN), which combines wavelet analysis and deep transfer learning, is a promising approach. The CNN is successfully used to construct the relationship between the scalograms of AE signals and the cable status. The trained CNN will be used to [...] (Szegedy et al., 2015) [...] AE monitoring convolutional neural network (CNN) will be trained based on AE signal scalograms and transfer deep learning. After that, the proposed status-driven AE CNN method will provide valuable information about the current status of cables in in-service bridges.