Classifying bridges for the risk of fire hazard via competitive machine learning

This study presents a machine learning (ML) approach to identify the vulnerability of bridges to fire hazard. To develop this approach, data on a series of bridge fires was first collected and then analyzed through three algorithms, Random forest (RF), Support vector machine (SVM) and Generalized additive model (GAM), competing to yield the highest accuracy. As part of this analysis, 80 steel bridges and 38 concrete bridges were assessed. The outcome of this analysis shows that the proposed ML-based approach can be effectively applied to arrive at a risk-based classification of bridges from a fire hazard point of view. In addition, the developed ML algorithms are capable of identifying the most critical features that govern a bridge's vulnerability to fire hazard. In parallel, this study showcases the potential of integrating ML into structural engineering applications as a supporting tool for analysis (i.e. in lieu of experimental tests, advanced simulations, and analytical approaches). This work emphasizes the need to compile data on bridge fires from around the world into a centralized and open-source database to accelerate the integration of ML into fire hazard evaluation.


Introduction
Bridges are strategic structures that facilitate transportation and supply chain operations. As such, bridges are to be designed to withstand normal and extreme load conditions. However, in current practice, bridge design is carried out to mitigate most loading conditions (including wind and earthquakes), with the exception of fire hazard (AASHTO LRFD, 2017). From this perspective, there exist only a few general guidelines aimed at limiting the vulnerability of bridges to fire hazard, namely those in the National Fire Protection Association (NFPA) Report 502 (NFPA, 2017). It should be stressed that even the NFPA guidelines are general and qualitative in nature and are only applicable to bridges with spans greater than 300 m. Such bridges constitute only a small percentage of the total number of bridges in a given region.
Unlike building fires, which typically involve the burning of cellulosic materials, bridge fires are often triggered by the burning of hydrocarbon fuels and hence have been shown to reach temperatures exceeding 1000°C within a short period of time. Similarly, while structural systems in buildings are often insulated and protected by active fire protection measures (i.e. sprinklers), load bearing structural systems in bridges continue to be designed with little or no active or passive fire protection. Given the above, and noting that bridges are often located far from the nearest fire department, are continuously exposed to the surrounding environment and have an extended service life, bridges are vulnerable to extreme events, especially fire hazard. Recent incidents have shown that fires on bridges can lead to the development of significant thermally induced forces on connections, and result in collapse (NTSB, 2017;Eisel et al., 2007). Fortunately, bridge fires often extinguish quickly due to burnout of the limited fuel present or firefighting activities. However, although such incidents may not cause collapse, they can still induce significant damage to load bearing members, which can result in closure of the bridge for weeks for repair and retrofitting (Garlock et al., 2012).
The above discussion implies that it is of the highest importance to properly identify vulnerable bridges from a fire hazard perspective, so that authorities can take appropriate actions at the design stage to improve the resilience of such bridges. However, the large number of bridges (e.g. more than 660,000 and more than 878,000 operational bridges in the US and China, respectively) implies that identifying bridges vulnerable to fire can be challenging (Statista, 2020;LTBP, 2020). It is due to such challenges that little research has been directed towards identifying fire-vulnerable bridges (Giuliani et al., 2012;Quiel et al., 2015;Aziz & Kodur, 2013;Kodur et al., 2017;Kodur & Naser, 2019;Alos-Moya et al., 2017;Ma et al., 2019). Of the limited existing works, the majority applied methods similar to those adopted in identifying bridges vulnerable to wind and seismic hazards (i.e. importance factors) (Naser & Kodur, 2015a). Other works applied statistical and fragility analysis methods to arrive at a methodology for assessing bridges under fire (Gidaris et al., 2017).
Other than the above noted traditional methods, machine learning (ML) continues to present itself as a novel and effective approach to tackle data-oriented problems in the civil engineering domain (Naser, 2018;Naser, 2019a;Gandomi et al., 2011;Solhmirzaei et al., 2020;Taffese & Sistonen, 2017;Hodges et al., 2019). For example, ML methods have proven effective when applied to a variety of problems within the domain of bridge design and maintenance, including bridge assessment, seismic analysis of bridges (Mangalathu & Jeon, 2019), maintenance of bridges (Okazaki et al., 2020), and traffic path planning (Zuo et al., 2019). However, such ML approaches are yet to be applied to classifying bridges with respect to fire hazard.
This paper aims to bridge the above knowledge gap by applying ML to identify and classify bridges according to their vulnerability to fire hazard. Three algorithms, namely Random forest (RF), Support vector machine (SVM) and Generalized additive model (GAM), are developed and applied to examine how various features extracted from a large set of bridges, traffic flow and fire incidents can influence vulnerability to fire. These algorithms are trained to analyze 80 steel bridges and 38 concrete bridges in pursuit of learning hidden patterns responsible for a bridge's vulnerability to fire hazard. Overall, all algorithms performed well, with an accuracy of about 70% and a classification rate of about 100 bridges per minute. Due to the learning nature of ML algorithms, the algorithms developed herein can be further fine-tuned with the addition of new bridge-related features and fire incidents. A key take-home message is that ML can be a valuable tool to automatically analyze large bridge populations and identify those of high vulnerability to fire.

Development of bridge fire database
To effectively develop an ML-based approach, a good set of fire incidents that occurred on bridges is needed. Thus, a comprehensive literature review was first carried out to document notable bridge fire incidents. This review documented key and common factors that govern the response of bridges to fire, as documented in departments of transportation (DoTs) reports and from consultation with practicing engineers (Eisel et al., 2007;Quiel et al., 2015;NYDOT, 2008;Bocchini et al., 2014;Qiang et al., 2009;Davis & Tremel, 2008;Guthrie et al., 2009;Culliton, 2018). These documented factors include bridge (structural) features, traffic flow patterns, and fire characteristics. Overall, this survey led to collecting data on fire incidents in 118 bridges (see Fig. 1). While this study considered three main feature groups, other features can also be included once information on such features is reliably obtained or collected. It is our intention to present a general approach to enable adoption of ML into this domain, and we invite interested readers to extend and update the presented database and approach as shown in earlier works (Naser & Kodur, 2015b).

Bridge features
The identified physical features that govern the vulnerability of bridges to fire hazard include the structural system and construction materials used in load bearing elements, and the span and age of the bridge. Figure 1 shows that the compiled database comprises 80 steel bridges and 38 concrete bridges that experienced fire incidents over the last three decades. The same figure also shows that out of these bridges 17 were box-based, 15 were cable-based, 65 were girder-type bridges, and 22 were truss-like bridges. In terms of bridge span, the average span of all compiled bridges is 117 m. The full distribution of spans is shown in Fig. 1. Finally, the average age of the collected bridges is 45 years, which is consistent with that reported by US DOTs (LTBP, 2020).

Traffic features
Within traffic features, both geographical significance and number of lanes on the bridge were included, as they represent the significance of the bridge to the region, expected traffic flow and availability of alternative routes. These factors indirectly imply the adverse consequences of the loss of functionality of the bridge due to fire. The geographical significance of bridges is grouped under three classes: rural, suburban and urban, as noted in a previous work by the authors (Kodur & Naser, 2013). Figure 1 shows that there are 32 rural bridges and 43 suburban and urban bridges. In addition to geographical significance, the distribution of the number of lanes shows that 50% of the bridges carry 1-3 lanes and the other half carry 4-11 lanes.

Fire features
Herein, two features are identified to be of importance: the type of fuel likely to be involved in burning and the position of fire breakout on the bridge. For the first feature, fuel type varied between gasoline/diesel, or hydrocarbon fuels, and other types of flammables (i.e. chemicals, wildfires etc.); see Fig. 1. For simplicity, three positions for fire breakout were considered: in the vicinity of the bridge, above the bridge and under the bridge, with 4, 56 and 58 bridge fires belonging to these positions, respectively.

Damage magnitude
Contingent upon the severity of fire, the magnitude of damage the bridge experiences and any possible traffic stress on the surrounding transportation network can vary. On one hand, if a bridge does not experience significant structural damage from fire, then this bridge can be re-opened for traffic in short order. On the other hand, moderate to major damage to structural members of a bridge requires proper inspection and repair, which in turn necessitates closure of the bridge for safety reasons. To enable such inspection and timely repairs, through traffic on the route needs to be reduced and detoured. Thus, there are two classes of damage that are considered herein: no damage to the bridge structure (does not necessitate full shutdown), and damage (necessitates shutdown). Overall, 69 of the surveyed bridges experienced nil to minor damage, and 66 underwent major damage (including collapse).
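The bridge, traffic and fire features described above, together with the two damage classes, can be thought of as one record per fire incident. The following is a minimal sketch of such a record and of one possible numeric encoding. The class name, field names, category vocabularies and the example values are all hypothetical and chosen for illustration only; they are not taken from the compiled database.

```python
from dataclasses import dataclass

# Assumed category vocabularies, loosely following the feature descriptions.
MATERIALS = ("steel", "concrete")
SYSTEMS = ("box", "cable", "girder", "truss")
LOCATIONS = ("rural", "suburban", "urban")
FUELS = ("gasoline/diesel", "hydrocarbon", "other")
FIRE_POSITIONS = ("vicinity", "above", "under")


@dataclass(frozen=True)
class BridgeFireRecord:
    material: str
    system: str
    span_m: float
    age_years: float
    location: str
    lanes: int
    fuel: str
    fire_position: str
    damaged: bool  # True = damage that necessitates shutdown


def one_hot(value, vocabulary):
    """Return a 0/1 list with a single 1 at the position of `value`."""
    if value not in vocabulary:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == item else 0.0 for item in vocabulary]


def encode(record):
    """Turn one record into a (feature_vector, label) pair."""
    features = (
        one_hot(record.material, MATERIALS)
        + one_hot(record.system, SYSTEMS)
        + [record.span_m, record.age_years]
        + one_hot(record.location, LOCATIONS)
        + [float(record.lanes)]
        + one_hot(record.fuel, FUELS)
        + one_hot(record.fire_position, FIRE_POSITIONS)
    )
    label = 1 if record.damaged else 0
    return features, label


# A made-up incident, not an entry from the database.
example = BridgeFireRecord("steel", "girder", 117.0, 45.0, "urban", 4,
                           "hydrocarbon", "under", True)
vector, label = encode(example)
```

One-hot encoding is only one option here; tree-based algorithms can also work with integer-coded categories.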

Description of machine learning approach
This section presents the general approach and the steps associated with the development of the ML approach and its algorithms.

General approach
To apply an ML approach to a problem, a user must select a series of ML algorithms. The selection process can be purely arbitrary or can be the result of a sensitivity analysis (Barber, 2012). Oftentimes, the use of a single ML algorithm to understand a phenomenon can be sufficient. However, recent experience has shown that this practice might lead to biased ML-based solutions in some situations, and in a few other instances it may not produce a near-optimal solution in a timely manner. With this consideration, this study explores the use of multiple algorithms to harness the advantages of multi-algorithm search. In this approach, ML algorithms search in a competitive arrangement for the best possible solutions (which, from the view of this study, refers to accurately classifying bridges for the risk of fire hazard). Once a solution is identified by each algorithm, a series of fitness metrics are applied to identify the fittest solution for the problem (Naser & Alavi, 2020). Following this procedure, the identified solution is not only vetted across different search mechanisms but is also vetted through different ML analysis stages (see Fig. 2). Once an ML algorithm is properly validated, it is ready for deployment to assess new bridges for fire hazard. With the addition of new bridge fires and information, the algorithm can be re-tuned to improve its prediction capability.
Once the bridges vulnerable to fire are identified, these bridges can be provided with the measures needed to enhance fire safety and minimize their vulnerability to fire risk. Such measures include providing fire insulation to steel members or minimizing the likelihood of fire in the vicinity of the bridge (e.g. no storage of flammable materials under bridges). Other solutions can also be adopted as noted in recent works (Naser, 2019b).
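The competitive arrangement described above can be sketched as a small selection routine: every candidate is scored with the same fitness metric on the same held-out samples, and the fittest one is kept. This is an illustrative sketch only. The three rule-based "classifiers" are hypothetical stand-ins for trained RF, SVM and GAM models, and the held-out samples are made up, so the resulting scores say nothing about the performance reported in this study.

```python
def accuracy(predict, samples):
    """Fraction of (features, label) pairs that `predict` labels correctly."""
    correct = sum(1 for features, label in samples if predict(features) == label)
    return correct / len(samples)


def select_fittest(candidates, samples, metric=accuracy):
    """Score every candidate on the same samples and return the best one."""
    scores = {name: metric(predict, samples) for name, predict in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical stand-ins for trained classifiers (1 = damage, 0 = no damage).
candidates = {
    "RF": lambda f: 1 if f["fuel_hydrocarbon"] and f["span_m"] > 50 else 0,
    "SVM": lambda f: 1 if f["span_m"] > 100 else 0,
    "GAM": lambda f: 1 if f["fuel_hydrocarbon"] else 0,
}

# Made-up held-out samples, for illustration only.
held_out = [
    ({"fuel_hydrocarbon": True, "span_m": 120.0}, 1),
    ({"fuel_hydrocarbon": True, "span_m": 30.0}, 0),
    ({"fuel_hydrocarbon": False, "span_m": 150.0}, 0),
    ({"fuel_hydrocarbon": True, "span_m": 80.0}, 1),
]

best_name, scores = select_fittest(candidates, held_out)
```

Passing the metric as a parameter keeps the routine usable with other fitness metrics, such as the area under the ROC curve.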

Random forest (RF)
Random forest (RF) is an algorithm that capitalizes on the principles of ensemble learning (in which a tree-like algorithm is applied multiple times and the resulting trees are joined together to form a more powerful prediction model that applies the majority voting principle); see Fig. 3. RF can be used in classification and is defined as a nonparametric classifier (i.e. it does not require assumptions to be made on the form of the relationship between the predictors and the response variable). In a classification problem, the majority voting method is used to arrive at the final output of the RF analysis. In a typical formulation of RF, $J$ is the number of trees in the forest, $k$ represents a feature in the observation, $K$ is the total number of features, and $c_{\mathrm{full}}$ is the average of the entire dataset (initial node).
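The symbols listed above match the additive decomposition that is commonly used to interpret random forest predictions. One such form is sketched below as an assumption rather than as the exact expression used in this study. The term $\mathrm{contrib}_j(x, k)$, the contribution of feature $k$ in tree $j$ for an observation $x$, is a hypothetical name that is not defined in the text.

```latex
F(x) = \frac{1}{J}\sum_{j=1}^{J} c_{j,\mathrm{full}}
     + \sum_{k=1}^{K}\left(\frac{1}{J}\sum_{j=1}^{J}\mathrm{contrib}_j(x, k)\right)
```

Read this way, the prediction is the forest-averaged value at the initial node plus the forest-averaged contribution of each feature.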

Support vector machine (SVM)
Support vector machine (SVM) is an algorithm often applied to classification problems. SVM arrives at solutions by obtaining a separating hyperplane among classes (see Fig. 4). The SVM algorithm can be illustrated by considering a training data set $T = \{(x_i, y_i),\ i = 1, 2, \ldots, N\}$. This data set consists of $N$ $m$-dimensional feature vectors $x_i$ and their corresponding labels $y_i \in \{-1, 1\}$. SVM aims to find the separating boundary between two or more classes. This is done by maximizing the margin between the decision hyperplane and the data set, while minimizing the misclassification. The decision/separating hyperplane is defined as

$$w^{T} x + b = 0$$

where $w$ represents the weight vector defining the direction of the separating boundary, whereas $b$ denotes the bias. The decision function is defined as

$$f(x) = \mathrm{sgn}\left(w^{T} x + b\right)$$

where

$$\mathrm{sgn}(\alpha) = \begin{cases} 1, & \alpha \geq 0 \\ -1, & \alpha < 0 \end{cases}$$

The SVM algorithm maximizes the margin by minimizing $\|w\|$, which results in the following constrained optimization problem

$$\min_{w,\, b,\, \xi}\ \tau_1(w, \xi) = \frac{1}{2}\|w\|_2^2 + C\sum_{i=1}^{N}\xi_i \quad \text{subject to} \quad y_i\left(w^{T} x_i + b\right) \geq 1 - \xi_i,\quad \xi_i \geq 0,\quad i = 1, \ldots, N$$

where $\tau_1(\cdot)$, $\|\cdot\|_2$, and $\xi_i$ denote the objective function, the $L_2$-norm, and the slack variables, respectively, and $C$ is the penalty parameter of the standard soft-margin formulation that weighs misclassification against margin width. When the data is linearly inseparable, SVM offers an alternative solution for classification. To this end, SVM employs a kernel trick, projecting the data into a higher dimensional feature space to make the data separable, as illustrated in Fig. 4 (Han et al., 2012). The kernel function, in fact, defines the nonlinear mapping from the input space into a high dimensional feature space.
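The decision function and the role of the slack variable can be illustrated with a few lines of code. This is a sketch of the linear case only; the weight vector, bias and sample points are hypothetical values chosen for readability, not fitted parameters.

```python
def sgn(alpha):
    """Sign function as defined in the text: 1 for alpha >= 0, otherwise -1."""
    return 1 if alpha >= 0 else -1


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def decide(w, b, x):
    """Linear SVM decision function f(x) = sgn(w . x + b)."""
    return sgn(dot(w, x) + b)


def slack(w, b, x, y):
    """Smallest slack satisfying y (w . x + b) >= 1 - slack with slack >= 0."""
    return max(0.0, 1.0 - y * (dot(w, x) + b))


# Hypothetical hyperplane and labelled samples.
w, b = [2.0, -1.0], -1.0
inside_margin = ([1.0, 0.5], 1)     # w . x + b = 0.5, correct side but within the margin
well_separated = ([0.0, 2.0], -1)   # w . x + b = -3.0, correct side and outside the margin

label_a = decide(w, b, inside_margin[0])
slack_a = slack(w, b, *inside_margin)
label_b = decide(w, b, well_separated[0])
slack_b = slack(w, b, *well_separated)
```

A sample that lies on the correct side but inside the margin needs a positive slack, whereas a well-separated sample needs none.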

Generalized additive model (GAM)
Generalized additive model (GAM) is a nonparametric extension of the Generalized linear model (GLM). GAM can be useful in scenarios where a user may not have an a priori reason or preference for choosing a particular algorithm or response function (such as linear, quadratic, etc.). GAM separates features into knots, and then attempts to fit polynomial functions between such knots. In GAM, the model fit follows a deviance/likelihood, and hence fitted models are directly comparable using likelihood techniques. In GLM, the outcome class ($Y$) of a phenomenon is assumed to be a linear combination of the coefficients ($\beta$) and features ($x_1, \ldots, x_n$) as seen in Eq. 5.

$$Y = \beta_0 + \beta_1 x_1 + \cdots + \beta_n x_n \qquad (5)$$

where $\beta_0$ is the intercept.
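For comparison with the linear form used in GLM, the additive form that GAM usually takes is sketched below. This is the standard textbook expression and is given here as an assumption; the link function $g$ and the smooth functions $f_i$ are notation introduced for this sketch and do not appear in the text.

```latex
g\left(E[Y]\right) = \beta_0 + f_1(x_1) + f_2(x_2) + \cdots + f_n(x_n)
```

Each coefficient-times-feature term of the linear model is replaced by a smooth function of that feature, which is what allows the response shape to be learned rather than chosen beforehand.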

Machine learning model development and validation
The algorithms discussed in Section 3 are applied to analyze the compiled database presented in Section 2. To start, the compiled database was randomly shuffled to minimize bias that might arise from a particular feature or fire incident. After that, the database was split into a training set (80%) and a testing and validation set (20%), with the latter used to evaluate the performance (i.e. fitness) of the machine learning techniques once the training process is complete (Hasni et al., 2018). In addition, k-fold cross validation was also applied. In this technique, the database is further divided into k subsets. Each subset is held out in turn and the procedure is repeated k times, such that each time one of the k subsets is used as the test/validation set and the other k-1 subsets are put together to form a training set. This method helps reduce bias and variance and limits overfitting of the algorithms. A value of k = 5 is used herein. In all cases, the results of the ML analysis are examined via the following performance metrics:
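The shuffling, the 80/20 split and the 5-fold procedure described above can be sketched with index lists. This is an illustrative sketch under stated assumptions: the function names and the fixed seed are hypothetical, and the folds are drawn here from the training portion only, which is one possible reading of the procedure.

```python
import random


def train_test_split(n_samples, test_fraction=0.2, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into train and test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold(indices, k=5):
    """Yield (train, validation) index lists; each index is validated exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# 118 incidents, as in the compiled database.
train_idx, test_idx = train_test_split(118)
splits = list(k_fold(train_idx, k=5))
```

With 118 incidents this gives 94 training and 24 testing indices, and each of the five folds holds out 18 or 19 of the training indices.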

Area under the ROC curve (AUC)
This metric measures the two-dimensional area underneath the entire Receiver Operating Characteristic (ROC) curve, with the best performance reaching 100%, such that:

$$\mathrm{AUC} = \sum \frac{1}{2}\, w \left(h + h'\right)$$

where $w$ = width, and $h$ and $h'$ = heights of the two sides of each trapezoid under the curve.
Table 1 shows the confusion matrix and fitness metrics for all algorithms. It is worth noting that the overall accuracy of these techniques is quite promising. All of the aforementioned metrics indicate that the RF algorithm is more accurate than SVM or GAM. Overall, the listed metrics show that the proposed ML approach can be used to classify fire damage in bridges with confidence.
In addition, a sensitivity analysis was carried out on the proposed ML approach to identify the relative impact of each feature on the overall vulnerability of bridges. In Table 2, feature impact refers to the likelihood that increasing a specific feature leads to an increase in the outcome (i.e. if a feature has an impact of 80%, then 80% of the time an increase in this specific feature would lead to an increase in the likelihood of the bridge undergoing damage). Table 2 shows that the two features with the highest sensitivity (i.e. impact) are the fuel type involved in the fire incident and the span of the bridge (primarily girder bridges), with the age of the bridge and geographical significance coming next. The ML analysis was finally used to examine the association between the influencing features and explore the degree of dependence between the selected features. Table 3 lists such associations. It is clear that the association between features is minimal (less than 0.3), which implies that these features are largely independent of each other. This independence furthers our confidence in this analysis, as the selected features are at most indirectly related to each other.
The above ML approach can now be deployed by authorities to identify bridges vulnerable to fire hazard. The outcome of this example shows that predictions from the RF, SVM and GAM algorithms may not always agree with actual incidents, given that the above accuracy metrics fall short of 100%. Still, the proposed ML approach remains feasible, as it can be extended beyond the above three algorithms.
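The trapezoidal calculation of the area under the ROC curve described above can be sketched as follows. The scores and labels are made-up values for illustration, not outputs of the algorithms in this study, and the function names are hypothetical.

```python
def roc_points(scores, labels):
    """Return ROC points (false positive rate, true positive rate) by sweeping thresholds."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc_trapezoid(points):
    """Sum the trapezoid areas w * (h + h') / 2 between consecutive ROC points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up classifier scores (higher = more likely damaged) and true labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
auc = auc_trapezoid(roc_points(scores, labels))
```

For this toy input, 8 of the 9 damaged/undamaged pairs are ranked correctly, so the area equals 8/9.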

Practical applications
The ML revolution is currently under way in parallel areas (Hamet & Tremblay, 2017;Litman, 2014), and it is of merit to the bridge community to start planting seeds to allow the use of ML in bridge applications. In addition, recent engineering graduates are becoming very familiar with ML, partly due to their engagement with modern technologies. The same students will be leading our area in the coming 10 years or so, and hence current works can start seeding for wide implementation of ML in the near future. For example, the approach proposed herein can be used to evaluate the fire resistance of bridge components (i.e. girders and piers). To extend the applicability of the proposed approach to bridge fire resistance design, a larger dataset is to be compiled, which will require tremendous effort as data on bridge fires are not easily accessible. We hope future works will be able to compile such a database to allow the development of improved ML algorithms. The use of ML will allow engineers to draw conclusions on the vulnerability of a given bridge to certain extreme events by comparing its key features to those of the general population of bridges that failed under various conditions. The use of ML will help engineers to identify such bridges, and the associated events that may lead to failure, with ease. More specifically, properly developed ML tools can be trained to identify the combinations of factors that are associated with bridge failures (whether due to fire or other hazards). Based on the identified patterns, bridges with similar patterns will be flagged under a certain criterion, and DOT engineers can examine such bridges in more detail. This will not only reduce the amount of inspection work needed for DOT engineers to thoroughly analyze every single bridge, but will also provide a new set of eyes to the same engineers to examine bridges from a new perspective.

Conclusions
Based on the findings of this work, the following conclusions can be drawn.
Bridge fire incidents continue to rise around the world due to urbanization and increasing transportation of fuels and hazardous chemicals. However, there is currently a lack of methodologies for identifying bridges at risk of fire hazard, and of guidelines for designing bridges for fire safety.
ML can be successfully applied to develop bridge assessment tools that can identify the vulnerability of a bridge to fire hazard.
These ML-based techniques can be specifically tailored to account for varying features, such as those related to physical, traffic, and fire characteristics, in evaluating the risk of fire hazard on a specific bridge.
The proposed ML approaches can be improved further with the compilation of reliable data and observations of fire incidents on bridges, and the method can also be extended to assess the vulnerability of tunnels to fire hazard.

Abbreviations
GAM: Generalized additive model; ML: Machine learning; NFPA: National Fire Protection Association; RF: Random forest; SVM: Support vector machine