Autonomous end-to-end wireless monitoring system for railroad bridges

One of the most critical components of the US transportation system is railroads, accommodating transportation for 48% of the nation’s total modal tonnage. Despite such vital importance, more than half of the railroad bridges, an essential component of railroad infrastructure in maintaining the flow of the network, were built before 1920; as a result, bridges comprise one of the most fragile components of the railroad system. Current structural inspection practice does not ensure sufficient information for both short- and long-term condition assessment while keeping the operation cost low enough for mandatory annual inspection. In this paper, we document the development process of an autonomous, affordable system for monitoring railroad bridges using the wireless smart sensor (WSS) so that a complete end-to-end monitoring solution can provide relevant information directly from the bridges to the end-users. The system’s main contribution is to capture the train-crossing event efficiently and eliminate the need for a human-in-the-loop for remote data retrieval and post-processing. In the proposed system, an adaptive strategy combining an event-based and schedule-based framework is implemented. The wireless system addresses the challenges of remote data retrieval by integrating 4G-LTE functionality into the sensor network and completes the data pipeline with a cloud-based data management and visualization solution. This system is realized on hardware, software, and framework levels. To demonstrate the efficacy of this system, a full-scale monitoring campaign is reported. By overcoming the challenges of monitoring railroad bridges wirelessly and autonomously, this system is expected to be an essential tool for bridge engineers and decision-makers.


Introduction
Railroads are a critical component of US transportation and the US economy. On average, railroads carry 48% of the nation's total modal tonnage while emitting the least greenhouse gas compared to waterborne, truck, and air transport (O'Rourke et al. 2015; Preliminary Data 2017). Since 1980, freight railroads have spent more than $685 billion to maintain a safe freight rail network. This investment peaked in 2015 at $30.3 billion and remained high at $25.1 billion in 2019 (Association of American Railroads 2020). US freight shipments are also projected to rise from 17.8 billion tons in 2017 to 25.5 billion tons in 2040 (Association of American Railroads 2020). Such a large increase will put more stress on the structural integrity of the railroad system.
Unfortunately, more than half of the 100,000 railroad bridges, a crucial component of the railroad system, were built before 1920 (American Railway Engineer and Maintenance-of-way Association 2003). Thus, maintaining the aging bridge infrastructure in a state of good repair is becoming increasingly challenging. In the past, there have been multiple instances in which the railroad system and society were profoundly impacted by railroad bridges in adverse conditions, such as the 1993 Big Bayou Canot Bridge collapse in Mobile, Alabama, due to a bridge strike (Garner and Huff 1997), the 1997 train derailment in Kingman, Arizona, due to erosion from flash flooding (Mayville et al. 1999), and inaccurate structural ratings of bridges with obvious structural deficiencies (Gunderson 2015). In these cases, the preventative practice of using an easy-to-deploy and scalable monitoring system to provide rapid structural condition assessments to bridge engineers and operators at a remote monitoring center could be a straightforward solution.
To address these problems, on September 13, 2010, the Federal Railroad Administration (FRA) instituted a mandatory management program for all bridges. Under this new regulation, all railroad bridges need to be structurally inspected and rated at least annually. To date, visual inspection has proven to be sufficient (Federal Railroad Administration 2010). However, in many structures, especially complex ones, in-depth understanding to detect structural deficiency can only be accurately attained from quantitative structural health monitoring (SHM). Such SHM systems have already been realized using wired sensor systems on many structures, such as bridges (Tsing Ma Bridge, Kap Shui Mun Bridge, and Ting Kau Bridge (Ko et al. 1999; Wong 2004); Bill Emerson Memorial Bridge (Caicedo et al. 2002)), buildings (Millikan Library (Clinton 2006); One Rincon Hill Tower (Huang et al. 2012)), and other types of infrastructure (Mufti 2003; Ni et al. 2009). However, a major drawback is the cost of cabling and installation, which could be as high as $5000 to $22,000 per installed sensing channel (Farrar 2001; Celebi 2002).
To overcome the barrier of high cost, researchers and engineers have considered wireless sensors, which use radio communication to eliminate the need for cables in the system. Since the late 1990s, several generations of wireless sensors have been designed and applied to monitor real structures, but very few were aimed at monitoring railroad bridges, and none was designed explicitly for that purpose. Thus, even the most advanced and widely accepted solutions have not been well matched to monitoring one of the most critical infrastructure components for the economy. The reasons for such limitations are understandable, as the working conditions and requirements of railroad bridges are distinct from those of other types of civil infrastructure, as follows:
1) Unpredictable nature of train events: Information on the bridge operation under revenue service traffic is beneficial for bridge engineers (Tobias and Foutch 1997; Otter et al. 2012); however, capturing such information is challenging for most current WSS systems because the train can run hours ahead of or behind the predefined schedule. If not designed to account for unpredictable event timing, a monitoring system cannot capture a precise and complete record for the duration of the train crossing.
2) Limited energy: One of the most notable characteristics of wireless sensors is that they must operate with a limited energy budget, provided in the form of a battery (Lynch and Loh 2006). Most of the popular answers to this issue nowadays employ a duty-cycling strategy. This solution can address the energy constraint, but it comes with the inherent shortcoming of missing event data. Abandoning the duty-cycling strategy altogether, however, is impractical if a network of WSSs is to communicate in a timely and energy-efficient manner. An appropriate solution must be able to adapt the duty-cycling strategy flexibly to address this drawback.
3) The high cost of data retrieval and post-processing: Given the capability to harvest information over a long duration, ultimately, any smart system needs to be able to analyze and make decisions at the site. However, sensors generally require the ability to transmit data regularly to the operator. Without this functionality, regular checking or data aggregation requires a person to be present periodically. The associated labor cost increases significantly for bridges in remote areas, which applies to the majority of railroad bridges. Moreover, once the data is aggregated to the server for further analysis and processing, it is not useful until it reaches the engineers or bridge owners in an intuitive format and a timely manner, i.e., as actionable information. This step of data aggregation, management, and post-processing needs to be integrated with the data pipeline and considered part of the initial design process for best utilization.
Addressing these challenges is crucial to realizing a practical autonomous WSS monitoring system for railroad bridges. From the design perspective, challenges #1 and #2 are intertwined. Early practice focused on solving only one side of the problem, emphasizing either: 1) a schedule rendezvous scheme, which allows low energy usage but disregards the unpredictable nature of the train-crossing events, so that mostly useless data is collected and portions of the event are missed, or 2) an on-demand scheme, which requires an always-on sensor to acknowledge a train-crossing event, resulting in a reliable train event recording mechanism at the expense of a substantial amount of energy to sustain the system. Recently, researchers have also attempted to combine both approaches into one system. For example, Chebrolu et al. (2008), Popovic et al. (2016), Lynch et al. (2017), and Flanigan et al. (2020) implemented systems with the sensor nodes in a low-power state. The nodes only wake up and start the data acquisition if, during wake-up cycles, they receive a command over the radio indicating a train crossing from a dedicated, always-on event detection node (head node, sentinel node, or geophone). Even though these designs refine the early strategies, they still risk missing a significant portion of the train-crossing data because of the duty-cycle component of the framework, which can be a potential source of uncertainty if any accident occurs.
Unlike this popular solution, Lédeczi et al. (2009) and Bischoff et al. (2009) equip each sensor node with an always-on wake-up sensor for event-detection purposes. This design direction saves energy, reduces delays from the wake-up radio, and benefits from the reliable local event-detection mechanism. However, at the core, both systems still rely on a duty cycle to check for wake-up conditions. This duty-cycling aspect is manifested either by turning on the microcontroller periodically or by periodic sampling by the wake-up sensor. As a result, this kind of system still suffers from the shortcoming of missing events when the microcontroller or wake-up sensor is inactive. Other researchers attempt to solve the issue by focusing on efficient energy usage and harvesting. ECOVIBE (Liu et al. 2018) integrates a passive event detection circuit into each sensor node. Even though this work proposes an exciting approach to harvesting railroad bridge vibration for triggering and sustaining the embedded monitoring system, the startup delay of more than 0.47 s is suboptimal. In some rare cases, the train-crossing event could not be detected because the capacitor was not sufficiently charged. Other than vibration energy harvesting, one of the most popular solutions to sustain WSSs for SHM is to harvest solar energy and store it in rechargeable batteries. The critical limitation of this approach is low capacity or efficiency in low temperatures and overcast weather. This problem is worsened because the desired SHM system also needs to overcome the previously mentioned energy-consumption issue associated with capturing random train-crossing events. Therefore, despite multiple attempts to tackle the problem, a wireless system for railroad monitoring that comprehensively addresses both challenges #1 and #2 is not available.
On the other hand, the data retrieval problem mentioned in challenge #3 can be separated into two sub-challenges: sensor-to-server and server-to-user process, since each needs an independent solution. The cost of the sensor-to-server data retrieval process reduces significantly if its human-in-the-loop component is reduced. Some researchers proposed a system on a moving train for this task (Chebrolu et al. 2008;Chen et al. 2013). However, this concept is only economically feasible for a densely instrumented network of bridges, and therefore it requires an enormous initial cost. Another more straightforward practice is to have an on-site mini-computer with remote access functions (Jang et al. 2010;Kurata et al., 2011;O'Connor et al. 2016). Nevertheless, when scaling to an extensive network of railroad bridges, this technique introduces the cost for operating, maintaining, and securing the on-site computers, which in most cases, are not designed for continuous long-term operation. Therefore, a more effective data retrieval method that acts as a gateway from the sensor network to the remote researchers and engineers, while not incurring a high cost, is still not available in the literature or practice.
Besides, the primary goal of the server-to-user link is to manage and present data efficiently and intuitively. This step contains two major components. First, to manage the sensor data, researchers in SHM have discussed the use of a file-based data management system (Farrar 2001, Wong 2004), which is adequate for small networks but requires a slow manual process and becomes intractable as the system expands. Thus, researchers then approached the problem in an automated and software-assisted fashion, saving time in managing, organizing, and querying the data by utilizing a database management system (DBMS) (Li et al. 2006; Smarsly et al. 2012). Second, to enable ubiquitous access to the data, following the emerging cloud computing paradigm, researchers working on SHM systems for other types of structures have successfully implemented cloud infrastructure as scalable access to data visualization and analytic platforms (Fraser et al. 2010; Zhang et al. 2016; Jeong et al. 2018). Despite being available in general structural monitoring applications, these solutions have not yet been realized for railroad bridge monitoring.
Therefore, to address the aforementioned challenges, this paper presents a comprehensive WSS monitoring system designed specifically for railroad bridge monitoring. This system provides on-demand low-power operation, remote data retrieval with no human in the loop, and a data management and visualization platform. The system addresses the limitations regarding energy by relying on ultra-low-power sensor nodes and highly synchronized schedule-based communication. Furthermore, remote data retrieval is achieved by integrating a 4G LTE modem into a typical sensor node, which reduces the development effort and significantly improves the scalability of the deployments. Finally, the cloud-based web interface allows the engineer to access the data in near real-time using any internet-enabled handheld device. Results from a two-month field deployment show that the proposed approach performs as designed, capturing informative train-crossing data while facilitating rapid information retrieval.
2 Low-power hybrid WSS operation
As discussed in Section 1, WSSs were introduced to mitigate the high cost of wired sensing technology, offering a ubiquitous sensing solution that supports a better understanding of structures. The main advantages of WSSs stem from four major features: 1) the intelligence provided by the onboard microprocessor, which can handle digital signal processing, analog-to-digital converter (ADC) or frequency-to-code conversion, communication interface functions, and other condition-based decision-making functions; 2) small size, due to the increasing use of Micro-Electro-Mechanical Systems (MEMS)-based sensing components; 3) low cost, due to the mass production and multipurpose nature of MEMS-based components; and 4) wireless operation, which makes use of existing protocols for radio-frequency data transmission.
For the purpose of monitoring civil infrastructure, researchers have been working on WSSs since the mid-1990s, with one of the earliest efforts being the WiMMS system of Straser et al. (1999). Since the early 2000s, multiple generations of WSSs have become popular in the SHM research community. Most of the early platforms, such as the Mote, MICA, and Intel iMote sensor platforms, were developed into commercial products but are no longer available (Hill and Culler 2002, Kling 2003, Crossbow 2004, Zhao and Guibas 2004, Kling et al. 2005, Jo et al. 2011). More recently, WSSs have also been developed by industry. Some of the available products are BeanDevice, Crossbow, and Microstrain (Johnson et al. 2009, Beandevice 2020, LORD 2020). Nowadays, in addition to basic sensing and data logging functions, WSSs are equipped with multiple innovative features such as harsh-environment operation, real-time data visualization, and over-the-air programming. However, to this day, some of the challenges critical to civil infrastructure monitoring using WSSs remain either unsolved or not standardized in the new platforms. The remaining issues include low data resolution that prevents ambient vibration sensing, inflexible power management schemes that are inadequate for long-term full-scale deployments, and insufficient computational power that results in data inundation, to name a few. More detailed discussions are provided in Spencer et al. (2017) and Fu et al. (2019). To facilitate large-scale applications, Spencer et al. (2017) developed the Xnode smart sensor platform to provide a system capable of high-fidelity sensing, reliable communication, and efficient power and data management. While not yet able to address all of the aforementioned issues, this platform provides the tools that researchers need to overcome the remaining challenges. The remainder of this section provides an overview of this WSS platform and the recent enhancements adopted in this work.

WSS system overview
The Xnode (Figure 1) is a next-generation WSS that builds on previous research experience and is specifically designed to meet the needs of SHM applications, with high-resolution 24-bit data acquisition, a built-in tri-axial accelerometer, a three-channel strain bridge, long-range wireless communication (~1 km), and robust onboard processing capabilities. The Xnode is based on the Illinois Structural Health Monitoring Project (ISHMP) Services Toolsuite (ISHMP 2009), a software framework for continuous and robust civil infrastructure monitoring using WSSs. This software was developed under a collaborative effort between researchers in civil engineering and computer science at the University of Illinois at Urbana-Champaign.
The Xnode software employs a real-time operating system (FreeRTOS), which facilitates flexible real-time application development. The Xnode also retains the service-oriented architecture (SOA)-based middleware functionality of the Illinois SHM tool suite. This functionality provides all the fundamental tools for a network of WSSs (e.g., synchronized sensing, reliable communication, time synchronization, power management) (Fu et al. 2016, Spencer et al. 2017). The Xnode wireless sensors have been shown to operate reliably and provide useful and accurate results for SHM applications such as cable system identification, high-sensitivity sensing, and large-area capacitive strain monitoring (Zhu et al. 2018, Veluthedath Shajihan et al. 2020).

On-demand and schedule-based enabled sensor node
An intelligent SHM system must be able to react to structural excitation while following all predefined schedules. This requirement suggests an ideal sensor node that can capture railroad bridge vibration data under excitation (i.e., train-crossing events) immediately upon detecting the event, while still carrying out all scheduled functions at their predefined times with minimal wasted energy.
To address these issues, a system was developed using: 1) the ADXL362, a low-cost, ultralow-power, 3-axis MEMS accelerometer that consumes less than 2 μA at a 100 Hz output data rate and 270 nA in motion-triggered wake-up mode (Analog Devices 2016), and 2) the DS3231m, a MEMS-based real-time clock with a temperature-compensated crystal oscillator for highly accurate timekeeping of ±5 ppm (±0.432 s/day) (Maxim Integrated 2015).
Both hardware components contribute the essential characteristics that give the Xnode the means to react actively to high acceleration from train-crossing events while retaining the capability to follow a schedule rendezvous scheme. More importantly, both parts consume ultra-low power (365 μA) even while performing the highly accurate and critical functions of continuous measurement and triggering the system from low-power mode. More details regarding the hardware integration process are available in Fu et al. (2018).
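The ±0.432 s/day figure follows directly from the ±5 ppm accuracy. The short sketch below reproduces that arithmetic; the function names and the 1 s budget in the usage line are illustrative assumptions, not values from the datasheet or from the deployed firmware.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def drift_seconds_per_day(ppm: float) -> float:
    """Worst-case clock drift per day for an oscillator accuracy given in ppm."""
    return ppm * 1e-6 * SECONDS_PER_DAY


def days_until_drift(ppm: float, budget_s: float) -> float:
    """Days of free-running operation before worst-case drift reaches budget_s."""
    return budget_s / drift_seconds_per_day(ppm)


if __name__ == "__main__":
    print(f"{drift_seconds_per_day(5.0):.3f} s/day")  # 0.432 s/day
    # Hypothetical budget: how long until a single clock may be off by 1 s.
    print(f"{days_until_drift(5.0, 1.0):.1f} days")
```

A drift of this size is small per day but accumulates, which is why periodic resynchronization matters when nodes must meet within a short listening window.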

Hybrid adaptive on-demand and schedule-based sensing scheme
Despite having working hardware, a complete framework for rapid train-event capturing and efficient communication with low delay and low power consumption still needs to be proposed and realized. Specifically, the desired system must prioritize capturing train-crossing data while following a predefined schedule for other tasks. However, strictly following the schedule in unnecessary situations can result in wasteful use of energy. Thus, the monitoring system must be adaptive in treating the wake-up sources and in assessing whether to follow the next predefined task in the schedule. Therefore, a hybrid adaptive framework making use of the ADXL362 wake-up sensor and the DS3231m real-time clock is proposed. Figure 2 shows the flowchart of the framework. In short, the sensor can be woken up either by high acceleration detected by the ADXL362 or by an alarm from the DS3231m. Upon starting up, the node checks the ADXL362 flag, which indicates whether high acceleration has been detected, and then either starts high-fidelity sensing or waits for a remote command from the gateway node accordingly. When going back to sleep mode, the sensor sets the clock to fire the alarm AL1 in T1 hours and, if it has data to send to the gateway node, the other alarm AL2 in T2 minutes. Hence, T2 represents the expected delay in data retrieval, while T1 represents the next time the sensor and gateway nodes communicate. There are two principal functions following the waking up of a sensor node.
On-demand high-fidelity sensing: When the ADXL362 detects a train-crossing event, its activity interrupt fires, and the Xnode sensor node wakes up from ultra-low-power mode and starts sampling the high-resolution time-history signal using the onboard 24-bit ADC. This process stops once the inactivity interrupt of the ADXL362 fires, signaling that no activity has been detected for a preset amount of time. Online or post-processing of the raw data can then be carried out using the onboard ARM processor. Both raw and processed data are stored on the SD card. This high-fidelity sensing function has the highest priority, meaning it can preempt any ongoing task (except for another sensing activity) to start sensing.
Schedule-based remote command: When the system is not in sensing mode, it follows a network-synchronized adaptive duty-cycling strategy using the DS3231m alarms. During the 5 s of active mode upon startup, the sensor waits for remote commands from the gateway node and acts on any command received.
Note that the duty cycle of this system can adapt to the data collected by the node. If no new datasets are waiting to be sent to the gateway node, the sensor node works at a much lower duty cycle by using only AL1 to maintain its presence in the network (0.14% duty cycle if T1 = 1 h). If a new dataset is collected, the node prioritizes low retrieval delay by additionally enabling AL2, which fires periodically until the SendFlag is cleared, meaning the gateway has retrieved all new datasets. This strategy facilitates a low duty cycle without sacrificing low-delay data retrieval.
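The wake-up logic described above can be summarized in a few lines. The following is a minimal sketch under stated assumptions, not the Xnode firmware: the class, method, and command names (such as "retrieve") are hypothetical, while the 5 s window, the AL1/AL2 alarms, and the SendFlag behavior follow the text.

```python
from dataclasses import dataclass
from typing import Dict, Optional

ACTIVE_WINDOW_S = 5  # listening window after a scheduled wake-up (from the text)


@dataclass
class SensorNode:
    t1_hours: float = 1.0    # AL1 period: keep-alive rendezvous with the gateway
    t2_minutes: float = 5.0  # AL2 period: fast rendezvous while data is pending
    send_flag: bool = False  # True while datasets wait to be retrieved

    def on_wake(self, adxl_activity: bool, command: Optional[str] = None) -> str:
        """Decide what to do after a wake-up and return the action taken."""
        if adxl_activity:
            # On-demand path: record the train crossing, then mark data pending.
            self.send_flag = True
            return "sense"
        if command == "retrieve" and self.send_flag:
            # The gateway has collected the pending datasets; AL2 is no longer needed.
            self.send_flag = False
            return "send_data"
        if command is not None:
            return "run_command"
        return "idle"

    def alarms_before_sleep(self) -> Dict[str, float]:
        """Alarms to arm before sleeping, in seconds from now."""
        alarms = {"AL1": self.t1_hours * 3600}
        if self.send_flag:
            alarms["AL2"] = self.t2_minutes * 60
        return alarms


def keepalive_duty_cycle(t1_hours: float) -> float:
    """Duty cycle when only AL1 is active (no pending data)."""
    return ACTIVE_WINDOW_S / (t1_hours * 3600)


if __name__ == "__main__":
    print(f"{keepalive_duty_cycle(1.0):.2%}")  # 0.14%
```

Keeping AL2 conditional on the pending-data flag is what makes the schedule adaptive: an idle node pays only for the AL1 rendezvous.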
To illustrate the efficacy of the adaptive scheme, the operation states, with their corresponding current consumption and expected running time, are listed in Table 1 for both the non-adaptive and adaptive strategies. The assumptions are no solar panel attached and a 10,000 mAh battery with a compensation factor of 0.8 to account for reduced battery capacity due to environmental effects and long-term usage with multiple charge and discharge cycles (Hashmi et al. 2018). T1 is negligible for the estimation, so it is omitted. The only difference between the non-adaptive and adaptive schemes is the number of attempts to communicate with the gateway node. The adaptive scheme significantly reduces this number by checking whether any meaningful information is available for transfer before executing the task. Figure 3 shows an estimate of the number of days that one sensor node can operate on a typical heavy-traffic line (i.e., N_event = 10). This service life is calculated with eq. (1) from the states listed in Table 1, where t_state and I_state are the daily activity time and current consumption of each state, respectively. In the non-adaptive strategy, the sensor node wakes up every T2 minutes to attempt to send back data. In contrast, the adaptive scheme only attempts to transfer the data back N_event times, and thus can extend the battery life by nearly one month for the case of T2 = 1 min, or by nearly two weeks for the typically used case of T2 = 5 min. In deployments where the traffic is light (i.e., N_event = 2), the improvement is more pronounced, and the estimated battery life under the adaptive scheme can be nearly 6 months. This result shows that the adaptive scheme can protect the sensor nodes from depleting their energy on radio communication due to the inflexibility of a predefined schedule.
Fig. 2 The hybrid adaptive framework for on-demand and schedule-rendezvous schemes on the sensor node
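A service-life estimate of this kind can be sketched as the usable battery charge divided by the charge drawn per day. The form below (compensated capacity over the sum of t_state × I_state) is an assumption consistent with the quantities named in the text, not a reproduction of eq. (1), and every current and duration constant is a hypothetical placeholder because Table 1 is not reproduced here.

```python
from typing import Iterable, List, Tuple

State = Tuple[float, float]  # (seconds active per day, current in mA)

# Hypothetical placeholder values, NOT the entries of Table 1.
SLEEP_MA = 0.4   # assumed sleep current
RADIO_MA = 150.0  # assumed current while listening or transmitting
SENSE_MA = 170.0  # assumed current during high-fidelity sensing
WINDOW_S = 5.0    # listening window per communication attempt (from the text)
EVENT_S = 120.0   # assumed recording time per train crossing


def service_life_days(capacity_mah: float, states: Iterable[State],
                      compensation: float = 0.8) -> float:
    """Usable charge divided by the charge drawn per day."""
    daily_mah = sum(seconds / 3600.0 * current_ma for seconds, current_ma in states)
    return compensation * capacity_mah / daily_mah


def node_states(t2_minutes: float, n_events: int, adaptive: bool) -> List[State]:
    """Daily time budget: the adaptive scheme attempts one transfer per event,
    the non-adaptive scheme attempts one every T2 minutes."""
    attempts = n_events if adaptive else int(24 * 60 / t2_minutes)
    radio_s = attempts * WINDOW_S
    sense_s = n_events * EVENT_S
    sleep_s = 86400.0 - radio_s - sense_s
    return [(sleep_s, SLEEP_MA), (radio_s, RADIO_MA), (sense_s, SENSE_MA)]


if __name__ == "__main__":
    for adaptive in (False, True):
        days = service_life_days(10_000, node_states(5, 10, adaptive))
        print("adaptive" if adaptive else "non-adaptive", round(days, 1), "days")
```

With real Table 1 values substituted for the placeholders, the same two calls would give the non-adaptive and adaptive curves of Figure 3 for a chosen T2.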

4G LTE enabled gateway for remote data retrieval
The next element of the data-to-user pipeline extracts the data from the field-deployed sensor network and sends them back to a remote data repository for storage and processing. Because railroad bridges are typically in remote locations, a long-range communication method is necessary for this data retrieval process. Therefore, the researchers make use of readily available technology and infrastructure by utilizing the cellular network. This solution removes the limit on communication range, which is critical in providing ubiquitous access to the data. The researchers decided to interface the WSS gateway node directly with an embedded cellular modem to avoid the energy consumption of any intermediate device and to enhance scalability. Only a few researchers have pursued this direction, and a clear framework that reduces the cost and energy consumption enough to be practical for large-scale, long-term deployments has not been provided (Harms et al. 2009, Al-Radaideh et al. 2015, Admassu et al. 2019). Therefore, this section presents the integration of a 4G LTE modem into the WSS platform and proposes an energy-efficient framework that coordinates the communication between the gateway and sensor nodes to sustain the network and to retrieve and upload data reliably over long-term deployments.
Fig. 3 The battery life of a sensor node considering various values of T2

Hardware selection and design
To make use of the flexible hardware interfaces of the Xnode, the candidate options that allow wireless internet access for the Xnode are investigated. There are currently two defining trajectories in the development of cellular networks, which differ in bandwidth, cost, power consumption, and latency (Fig. 4).
Due to the upcoming shutdown of 3G infrastructure in the US (IEEE Communications Society 2019), only 4G LTE or more advanced cellular network options are considered, for their reasonable speed and cost. A safe candidate should have sufficient bandwidth so that it does not cause a communication bottleneck in any situation. The 4G LTE network is therefore selected, as it does not compromise bandwidth while its power consumption and cost remain reasonable. The selected component is the Sierra Wireless HL7588 LTE-CAT4 modem, which is commercially available off the shelf in the form of the Skywire 4G LTE CAT 4 Embedded Modem (Nimbelink 2020). The modem provides 4G LTE connectivity with major network providers, 3G fallback, 50 Mbps upload and 150 Mbps download speeds, and firmware over-the-air reprogramming (Sierra Wireless 2020).

Hardware and software integration
One of the main decisions made to simplify the design process is that the modem is not integrated directly into the PCB design of the Xnode. Instead, the modem is combined with the sensor system through an external adapter board, which essentially provides power, communication, and control pins. This module communicates with the Xnode using AT commands over a Universal Asynchronous Receiver/Transmitter (UART) serial port, with two GPIO pins for power control. Notably, this hardware integration is carried out on a typical Xnode, meaning that any sensor node can be converted into a gateway node with 4G LTE capability. As a result, the WSS network can be scaled easily at the site, as any sensor node can take the role of a gateway node with a simple plug-in hardware modification. Figure 5 shows the final design of the modem connected to an Xnode equipped with the ADXL362 and DS3231m on its radio board. As the gateway node is designed to consume more energy, a larger polycrystalline solar panel with higher power (10 W instead of 3 W) is used. Figure 6 presents a general flowchart of the software commands required for transmitting data from the Xnode. Each command is followed by an "OK" response as confirmation before the next one is issued. For the current AT&T network, the modem uses the Access Point Name (APN) "phone." The port used with the Message Queuing Telemetry Transport (MQTT) protocol, discussed more extensively in section 4, is 1883.
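The command-then-confirmation pattern can be sketched as a loop that stops at the first command not acknowledged with "OK". This is a minimal illustration with an assumed, simulated transport: `run_at_sequence`, `ModemError`, and `fake_modem` are hypothetical names, and the two commands shown are generic 3GPP AT commands rather than the sequence of Figure 6, whose modem-specific steps are not reproduced here.

```python
from typing import Callable, Iterable, List


class ModemError(RuntimeError):
    """Raised when the modem does not confirm a command with OK."""


def run_at_sequence(commands: Iterable[str],
                    exchange: Callable[[str], str]) -> List[str]:
    """Send commands in order; each must be confirmed before the next is sent.

    `exchange` stands in for the UART link: it takes a command string and
    returns the raw reply text from the modem.
    """
    confirmed = []
    for cmd in commands:
        reply = exchange(cmd)
        if "OK" not in reply.split():
            raise ModemError(f"{cmd!r} was not confirmed: {reply!r}")
        confirmed.append(cmd)
    return confirmed


def fake_modem(cmd: str) -> str:
    """Simulated modem used only for this example; it accepts every command."""
    return "\r\nOK\r\n"


if __name__ == "__main__":
    setup = [
        "AT",                          # basic attention check
        'AT+CGDCONT=1,"IP","phone"',   # define the PDP context with the APN
    ]
    print(run_at_sequence(setup, fake_modem))
```

Failing fast on a missing confirmation keeps the gateway from spending energy on later steps that cannot succeed.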
With access to the cellular network and the Internet, tasks requiring precise timing (second-level accuracy) and data uploading are realized. First, the network timekeeping task reads the time provided by the network, adjusts the clock of the gateway node, and propagates the timestamp to all of the sensor nodes in the network by repeatedly broadcasting it. This function compensates for the drift of the real-time clock (±5 ppm, or 0.432 s/day), keeping all the nodes synchronized with each other. Second, the data uploading application ideally runs immediately after the gateway detects that it has datasets to upload, previously retrieved from the sensor nodes. During this task, the gateway checks the datasets stored on the SD card and identifies those not yet sent based on the name tag, prioritizing the oldest sets. The gateway then follows its 4G LTE connection setup steps and uploads the data in binary form through the UART connection. After that, the gateway changes the name tag to mark the dataset as already sent.
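The upload bookkeeping described above (select untagged datasets, oldest first, then retag after a successful upload) can be sketched as follows. This is an illustration under assumptions: the ".new" and ".sent" tags, the timestamp-prefixed file names, and the `upload` callback are hypothetical, since the actual naming convention on the SD card is not given.

```python
import tempfile
from pathlib import Path
from typing import Callable, List

PENDING, SENT = ".new", ".sent"  # hypothetical name tags


def pending_datasets(sd_root: Path) -> List[Path]:
    """Datasets not yet sent, oldest first (names start with a timestamp)."""
    return sorted(sd_root.glob(f"*{PENDING}"), key=lambda p: p.name)


def upload_pending(sd_root: Path, upload: Callable[[bytes], bool]) -> List[str]:
    """Upload pending datasets in order; retag each one only after success."""
    uploaded = []
    for path in pending_datasets(sd_root):
        if not upload(path.read_bytes()):
            break  # keep this and newer datasets tagged as pending
        path.rename(path.with_suffix(SENT))
        uploaded.append(path.name)
    return uploaded


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "20200102T000000_node1.new").write_bytes(b"newer")
        (root / "20200101T000000_node1.new").write_bytes(b"older")
        print(upload_pending(root, lambda blob: True))
        print(sorted(p.name for p in root.iterdir()))
```

Renaming only after a confirmed upload means an interrupted session leaves the dataset pending, so it is retried on the next cycle rather than lost.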

Remote data retrieval framework integration
So far, all hardware and software components have been realized on the sensor node and the gateway node. To make the network operate reliably, a systematic framework for the gateway is proposed and realized. Like the framework on the sensor node, this framework is flexible and adjusts to the schedule adaptively.
A detailed flowchart of the framework is shown in Fig. 7. The core concept is that the gateway functions as a coordinator and data aggregator. Once the gateway wakes up, it immediately sends out a message containing an application ID. The sensor nodes use the application ID to proceed.
The primary applications of the framework are:
Data collection: This is the essential application of the gateway node. Therefore, the gateway follows a rigorous schedule, waking up every T2 minutes without exception to request data from the sensor nodes. After obtaining the data from the sensors, the gateway sets the UploadFlag, resets the node, and proceeds to the data uploading task.
Data upload: In most cases, this application runs immediately after the data collection, minimizing any retrieval delay. More details were discussed in section 2. Cellular signal quality can affect this task; thus, an exception is introduced. The most crucial feature of the network is its ability to adapt to real working conditions, among which inadequate signal quality is one of the most common. Without this exception, when cellular signal quality is insufficient and the UploadFlag is set, the gateway would repeatedly fail to upload the data to the repository until it runs out of battery. Thus, the exception clears the UploadFlag if the data upload is unsuccessful. The software then allows the flag to be set artificially (regardless of whether there is real data to send), so that uploading can resume once the signal quality recovers. In this implementation, this artificial setting of the flag is combined with the network timekeeping application.
Network timekeeping: This task ensures that all nodes run on the same clock, allowing efficient communication within the scheduled time window in which the sensor and gateway nodes overlap (5 s in this implementation). More details were provided in section 2.
Network checkup: This application requests the condition of all the sensor nodes in the network. Upon receiving the application ID from the gateway node, each sensor node replies with its voltage and current level readings. Similar to the data upload, this task forwards the network conditions to the repository over the 4G LTE network.
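The interaction between the UploadFlag, the poor-signal exception, and the timekeeping task can be sketched as below. This is a simplified model under assumptions, not the gateway firmware: the class and method names are hypothetical, the `upload` callback stands in for the 4G LTE upload, and radio exchanges with the sensor nodes are omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Gateway:
    upload: Callable[[], bool]  # returns True when the cellular upload succeeds
    upload_flag: bool = False
    log: List[str] = field(default_factory=list)

    def collect(self, got_new_data: bool) -> None:
        """Data collection, run every T2 minutes."""
        self.log.append("collect")
        if got_new_data:
            self.upload_flag = True
        self._upload_if_flagged()

    def timekeeping(self) -> None:
        """Network timekeeping; also sets the flag artificially so that a
        previously failed upload is retried once the signal may have recovered."""
        self.log.append("timekeeping")
        self.upload_flag = True
        self._upload_if_flagged()

    def _upload_if_flagged(self) -> None:
        if not self.upload_flag:
            return
        ok = self.upload()
        self.log.append("upload ok" if ok else "upload failed")
        # Cleared on success, and also on failure (the exception) so the gateway
        # does not retry on every cycle while the signal is poor.
        self.upload_flag = False


if __name__ == "__main__":
    signal = {"good": False}
    gw = Gateway(upload=lambda: signal["good"])
    gw.collect(got_new_data=True)   # upload fails, flag is cleared
    gw.collect(got_new_data=False)  # no upload attempt
    signal["good"] = True
    gw.timekeeping()                # flag set artificially, upload resumes
    print(gw.log)
```

The design trades a longer worst-case retrieval delay under poor signal for a bounded number of failed upload attempts, which protects the battery.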
Current consumption and expected daily activity time of each state are listed in Table 2. Based on this information and eq. (1), Table 3 shows an estimate of the operation time of the gateway node for a heavy-traffic line with N_event = 10. For the current deployments, a typical setting of T1 = 1 h and T2 = 5 min results in a gateway running at a 2.15% duty cycle and lasting about 65.3 days on a single 10,000 mAh lithium-ion battery (assuming no solar panel and a compensation factor of 0.8 (Hashmi et al. 2018)). In applications where there is no urgency in retrieving data, or where a longer retrieval delay is acceptable, the gateway can operate at a 0.49% duty cycle and last up to 225 days. This estimate shows that the system is flexible enough to serve a wide range of purposes in railroad bridge monitoring, from critical bridges requiring close attention to bridges in generally acceptable condition.
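As a sanity check on these figures, the stated lifetimes can be converted back into the average current draw they imply. The sketch below is not eq. (1); it only assumes that the compensation factor scales the nominal battery capacity and that the lifetime is the usable capacity divided by the average current. The function names are illustrative.

```python
HOURS_PER_DAY = 24


def average_current_ma(capacity_mah, lifetime_days, compensation=0.8):
    """Average current (mA) implied by a battery capacity and a lifetime."""
    usable_mah = capacity_mah * compensation
    return usable_mah / (lifetime_days * HOURS_PER_DAY)


def lifetime_days(capacity_mah, avg_current_ma, compensation=0.8):
    """Lifetime (days) for a given average current draw (mA)."""
    usable_mah = capacity_mah * compensation
    return usable_mah / (avg_current_ma * HOURS_PER_DAY)


# Values quoted in the text: 10,000 mAh battery, 65.3 days and 225 days.
typical = average_current_ma(10_000, 65.3)    # about 5.1 mA
low_power = average_current_ma(10_000, 225)   # about 1.5 mA
```

Under these assumptions, the 2.15% duty cycle corresponds to an average draw of roughly 5.1 mA and the 0.49% duty cycle to roughly 1.5 mA.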

Cloud-based data retrieval and visualization framework
The final component of the data-to-user pipeline manages and visualizes data efficiently at the front-end. A server that aggregates data efficiently, manages it for timely storage and queries, and provides processed data to assist engineers with decision-making can contribute significantly to both long-term and rapid assessment processes. This section provides details on setting up this data repository.
We employed a Virtual Private Server (VPS) running Ubuntu 18.04.3 LTS x64 with 2 vCPUs, 4 GB of RAM, and 80 GB of disk space to host this cloud system. Figure 8 provides an overall view of this server. The server actively waits for data from the sensor network through an MQTT data broker. Once the broker receives the data, another MQTT client, which subscribes to the topic of the data, decodes and processes the raw data and then stores the results in the respective databases for further analysis. Finally, the processed data is presented and ready to be queried at the front-end of the web interface. The following subsections present more details on each of the three components.

Data acquisition through TCP/IP and MQTT
MQTT is a lightweight messaging protocol that runs over TCP/IP. The protocol is well suited to a low-power embedded system like the Xnode because it consumes little power and keeps the code footprint small. MQTT follows a publish/subscribe (pub/sub) scheme, in which clients publish (send) messages to, and subscribe to (receive) messages from, one or multiple topics. All clients are connected to a message broker, which receives the messages from the publishers and distributes them to the subscribers according to their subscribed topics (MQTT 2020). In our application, the sensor network plays the role of the publisher, and the data repository is the subscriber. The data broker is set up using Mosquitto, an open-source MQTT implementation written in C (Light 2017). The subscriber is set up using the open-source Eclipse Paho-MQTT implementation in Python to make use of the scientific computing libraries for post-processing (Eclipse 2020). The publisher on the gateway node is implemented using the C version of the same Eclipse Paho-MQTT library, which acts as an MQTT encoder for the sensor data before the data is published to the broker over TCP/IP using the 4G LTE modem.
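The pub/sub routing described above can be illustrated with a toy in-memory broker. This sketch uses only the Python standard library and is not Mosquitto or Paho-MQTT; the `ToyBroker` class is hypothetical, it matches topics exactly (no MQTT wildcards, QoS, or networking), and the payload bytes are made up.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for an MQTT broker (exact topic match only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        """Register a callback that receives every message on `topic`."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, payload):
        """Deliver `payload` to all subscribers of `topic`; return how many got it."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(topic, payload)
        return len(callbacks)


broker = ToyBroker()
received = []

# The data repository subscribes; the gateway publishes.
broker.subscribe("Monitoring Data", lambda topic, payload: received.append((topic, payload)))
delivered = broker.publish("Monitoring Data", b"\x01\x02\x03")
undelivered = broker.publish("Status Data", b"\x04")  # nobody subscribed to this topic
```

The publisher never needs to know which clients consume the data, which is what lets the gateway firmware stay small while the server side evolves independently.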

Data augmentation and storage
To ensure adequate performance in managing and serving all types of data collected from the monitoring system, a separate time-series database (TSDB) is required in addition to the popular relational DBMS MySQL, which handles non-sensor data (DB-Engines 2020). A TSDB offers several properties well suited to time-series data, including fast range queries, high write performance, data compression, scalability, and usability. Among the options, InfluxDB, an open-source schemaless database, is the most popular TSDB and is therefore chosen as our solution for time-series data (Naqvi et al. 2017). Both database systems are fed by a Python data parser that works as an MQTT subscriber. Any data set obtained from the sensor network is passed to the parser, which separates it into several groups:
Original Data: General information about the network and sampling process.
Processed Data: Data obtained by applying pre-processing steps to the raw data (e.g., sensor orientation, Global Positioning System (GPS) coordinates).
Bridge Data: Information about the instrumented bridges.
Sensor Data: Time-series measurement data, stored in a separate InfluxDB database.
API Data: Underlying operational and environmental effects are known to contribute to the performance of any infrastructure, so having this information in the database is crucial for the evaluation and decision-making process (Farrar 2001). Information from other sources is therefore requested through an Application Programming Interface (API) to augment the original data set, providing more insight for the analysis process.
A separate schema is also set up to manage network condition information. A Python script routes the node information and condition (voltage and current readings) from both the train-crossing data and the periodic network checkup into this schema. Figure 9 shows an example where the data parser, working as a subscriber to the "Monitoring Data" and "Status Data" topics, distributes the collected information while augmenting it with environmental data requested from the API.
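A parser of this kind could be sketched as follows. This is an illustration only: the paper does not specify the message encoding, so the JSON payload, the field names (`node`, `voltage`, `current`, `fs_hz`, `start`, `samples`), and the table names are all assumptions. Only the two topic names come from the text.

```python
import json


def parse_message(topic, payload):
    """Split one message into relational rows and time-series points.

    Returns (rows, points): rows are (table_name, record) pairs intended for the
    relational database, points are records intended for the time-series database.
    """
    msg = json.loads(payload)
    rows, points = [], []
    if topic == "Status Data":
        # Periodic network checkup: node condition only.
        rows.append(("node_status", {
            "node": msg["node"],
            "voltage": msg["voltage"],
            "current": msg["current"],
        }))
    elif topic == "Monitoring Data":
        # Train-crossing event: sampling metadata plus the time series itself.
        rows.append(("original_data", {
            "node": msg["node"],
            "fs_hz": msg["fs_hz"],
            "start": msg["start"],
        }))
        dt = 1.0 / msg["fs_hz"]
        for i, value in enumerate(msg["samples"]):
            points.append({
                "node": msg["node"],
                "time": msg["start"] + i * dt,
                "accel": value,
            })
    else:
        raise ValueError(f"unexpected topic: {topic!r}")
    return rows, points


example = json.dumps({"node": 3, "fs_hz": 100, "start": 100.0, "samples": [0.1, 0.2, 0.3]})
rows, points = parse_message("Monitoring Data", example)
```

Keeping the metadata and the samples in separate return values mirrors the split between the relational database and the time-series database.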

Web Interface data visualization
A web interface that grants ubiquitous access to the data is crucial for supporting direct access to the data and to any immediate online structural assessment results. This platform is built on a web server that uses Flask, a micro web framework written in Python (Flask 2020). The web server has direct access to the MySQL and InfluxDB databases through the MySQL connector and the InfluxDB library. Users can therefore query the data on the web by selecting a row in the database table. The time record is presented in graph and map format, showing both time-history and location data in response to the queries. In addition to the monitoring data, the network condition, consisting of voltage and current measurements, is also presented so that engineers with access to the web server can check the last known state of the network. The web server is hosted using the Apache HTTP Server. Figures 10 and 11 show examples of the applications provided on the web interface.

Validation of the monitoring system
To demonstrate the efficacy of the monitoring system, we installed sensors on nine timber trestle and two steel truss railroad bridges. On each timber trestle bridge, typically 2-3 sensors are installed on the tall piers, where vibration is more noticeable. On a typical steel truss bridge, 6-8 sensors are installed on the two sides. To keep the 10,000 mAh lithium-ion battery charged, each sensor node is connected to a 3 W polycrystalline solar panel, and each gateway node uses a 10 W polycrystalline solar panel because of its higher workload. The sensor network follows a star topology, in which each sensor node communicates only with the gateway node. A time synchronization strategy for event-triggered monitoring applications is implemented in the system following Fu et al. (2020). However, because the engineers in this type of deployment are more interested in the movement of individual bridge piers, the sensing and data synchronization methods are not utilized. Traffic frequency and cellular signal quality are the two main criteria for choosing the instrumented bridges. Bridges with a wide range of traffic frequencies are instrumented to evaluate the adaptability and flexibility of the system. In addition, to make use of the low-delay data acquisition capability, the researchers considered only locations with adequate signal quality (based on network coverage maps and preliminary assessments).
On average, while following all safety guidelines from the railroads, the installation process for one wireless node on a timber bridge took 10 min for pre-deployment sensor checking, mounting plates installation, and sensor node installation. The same process for steel bridges only took 5 min because the sensors were mounted directly on the steel members, so no mounting plate was required.
Fig. 10 Direct data-to-user interaction on the web interface (The current User Interface (UI) is a research prototype for an ongoing work of a more streamlined and feature-rich version)
Fig. 11 Network condition visualized on the web page (The current UI is a research prototype for an ongoing work of a more streamlined and feature-rich version)
The validation monitoring campaign was organized into four deployments between September 2018 and December 2019. The goal of the validation campaign was to verify the system reliability, data quality, and framework robustness. The following subsections focus on one bridge monitoring deployment that is representative of the whole validation campaign.

Instrumented bridge
The bridge, owned by Canadian Railway, is located in Marion, Illinois, USA. The highest pier is 29 ft tall, and the main span of the bridge is 130.67 ft long. This campaign lasted 2 months, from December 2018 to January 2019. Based on the preliminary test results, the system is configured so that the sensor nodes wake up when the acceleration exceeds a threshold of 80 mg, and the data is sent back to the web server within 5 min (that is, T2 equals 5 min). Figure 12 shows a photo of the instrumented bridge, with two sensor nodes visible. Solar panels are installed at a tilt angle to avoid snow and dirt accumulation.

Data collecting and post-processing goals
From our communication with the bridge engineers, as well as from the literature, one of the most useful metrics that the bridge monitoring system could provide is dynamic displacement, together with an assessment of that measurement. To achieve this goal, the sensors are set up so that the acceleration collected at 1 kHz is decimated down to 100 Hz to reduce computation time. Because of the campaign-type nature of the deployments, the exact orientations of the sensors are neither maintained from one deployment to the next nor carefully measured at the site. An orientation compensation technique is therefore used to transform the local sensor coordinates to the global bridge coordinates and obtain the correct measurements (Cho et al., 2014). The 100 Hz orientation-adjusted acceleration measurement is then filtered to obtain dynamic displacement, following the Finite Impulse Response (FIR) filter-based reference-free displacement estimation algorithm developed by Gomez et al. (2018). Both the decimation and filtering processes are performed onboard in real time during sampling, utilizing the CMSIS DSP software library for the Arm Cortex-M4F (Keil.com, 2017). The acceleration data and dynamic displacement estimates are then sent to the gateway node and on to the server in less than 5 min after the train-crossing event. For this deployment, the acceleration data is kept for future research purposes. The system can also save raw measurement data on the local nodes and transfer only processed data and higher-level assessments to the end-user.
Fig. 12 Photo of the instrumented bridge
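The decimation step from 1 kHz to 100 Hz can be sketched in plain Python as a low-pass FIR filter followed by downsampling. This is an illustration of the general technique, not the onboard CMSIS DSP implementation and not the displacement filter of Gomez et al. (2018); the 101-tap Hamming-windowed design and the 40 Hz cutoff are assumptions chosen only to keep content below the 50 Hz Nyquist limit of the output rate.

```python
import math


def lowpass_fir(num_taps, cutoff_hz, fs_hz):
    """Hamming-windowed sinc low-pass filter, normalized to unity gain at DC."""
    fc = cutoff_hz / fs_hz           # normalized cutoff (cycles per sample)
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            sinc = 2 * fc
        else:
            sinc = math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(sinc * window)
    total = sum(taps)
    return [t / total for t in taps]


def decimate(samples, factor, taps):
    """Filter and keep every `factor`-th output (input is zero-padded at the start)."""
    out = []
    for i in range(0, len(samples), factor):
        acc = 0.0
        for k, h in enumerate(taps):
            j = i - k
            if j >= 0:
                acc += h * samples[j]
        out.append(acc)
    return out


FS_IN = 1000.0   # Hz, raw sampling rate stated in the text
FACTOR = 10      # 1 kHz -> 100 Hz
taps = lowpass_fir(101, 40.0, FS_IN)  # assumed filter design

# A 400 Hz tone would alias at 100 Hz output rate if it were not filtered out first.
tone = [math.sin(2 * math.pi * 400.0 * n / FS_IN) for n in range(1000)]
decimated = decimate(tone, FACTOR, taps)
```

Computing the filter output only at the retained sample positions avoids nine out of ten multiply-accumulate passes, which is the main reason decimating FIR routines are attractive on a microcontroller.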

Monitoring system reliability assessment
As a preliminary assessment, the researchers need to understand the reliability of the system in field-deployed operation. The main focuses are the train-crossing event detection, the energy usage of the sensor and gateway nodes, and the delay of the data retrieval process. To assess energy usage, the voltage readings are extracted from the data sets, which contain acceleration measurements, timestamps, and node condition (voltage and current readings). Weather information that helps interpret the readings is added later by the data augmentation process.
The SHM system recorded 944 data sets, resulting in meaningful data for 419 train-crossing events. The battery readings remained above 3.5 V, which indicates working condition. Furthermore, an average reading of 90% battery charge (3.82 V on average, with 4.0 V at full charge) was recorded, demonstrating efficient energy use. Figure 13 shows the battery voltage of one typical sensor node during deployment in winter weather conditions. The temperature is shown in degrees Fahrenheit, and the sunlight level represents how much sunlight the solar panels were exposed to: a higher level (0.75-1) means that the solar panels were exposed to more sunlight and thus harvested more solar energy, and a lower level (0-0.25) means that virtually no solar energy was harvested. Toward the end of the deployment period (day 42 until day 60), limited sunlight combined with sub-freezing temperatures prevented battery recharging, resulting in a significant drop in the voltage reading.
Fig. 13 Voltage reading during the 2 months of winter deployment
Because one of the highest priorities of the system is rapid retrieval of and access to the data, the timestamps of both the event and the data upload are recorded for the evaluation. In 81.51% of the cases, the data is retrieved within the first round of communication (in less than 5 min). The remaining cases, nearly 20%, are not retrieved on time because of unreliable radio communication. However, in 99.58% of the cases the data is successfully retrieved within the second round of communication (Fig. 14).
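Statistics of this kind can be derived from the two timestamps per event. The sketch below assumes that a "round" is one gateway wake-up period of T2 = 5 min, so that a delay of up to 5 min counts as the first round and a delay of up to 10 min as the second; this interpretation, the function names, and the example values are assumptions and not the measured data.

```python
import math


def retrieval_round(event_time_s, upload_time_s, t2_s=300):
    """Communication round in which the data arrived (1 = within the first T2)."""
    delay = upload_time_s - event_time_s
    if delay < 0:
        raise ValueError("upload cannot precede the event")
    return max(1, math.ceil(delay / t2_s))


def share_within(rounds, max_round):
    """Fraction of events retrieved in `max_round` rounds or fewer."""
    return sum(r <= max_round for r in rounds) / len(rounds)


# Hypothetical (event, upload) timestamps in seconds.
pairs = [(0, 120), (1000, 1290), (2000, 2420), (3000, 3900)]
rounds = [retrieval_round(event, upload) for event, upload in pairs]
first_round_share = share_within(rounds, 1)
second_round_share = share_within(rounds, 2)
```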

Monitoring result
Records of pier cap acceleration and the subsequently estimated displacement are obtained from the WSS (Fig. 15). For a preliminary assessment of the structural condition, control chart analysis using Statistical Process Control (SPC) is applied to the peak dynamic displacement measurement. This process aims to give researchers and engineers insight into the overall structural performance in an intuitive format.
Assuming that the monitored process follows a normal distribution, variation in the mean and variance of the process can indicate a change in the structural condition (Sohn et al. 2000). Control chart analysis is one of the most widely used SPC techniques in automated systems. The general assumption is that when the system exhibits an anomaly, the mean and variance of the monitored features also change. The X-bar control chart is employed in this study to monitor the change in the selected performance feature based on the inconsistency between the training and testing data sets. Based on the centerline (CL), given by the mean value X̄, and the process standard deviation σ, the upper and lower control limits (UCL and LCL) of the SPC are defined as:
\[ \mathrm{UCL} = \bar{X} + N\sigma, \qquad \mathrm{CL} = \bar{X}, \qquad \mathrm{LCL} = \bar{X} - N\sigma \]
In this study, N is selected as 3, corresponding to a 99.7% confidence interval between the two control limits. Data from days 1 to 10 are used for training, and the rest are used for validation. Once the expected value and confidence interval are established, outlier tests are used as the triggering method to detect anomalies. Various types of tests have been proposed as triggering conditions (Nelson 1984). In this study, data points lying outside the three-sigma control limits are used to indicate an alarming structural change. More specifically, when the maximum lateral dynamic displacement deviates from the expected value of the first 10 days by more than three sigma, the condition of the bridge should be inspected more closely. The objective of this preliminary test is to detect any potentially harmful structural behavior while limiting false-positive alarms.
Fig. 14 Histogram of data retrieval delay
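The three-sigma test described above can be sketched in a few lines of Python. The sketch assumes the limits are the training mean plus or minus N times the sample standard deviation of the training data; the function names and the displacement values are illustrative, not measurements from the deployment.

```python
from statistics import mean, stdev


def control_limits(training, n_sigma=3.0):
    """Return (LCL, CL, UCL) estimated from the training data."""
    cl = mean(training)
    sigma = stdev(training)  # sample standard deviation (an assumption)
    return cl - n_sigma * sigma, cl, cl + n_sigma * sigma


def out_of_control(values, lcl, ucl):
    """Indices of the observations that fall outside the control limits."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]


# Hypothetical peak displacements: training period first, then monitoring period.
training = [1.0, 1.2, 0.8, 1.1, 0.9]
monitoring = [1.0, 1.6, 0.4, 1.3]

lcl, cl, ucl = control_limits(training)
alarms = out_of_control(monitoring, lcl, ucl)
```

With these made-up values, the second and third monitoring observations fall outside the limits and would prompt a closer inspection, while the others count as normal variation.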
From the control chart analysis result in Fig. 16, the displacement is shown to be stable within the control limits over the 2 months of deployment, signaling no significant and noticeable change to the structural condition. From the record of annual bridge assessment, this bridge is shown to be in good condition and requires no urgent replacement or maintenance, which agrees with the preliminary assessment from the system. As structural assessment and data analytics are not the main focuses of this paper, the researchers only use this result as a very first step to validate the performance of the autonomous end-to-end monitoring system. Future work will present the implementation of more advanced data analysis methods for structural condition assessment.

Conclusions
In this paper, a WSS system is proposed to overcome the existing challenges of SHM of railroad bridges, including 1) the short duration and random nature of train-crossing events, 2) limited energy, and 3) the impracticality of manual data retrieval and post-processing. The researchers adopt a low-power monitoring system capable of detecting and recording sudden events to address the first and second obstacles. The third challenge is addressed by combining this system with a newly developed 4G LTE-enabled gateway node. A hybrid adaptive framework is then established to achieve low delay in data retrieval and uploading. The framework ensures robust communication through regular synchronization, checkup, and reporting, focusing on autonomous and low-power operation. Finally, a web server for data distribution, processing, and visualization is introduced. In sum, the newly proposed system, with a wake-up sensor, a real-time clock, 4G LTE network access, and a flexible framework, is shown to operate autonomously and successfully in campaign-type validation deployments. The WSS SHM system provides reliable on-demand train-crossing detection, high-fidelity data recording, low-delay data retrieval, and accessible database access using a cloud-based platform for computing and management. With all these improvements, this system is expected to be an essential tool for bridge engineers and decision-makers regarding railroad maintenance, repair, and replacement.
Fig. 16 Control chart analysis of maximum displacements of three axes shows stable behavior over 60 days