
Review of data quality indicators and metrics, and suggestions for indicators and metrics for structural health monitoring


Structural Health Monitoring (SHM) systems have been extensively implemented to deliver data support and safeguard structural safety in the context of structural integrity management. SHM relies on data that can be abundant but noisy, or scarce. Little work has been done on SHM data quality (DQ). Therefore, this article suggests SHM DQ indicators and recommends deterministic and probabilistic SHM DQ metrics to address uncertainties. This will allow better decision-making for structural integrity management.

Therefore, first, the literature on DQ indicators and measures is thoroughly examined. Second, and for the first time, necessary SHM DQ indicators are identified, and their definitions are tailored.

Then, simplified deterministic SHM DQ metrics are suggested and, more importantly, probabilistic metrics are offered to address the embedded uncertainties and to account for the data flow.

A generic example of a bridge with permanent and occasional monitoring systems is provided. It helps to better understand the influence of SHM data flow on the choice of DQ metrics and allocated probability distribution functions. Finally, a real case example is provided to test the feasibility of the suggested method within a realistic context.

1 Introduction

As information technologies emerge, large amounts of data are generated and need to be managed - collected, transmitted, and processed - to support decision-making. Data quality (DQ) assurance plays a crucial role in guaranteeing that the decision is cost-effective. Conversely, poor DQ may lead to decisions that negatively impact the performance of the managed system and the optimal management of available resources.

Data and information are the pillars of SHM. The process is intended to transform data collected on a structure to support the decision-making process for the selection of the optimal management actions across its lifecycle (e.g., maintenance or emergency management). The optimal action is usually intended as cost-efficiency in terms of safety, serviceability, and sustainability.

The first step in the DQ assurance in general, and SHM data in this paper, is the definition of the data attributes (i.e., “indicators” or “dimensions”) that define their “quality”. Once the attributes are identified, their impact on the decision-making process needs to be verified and possible ways to improve the quality of data and information can be determined.

Several authors have been working on DQ and data management in financial and economic environments for some time. Very few authors have dealt with SHM DQ, and almost all of them focused on assessing the DQ of sensors. Studies of financial and economic data, detailed in Sect. 2, suggest indicators that describe each attribute of DQ, as well as metrics to quantify these indicators. Most of the proposed measures are defined deterministically, except for Heinrich and Klier (2015) and Heinrich, Hristova, et al., (2018), which provide probabilistic metrics for timeliness and consistency respectively (Sect. 2.2). One reason is that the focus of these studies has been on deterministic data that are not (or barely) affected by uncertainty.

This paper focuses on SHM data and information where a probabilistic approach is more appropriate to account for all uncertainties involved in managing them. As in Pipino et al. (2002), the terms data and information are considered interchangeable and not exclusive. For example, the assessed SHM set contains simple data (e.g., date of the recording, recording of the ambient noise, etc.) and information (e.g., fundamental frequencies, etc.).

In this article, for the first time, necessary quality indicators are identified, and definitions are tailored to the specific case of SHM data. Then, simplified deterministic SHM metrics are suggested and, more importantly, probabilistic metrics are proposed to account for uncertainties and data flow.

For this purpose, Sect. 2 details the indicators proposed in the literature. Section 3 then adapts the definitions of the quality indicators identified for SHM data. Section 4 addresses the state of the art of metrics. Section 5 offers the DQ deterministic and probabilistic metrics which account for uncertainties and the data flow. Section 6 presents a generic example, then calculates the deterministic metrics, and recommends the probabilistic metrics for the specific case. Finally, Sect. 6 concludes.

2 Review of data quality indicators

Heink and Kowarik (2010) investigated the indicator term and noted that “it has a synonym for ‘indicans’, i.e., a measure or component from which conclusions on the phenomenon of interest (the indicandum) can be inferred. Indication here is the reflection of an indicandum by an indicator”. In this paper, the term indicator indicates a DQ attribute, and the term metric indicates a measure of this attribute that allows its quantification. Furthermore, the terms indicator and attribute are used interchangeably. DQ is of interest because it influences the decision-making process of the information user (e.g., the owner of a monitored bridge) and the information provider (e.g., the consulting company that provides the SHM service). In general, the more confident the user is in the data, the more likely they are to use it (and/or buy it) to support their decisions, and, consequently, the benefits of the data increase.

Since the seventies, many researchers have focused on describing DQ. Some general and preliminary ideas were put forward in Hoare (1975) and Chapple (1976). They briefly examine data reliability issues and then summarize some of the conceptual and methodological tools available to address them. However, DQ indicators only started to take shape in the eighties (Brodie 1980). One of the first studies on DQ indicators is Ballou and Pazer (1985). The most prominent classifications of attributes were proposed in the nineties by Fox et al. (1994), Wang and Strong (1996), Redman (1996), Jarke et al. (1995), and Ballou et al. (1998). Fox et al. (1994) discussed the notion of data and detailed quality attributes such as currentness, age, and timeliness; completeness and duplication; consistency and integrity. Wang and Strong (1996) and Ballou et al. (1998) suggested an interesting classification to capture different DQ aspects. They used a two-stage survey to identify DQ attributes perceived by data consumers. This was done by first listing 179 attributes that capture the consumer perspective on DQ. The list was subsequently narrowed to 15 attributes by merging synonyms. The authors then clustered the 15 indicators into 4 categories of DQ: intrinsic, contextual, representational, and accessibility. Intrinsic DQ includes attributes that can be assessed independently from the context (accuracy, believability, objectivity, reputation). Contextual DQ contains attributes that must be considered within the context of the task at hand (value-added, relevancy, timeliness, completeness, appropriate amount of data). Representational DQ embraces aspects related to the format and understandability of data (representational consistency, concise representation, interpretability, ease of understanding). Accessibility DQ describes the degree to which data is accessible but secure (accessibility, access security).
This classification was detailed further in Batini and Scannapieco (2016) which organized the attributes into two main classes defined as “inherent” and “system dependent”. The first includes the intrinsic category, the second incorporates the other three categories defined by Wang and Strong (1996).

Subsequently, Pipino et al. (2002) introduced the concept of subjective and objective DQ assessment, proposing a method for combining them. The subjective assessment is based on subjective perceptions of data by individuals. Thus, if data users perceive subjectively that the data is poor, then this will influence their behavior (e.g., they will not buy and use the data). The objective assessment is based on the data set in question. It can be task-independent (i.e., applied to any data set, regardless of the tasks at hand) or task-dependent (i.e., developed in specific application contexts). They defined 16 DQ attributes and proposed functional forms to develop their metrics with a multidimensional approach to account for multiple DQ attributes.

The classification originally introduced by Wang and Strong (1996), and detailed further in Batini and Scannapieco (2006), was later refined by Färber et al. (2017). They include consistency and verifiability in the list and extend the accessibility category through the attributes license and interlinking. Besides, they followed the steps of Wang and Strong (1996), used indicators such as relevancy, and grouped several attributes under a single indicator. For example, trustworthiness is introduced to encompass believability, objectivity, and reputation; completeness is assumed to include an appropriate amount of data and value-added. Finally, interoperability (i.e., concerning machine-readability) encompasses interpretability (i.e., the extent to which data are in an appropriate language and units and data definitions are clear).

Heinrich et al. (2007b), Heinrich and Klier (2009), Heinrich, Klier, et al., (2018), and Heinrich, Hristova, et al., (2018) deal with the DQ impact on decision making. In this perspective, DQ indicators and metrics are important for several reasons. First, they can indicate to the decision-makers to what extent they can rely on the data they use to support their decisions. Second, they can support cost-efficient data management, i.e., data should be acquired only if its benefit outweighs the associated cost. And third, they can indicate how to improve the DQ. The quality attributes considered relevant in this framework by the authors are timeliness, completeness, reliability, correctness, and consistency. Heinrich et al. (2007b) focus on the attributes of correctness and timeliness, and Heinrich and Klier (2009) refine the timeliness definition to account for the availability of supplemental data.

Several authors, such as Fox et al. (1994), Behkamal et al. (2014), and Fürber and Hepp (2010), introduce the attribute duplication and define it as the double presence of data that may jeopardize the decision-making process. In this paper, duplication is replaced by redundancy, which is defined as the availability of several datasets that can be used as a backup in the event of data losses. In the SHM context, redundancy is a very important indicator, as it ensures that many data sets are available for the decision problem.

The studies examined above have been conducted on financial and economic sector data. Few authors have focused on monitoring data. Gitzel (2016) studied DQ in time series. The considered indicators are completeness, free-of-error, plausibility, and richness of information. To automatically control the DQ in a wireless sensor network, Jianwen and Feng (2015) consider the indicators timeliness, consistency, incompleteness, anomaly, and redundancy.

All indicators reviewed in this article are gathered in Table 1 and categorized according to the classification proposed by Wang and Strong (1996). Table 1 first column reports the indicator name. The second shows the definitions of the indicator as suggested in the reference given in the third column. The “extent to which” and the “degree to which” at the beginning of definitions were replaced by (-) and (_) respectively.

Table 1 Data quality indicators

3 Selection of SHM data quality indicators

Selecting the DQ attributes most relevant to the problem at hand can be performed in various ways. Wang and Strong (1996) specified that the approaches used in the literature to select DQ indicators may be classified as: (1) intuitive, (2) empirical, and (3) theoretical (also termed analytical).

  1) The intuitive approach consists of selecting the attributes specific to the case at hand, based on the “researchers’ experience or intuitive understanding about what attributes are important”. It is the most frequently used approach.

  2) The empirical approach, followed by Wang and Strong (1996), relies on collecting DQ attributes from data consumers. This approach is rarely used.

  3) The analytical (theoretical) approach, seldom used as well, focuses on how data can become deficient during the data manufacturing process. Wand and Wang (1996) used an ontological approach as the analytical approach; its basics are explained in Wand and Weber (1990).

Lately, Rodríguez and Servigne (2013) focused on sensor data and suggested three data layers: acquisition, processing, and utilization layers. Then, for each layer, a different set of relevant DQ indicators is offered. Namely, the acquisition layer encloses accuracy, spatial precision, reliability, completeness, and communication reliability; the processing layer embeds consistency, currency, and volatility; and the utilization layer englobes timeliness, availability, and adequacy.

In this article, like most researchers, the intuitive approach is used to select the SHM DQ indicators.

By “intuitive”, we mean “subjective” i.e., subjectivity in the choice of the indicators and not the analysis that follows. It means that the indicators to be used in this study were chosen from the list of indicators in the literature in a subjective way, as per the demand of the SHM case and expert opinion (as noted before, this procedure was followed by most researchers). Other experts might choose, based on the needs of their study, to focus on different indicators or select some of the available ones from the list we offered. However, we have tried to preserve the most variety possible.

Moreover, we note that this subjective selection is limited to the choice of indicators and does not extend to the assessment of the indicators’ metrics (in the following sections), and to extracting the information from the data, and structural analysis modeling.

Finally, we note that those indicators relate to both the sensor data and the extracted information as well. However, we are studying the quality of the data and information and not the quality of the structural analysis model. Other researchers can apply the same method for other measurement tests in different case studies or use similar approaches to study the quality of the structural analysis model.

After selecting subjectively the SHM DQ indicators, those indicators are then classified for the layers suggested by Rodríguez and Servigne (2013). Therefore, to select our indicators, the following steps are done: (1) The state of the art is analysed; (2) the indicators are clustered; (3) the indicators are classified.

  1) Analysis of the state of the art (SoA)

    Analysis of the SoA of DQ indicators (Sect. 2.1) shows that there are several divergences in defining the attributes due to the contextual nature of the quality, as also remarked by Batini et al. (2009). However, several basic indicators and definitions are included in all classifications. Those indicators are accuracy, completeness, consistency, and timeliness. In Table 1, several different terms are associated with each of these indicators. In some cases, terms are synonyms and are used in different contexts. In other cases, they are representative of various aspects of the indicator. For example, accuracy and precision may be considered as different aspects of correctness. Trustworthiness (Färber et al. 2017) may encompass believability, objectivity, and reputation; completeness may include the appropriate amount of data and value-added; interoperability may contain interpretability.

  2) Clustering of the indicators

    Using this approach, indicators with similar meanings were identified and clustered. Therefore, the indicators free of error, integrity, and reliability are clustered under correctness. Believability, objectivity, reputation, and traceability are grouped under trustworthiness. Concise representation, interpretability, ease of manipulation, and understandability are assembled under interoperability. License is clustered under accessibility. The appropriate amount of data, value-added, and richness of information are gathered under completeness. Compliance is represented under consistency. Currentness, age, and expiration are clustered under timeliness. Finally, duplication is replaced by redundancy.

    For each of these clusters, one of the indicators (e.g., trustworthiness, timeliness) was chosen as representative and comprehensive, and the other indicators were therefore excluded from the list. Then, the indicators that describe different aspects of the same attribute (for example accuracy, precision, consistency, and correctness) were grouped together. Next, one was chosen as representative of the attribute (correctness in the example), and the others are considered sub-indicators. In some cases, the definition of the indicator has been adapted from Table 1 to express its wider meaning, which also includes that of the sub-indicators. This procedure resulted in the selection of six indicators, and relevant sub-indicators reported in Table 2 along with the selected definitions.

  3) Classification of the indicators

    Finally, taking as a basis and extending the work of Rodríguez and Servigne (2013), the six indicators were classified according to three layers of data management: acquisition, processing and sharing, and supporting decisions. This classification (presented in column 1 of Table 2) highlights aspects of DQ that are more relevant to each phase of data management.

Table 2 Data quality indicators and sub-indicators

In this article, the most complete list of indicators possible is offered. Some indicators, such as interoperability and traceability, will sometimes be of little interest or will not greatly influence the decision-making except in very specific situations. For this reason, weights are introduced in Sect. 5.1.1. For each context, decision-makers must choose what is of interest to their situation. Moreover, the decision-maker can set those weights to zero, or treat the corresponding indicators as prerequisites and not use them at all.

The indicators suggested are related to the monitoring system and the structural health monitoring system because they tackle indicators for both the data and information (Sect. 1). For example, some of the variables considered were the recording of the ambient noise (i.e., the data) and the extracted frequency (i.e., the information) which is a structural parameter. The indicators can also be used for other inspection methods. Moreover, those indicators are used in further studies for the decision-making of structural health monitoring systems.

Finally, some indicators such as redundancy and precision were given definitions more suitable for the SHM context. For example, precision is understood as how close the measurements of the same parameter or variable are to each other, while redundancy is understood as the availability of several experiments/measurements for the same parameter or variable.

Moreover, other indicators can be considered in the future, if needed, such as indicators related to the extraction of information from existing data, or indicators for the lack of data of interest (or incomplete data of interest).

4 State of the art of the metrics

In this section, a brief survey of the metrics proposed in the literature for the DQ indicators selected in Table 2 is reported. Deterministic and probabilistic approaches to defining DQ metrics were proposed in the literature.

4.1 Deterministic approach

The deterministic metrics are used broadly and mainly to assess DQ. The approach can be divided into general metrics methods to assess the indicators and specific metrics for some indicators.

4.1.1 General metrics method for the indicators

Pipino et al. (2002) presented interesting deterministic approaches for determining the DQ metrics. They offered three functional forms: simple ratio, min or max operation, and weighted average. Those approaches were later utilized and/or adapted as appropriate by Färber et al. (2017) and Vetrò et al. (2016).

Simple ratio is the ratio between the number of data values that possess the considered attribute (e.g., accurate values) and the total number of data values. This definition is applied for the metrics of indicators that have a unique definition such as free of error, completeness, consistency, concise representation, relevancy, and ease of manipulation. Färber et al. (2017) and Vetrò et al. (2016) adopted a simple ratio approach to defining the metrics, which thus have values in the range of 0 to 1. For example, Vetrò et al. (2016) proposed, for data organized in Excel sheet format, that metrics be defined as the percentage of data over the dataset available in the sheet. An accuracy (or timeliness) value of 0.7 means that 70% of the data is accurate (timely), whereas values of 0 and 1 mean that all data are inaccurate (untimely) or accurate (timely), respectively.
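The simple ratio can be illustrated with a minimal Python sketch. The function name `simple_ratio`, the sample readings, the reference value, and the tolerance are all hypothetical and chosen only for illustration; the cited works do not prescribe how an individual value is judged accurate.

```python
from typing import Callable, Sequence


def simple_ratio(values: Sequence[float],
                 has_quality: Callable[[float], bool]) -> float:
    """Fraction of values that satisfy the quality predicate (between 0 and 1)."""
    if not values:
        raise ValueError("the simple ratio is undefined for an empty dataset")
    return sum(1 for v in values if has_quality(v)) / len(values)


# Hypothetical frequency readings (Hz); a reading is treated as accurate when
# it lies within an assumed tolerance of an assumed reference value.
readings = [2.01, 1.98, 2.40, 2.02, 1.60, 2.00, 1.99, 2.03, 2.55, 2.00]
reference, tolerance = 2.0, 0.05

accuracy = simple_ratio(readings, lambda v: abs(v - reference) <= tolerance)
print(f"accuracy = {accuracy:.2f}")  # 7 of the 10 readings are within tolerance
```

The same function can serve any indicator with a unique definition by swapping the predicate, for instance a check that a value is not missing for completeness.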

In other data types, a discrete definition of the metrics can be more appropriate. Färber et al. (2017) proposed discrete DQ metrics for knowledge graphs (KGs) such as DBpedia and Freebase, in order to compare different KGs and find the most suitable one for a given setting. For example, for “Trustworthiness on statement level” (one of the trustworthiness metrics), they suggested a value of 1 if provenance on statement level is used, 0.5 if provenance on resource level is used, and 0 otherwise.

Similarly, Rodríguez and Servigne (2013) proposed a simplified method in which the quality of each indicator is assessed using subjective scores (e.g., 0 = very low; 0.25 = low; 0.50 = medium; 0.75 = high; 1 = very high).

Min or max operations are used for indicators that can be defined by several attributes such as believability (i.e., believability concerning a common standard, believability of the data source, or believability based on experience). In these cases, the metrics corresponding to the different attributes are normalized to make them comparable. Then a min or max operator is applied to compute a comprehensive metric of the indicator. The min operator is conservative: the metric corresponds to the quality of the weakest attribute. The max operator is usually used for time-related indicators (such as timeliness and accessibility) to exclude negative values of the indicator that correspond to data prior to their validity period. In such cases, the metric is assumed to be zero. An example of a metric as minimum value is the one used for the indicator “appropriate amount of data”. It is defined as the minimum between the ratio of the necessary data to the data provided and its inverse (Pipino et al. 2002).
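A short Python sketch of the min operation is given here as an illustration. The function names and the numbers are hypothetical; only the rule "minimum of the ratio and its inverse" and the conservative use of the minimum over normalized sub-metrics come from the description above.

```python
from typing import Sequence


def appropriate_amount(needed: int, provided: int) -> float:
    """Minimum of needed/provided and its inverse; equals 1 only when both match."""
    if needed <= 0 or provided <= 0:
        raise ValueError("both counts must be positive")
    return min(needed / provided, provided / needed)


def conservative_metric(normalized_scores: Sequence[float]) -> float:
    """Min operator: the indicator is as good as its weakest normalized aspect."""
    if not normalized_scores:
        raise ValueError("at least one score is required")
    return min(normalized_scores)


# Too little data and too much data are penalized symmetrically.
too_few = appropriate_amount(needed=100, provided=80)
too_many = appropriate_amount(needed=100, provided=125)

# Hypothetical normalized believability aspects: common standard, source, experience.
believability = conservative_metric([0.9, 0.6, 0.8])
```

With these illustrative inputs, both `too_few` and `too_many` evaluate to 0.8, and `believability` is driven by the weakest aspect (0.6).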

The weighted average is an alternative to the min/max operator where multiple aspects of the attribute need to be combined. The weights describe the importance of the different aspects of the DQ attribute (Pipino et al. 2002). However, this requires a good understanding of the value of each variable in the overall assessment of a dimension. Hence, the weights are best assigned by an expert, who gives each aspect a weighting factor between zero and one and makes sure that the weights sum to one. This operator can be used, for example, to weigh the various aspects of the believability mentioned above differently.

The min/max and weighted average operators may be useful, not only to define the metrics for individual DQ attributes but also to combine different attributes (for example, the sub-indicators in Table 2, to obtain an appropriate indicator metric).

4.1.2 Specific metrics for some of the indicators

Some metrics, such as completeness and timeliness, received a great deal of attention because of their importance to the DQ assessment and therefore to the decision-making process that follows.


Completeness is defined as the extent to which all required data is available within the dataset. In Blake and Mangiameli (2011), the completeness of a dataset of size N is defined as the simple ratio between the number of values that are not missing (N-Nm) and the total number of values N:

$$C=\frac{N-{N}_{m}}{N}$$

where Nm is the number of missing data in the dataset.


Another example is timeliness, which Ballou et al. (1998) define in terms of “currency” and “volatility” as the maximum between 0 and one minus the ratio of currency to volatility, raised to the power s:

$$\text{Timeliness}={\left\{\text{max}\left[\left(1-\frac{currency}{T}\right), 0\right]\right\}}^{s}$$

where currency indicates the age of the data once it becomes available to the user. T is the volatility (or shelf-life) that represents the total duration for which data remain valid. The exponent s is case-dependent and controls the sensitivity of the metric to the ratio between the currency and volatility. When the age of the data exceeds the shelf-life, the term in parentheses becomes negative, and the metric is set to 0.
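As an illustration, the deterministic timeliness metric can be sketched in Python, assuming the form max(0, 1 - currency/T) raised to the power s that is commonly attributed to Ballou et al. (1998). The function name and the numerical values are hypothetical.

```python
def timeliness(currency: float, volatility: float, s: float = 1.0) -> float:
    """Deterministic timeliness: max(0, 1 - currency / volatility) ** s.

    currency   -- age of the data when it reaches the user
    volatility -- shelf-life T, i.e. how long the data remain valid
    s          -- case-dependent sensitivity exponent (assumed positive)
    """
    if volatility <= 0:
        raise ValueError("volatility (shelf-life) must be positive")
    if s <= 0:
        raise ValueError("the exponent s is assumed to be positive")
    return max(0.0, 1.0 - currency / volatility) ** s


fresh = timeliness(currency=2.0, volatility=10.0)           # data used early in its shelf-life
expired = timeliness(currency=12.0, volatility=10.0)        # age exceeds shelf-life, clipped to 0
sensitive = timeliness(currency=5.0, volatility=10.0, s=2)  # s > 1 penalizes ageing more strongly
```

Choosing s above 1 makes the metric drop faster as data age, while s below 1 makes it more tolerant; the appropriate value depends on the decision context.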

4.2 Probabilistic approach

The deterministic approach is appropriate where the managed data represent phenomena unaffected by uncertainty. When the data is affected by uncertainty, then the probabilistic approach provides more suitable metrics to account for it. For example, when the data is a measurement of physical quantities, such as SHM measurements, probabilistic metrics are more suitable. In the literature, probabilistic metrics were identified mainly for the two indicators timeliness and consistency.


Based on the deterministic definition of shelf-life in Ballou et al. (1998), Heinrich et al. (2007b) proposed a probabilistic metric assuming the shelf-life T is exponentially distributed. This distribution is memoryless, meaning that the probability that the data become outdated in the next period of time is independent of their current age. Thus, the probability that at time t the data are still valid (i.e., the shelf-life T is greater than t) is given by:

$${Q}_{Time}\left(t\right)=P\left(T\ge t\right)=1-F\left(t\right)={e}^{-\lambda \bullet t}$$

where F is the cumulative distribution function of T and the parameter \(\lambda\) indicates the rate of decline of the data per unit of time. A value of \(\lambda =0.2\) indicates that the data become outdated at an average rate of 0.2 (i.e., roughly 20%) per unit of time; equivalently, the mean shelf-life is \(1/\lambda =5\) time units.

The density function corresponding to this cumulative distribution is:

$$f\left(t\right)= \left\{\begin{array}{cc}\lambda \bullet {e}^{-\lambda \bullet t}& \text{if } t\ge 0\\ 0& \text{else}\end{array}\right.$$

Hence the probability of the data being outdated at time t (i.e., the shelf-life is less than t) is

$$P\left(T\le t\right)= F\left(t\right)= {\int }_{0}^{t}f\left(\tau \right)d\tau = 1-{e}^{-\lambda \bullet t}$$
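The exponential model can be checked numerically with a small Python sketch. The function names and the decline rate of 0.2 per unit of time are illustrative assumptions; only the exponential form itself comes from the text.

```python
import math


def prob_still_valid(age: float, decline_rate: float) -> float:
    """P(shelf-life >= age) for an exponentially distributed shelf-life."""
    if age < 0 or decline_rate <= 0:
        raise ValueError("age must be non-negative and decline_rate positive")
    return math.exp(-decline_rate * age)


def prob_outdated(age: float, decline_rate: float) -> float:
    """P(shelf-life <= age), the complement of prob_still_valid."""
    return 1.0 - prob_still_valid(age, decline_rate)


rate = 0.2  # assumed decline rate per unit of time
for age in (0, 1, 5, 10):
    print(age, round(prob_still_valid(age, rate), 4), round(prob_outdated(age, rate), 4))

# Memorylessness: given validity up to age a, the chance of surviving a further
# b time units equals the unconditional chance of surviving b time units.
a, b = 3.0, 2.0
conditional = prob_still_valid(a + b, rate) / prob_still_valid(a, rate)
unconditional = prob_still_valid(b, rate)
```

The last two quantities agree up to floating-point rounding, which is the memoryless property stated above.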

Heinrich and Klier (2009) and Heinrich and Klier (2015) extend this metric to the case where additional data become available. The metric is then defined as the probability that the shelf-life is greater than the current time, conditional on the additional data.


The same authors also proposed a probabilistic definition of consistency where this quality is defined with respect to an uncertain rule Heinrich, Klier, et al., (2018). The definition of consistency is the degree to which data is free of internal contradictions with regard to a rule. This rule can be certain, e.g., “true by definition”, or uncertain. In the former case, consistency is defined by a binary outcome (consistent or not). In the latter case, it is defined in probabilistic terms as the probability of consistency. Consider DB a database containing a set of n records T = {t1, t2, …, tn} and a set of attributes a = {a1, a2, …, an}.

For uncertain rules, the measurement consistency is defined in probabilistic terms as the probability p that the uncertain rule R is fulfilled. Since there are only two possibilities (either the rule is fulfilled with a certain probability or it is violated with the complementary probability), the consistency of a measurement can be modeled as a Bernoulli trial Be(p) with parameter p. This distribution has an expected value p and a variance p(1-p), i.e., a standard deviation \(\sqrt{p(1-p)}\).

When the rule R is applied to all measurements tj in the database DB of n measurements, the database consistency can be quantified by the sum of the consistencies of the individual measurements. Being the sum of n independent Bernoulli distributed random variables with the same parameter p, the consistency of the database follows a binomial distribution B(n, p) with the expected value np and the variance np(1-p), i.e., a standard deviation \(\sqrt{np(1-p)}\).

Heinrich, Klier, et al., (2018) were interested in extreme values, i.e., outcomes that are equal to or more extreme than the observed number of successes v, whose probability is P(X(r) ≥ v). Since the observed value may also be extreme towards the lower boundary, the probability P(X(r) ≤ v) is considered as well. Both cases are captured by the two-sided p-value, which Heinrich, Klier, et al., (2018) therefore used to represent consistency.
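As an illustration only, the following Python sketch computes the binomial moments and a two-sided p-value for an observed number of rule fulfilments. It assumes the common "doubling" convention (twice the smaller tail probability, capped at 1); this is one standard way of defining a two-sided p-value and is not necessarily the exact expression used by Heinrich, Klier, et al., (2018). All function names and numbers are hypothetical.

```python
from math import comb, sqrt


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def binomial_moments(n: int, p: float) -> tuple[float, float]:
    """Expected value n*p and standard deviation sqrt(n*p*(1-p))."""
    return n * p, sqrt(n * p * (1.0 - p))


def two_sided_p_value(v: int, n: int, p: float) -> float:
    """Twice the smaller tail probability at the observed value v, capped at 1.

    Assumption: the doubling convention for two-sided p-values.
    """
    upper = sum(binomial_pmf(k, n, p) for k in range(v, n + 1))  # P(X >= v)
    lower = sum(binomial_pmf(k, n, p) for k in range(0, v + 1))  # P(X <= v)
    return min(1.0, 2.0 * min(lower, upper))


# Hypothetical case: 10 measurements, rule fulfilled with probability 0.5.
mean, std = binomial_moments(10, 0.5)
typical = two_sided_p_value(5, 10, 0.5)   # observed value at the mean
extreme = two_sided_p_value(10, 10, 0.5)  # all measurements fulfil the rule
```

An observation near the expected value yields a p-value close to 1, whereas an observation far in either tail yields a small p-value.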


5 Metrics for SHM Data Quality indicators

In this section, deterministic and probabilistic metrics are presented for each DQ indicator and sub-indicator in the SHM context.

General deterministic DQ metrics are proposed using two approaches, and then a global metric for the SHM DQ is assigned. This approach is simple and practical: it requires less time and computation and is therefore less costly, but it does not consider uncertainties. It is especially useful when limited knowledge is available on the parameters of the probability distribution functions, or when time, expertise, or money are not available for more elaborate probabilistic modelling.

Probabilistic DQ metrics are proposed as probability distribution functions. This method considers the uncertainties of the data. It is more time-consuming and more costly, as it requires a probabilistic modelling expert and an investment in advanced DQ models.

5.1 Deterministic Metrics for SHM Data Quality indicators

A general deterministic approach is used to suggest DQ metrics. It assigns metrics on scales ranging from 0 to 1, where 0 is the lowest quality and 1 is the highest quality for each DQ metric. The metrics are defined on discrete scales, based on the method in Färber et al. (2017), and on continuous scales, based on the method in Vetrò et al. (2016), as follows.

  • Discrete scale

    The metric has at least two discrete values, for example:

$$\text{metric}=\left\{\begin{array}{cc}1& \text{yes}\\ 0& \text{no}\end{array}\right.$$
$$\text{metric}=\left\{\begin{array}{cc}1& \text{yes}\\ 0.5& \text{partially}\\ 0& \text{no}\end{array}\right.$$

where “yes” means that the data have the investigated quality: for example, they are correct (for correctness), complete (for completeness), …; “no” means that they do not possess such qualities; and “partially” means that the data partially present the quality under consideration, with a certain rate (e.g., 0.5).

  • Continuous scale

    The metric is defined as the percentage of data that is accurate (or timely or complete, …). For example, if the data is in the form of a set of measurements (i.e., M1, M2, M3, M4), two of them are accurate, then 50% of the data is accurate.

5.1.1 The global metric for the SHM data quality

Once the DQ metric is obtained for each indicator, it is also useful to consider a global metric where each indicator is assigned some weights. The following method is suggested for computing the global metric.

Let DQu represent a single quality (i.e., indicator) and DQTotal the value of the global DQ metric.

The decision-maker can use the DQu or DQTotal value when interested in assessing a single quality (e.g., accuracy, completeness, etc.) or the global quality, respectively.

The DQTotal is determined using the following equation:

$${DQ}_{Total} = \sum\nolimits_{u}{\omega }_{u}\bullet {DQ}_{u}$$

where ωu are the normalized weights between 0 and 1, with \(\sum _{u}{\omega }_{u}=1\), computed as follows:

$${\omega }_{u} = \frac{{S}_{u}}{\sum _{u}{S}_{u}}$$

The decision-maker (expert) is required to assign scales, Su, for each indicator based on the importance assigned to that indicator for the decision-making case under consideration.
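As a hedged illustration of the weighting above, the following Python sketch normalizes expert scales Su into weights ωu and computes DQTotal. The indicator names and numerical values are hypothetical and serve only to show the arithmetic.

```python
def global_dq(metrics: dict[str, float], scales: dict[str, float]) -> float:
    """Weighted global metric: sum of w_u * DQ_u with w_u = S_u / sum(S_u)."""
    total_scale = sum(scales[name] for name in metrics)
    if total_scale <= 0:
        raise ValueError("the expert scales must sum to a positive value")
    weights = {name: scales[name] / total_scale for name in metrics}
    return sum(weights[name] * metrics[name] for name in metrics)


# Hypothetical indicator metrics DQ_u and expert scales S_u.
dq_u = {"accuracy": 1.0, "completeness": 0.5, "timeliness": 1.0}
s_u = {"accuracy": 2.0, "completeness": 1.0, "timeliness": 1.0}

print(global_dq(dq_u, s_u))  # 0.875
```

Because the weights are normalized inside the function, the expert can express the scales Su on any convenient range.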

Moreover, for the completeness indicator, the same formula can be tailored to several data variables with different weights:

$${C}_{Total} = \sum\nolimits_{x}{\omega }_{Cx}\bullet {C}_{x}$$

Where Cx is the completeness of the considered data/information variable of interest and ωCx is the relative weight assigned to it, based on the relative interest of the data to the decision maker.

5.1.2 The thresholds for the global metric of data quality

To characterize the global data quality metric, labels and thresholds are proposed. The labels range from excellent to very weak as the data quality decreases from 1 to 0, in five levels with different threshold ranges, as given in Table 3.

Table 3 Data quality labels and thresholds ranges for the global metric

5.2 Probabilistic Metrics for SHM Data Quality indicators

In this approach, probability distribution functions are used to define and assess probabilistic metrics of indicators that are considered to incorporate uncertainties.

The following indicators are assigned a probability density function for their probabilistic metrics: Accuracy, Precision, Consistency, Redundancy, Accessibility, Timeliness, Completeness, and Relevancy.

5.2.1 Probability density functions for the SHM data quality metrics

Metrics, especially in an uncertain context, depend on the type of data and related uncertainties. In the SHM context, the probability distribution function assigned for probabilistic metrics is highly dependent on the flow of the data.

As stated, there are two possible SHM data types related to the data flow. The flow may be scarce in the case of occasional SHM measurements, or abundant in the case of permanent SHM measurements. Thus, the data can be discrete (occasional measurements) or continuous (permanent measurements), and hence different probability distribution functions are suggested.

  a) Occasional measurements (i.e., discrete)

    When the measurement is occasional, the quality is assessed from time to time, whenever a measurement is recorded. It can thus be reflected by a series of discrete values of the random variable. If the realizations of the time-variant quantity occur at discrete times, the random quantity is called a random sequence. A sequence of Bernoulli trials thus leads to the binomial distribution (Faber 2012). In this case, the metric is represented by a binomial distribution: metric ~ B (n, p), with the number of successes n and the probability of success p.

  b) Permanent measurements (i.e., continuous)

    When the measurement is permanent or continuous, it may be reflected by a series of continuous values of the random variable. Here the realizations of the time-variant quantity occur continuously over time, and the random quantity is called a random process or stochastic process. Thus, the normal (Gaussian) process is used (Faber 2012). In this case, the metric is represented by a normal distribution: metric ~ N (µ, σ), with the mean µ and the standard deviation σ.
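The two distribution families can be evaluated with the Python standard library, as in the sketch below. All parameter values are hypothetical. In this sketch the first parameter of the binomial distribution is read as the number of Bernoulli trials, which is the usual textbook parameterization and is an assumption about how B (n, p) is to be interpreted here.

```python
import math
from statistics import NormalDist


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(K = k) for K ~ B(n, p), with n read as the number of trials."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


# Occasional monitoring: discrete metric, hypothetical n = 5 and p = 0.8.
n_trials, p_success = 5, 0.8
pmf = [binomial_pmf(k, n_trials, p_success) for k in range(n_trials + 1)]

# Permanent monitoring: continuous metric, hypothetical mu = 0.8, sigma = 0.1.
metric = NormalDist(mu=0.8, sigma=0.1)
prob_06_to_10 = metric.cdf(1.0) - metric.cdf(0.6)

print([round(v, 4) for v in pmf])
print(round(prob_06_to_10, 4))
```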

The probabilistic metrics suggested for SHM are presented in the following subsections.

5.2.2 Accuracy

The real world (R) and the measurements (M) are represented by random variables. Accuracy reflects how close the measurement M is to the real world R.

A feasible metric of the accuracy of the measurement can be defined as the difference between the two random variables as follows:

$$A=R-M$$

Assuming independent normal distributions R ~ N (µR, σR) and M ~ N (µM, σM), the accuracy A is also normally distributed, and its mean value and standard deviation are:

A ~ N (µA, σA) with µA = µR − µM and \({\sigma }_{A} = \sqrt{{\sigma }_{R}^{2}+{\sigma }_{M}^{2}}\)

As detailed above, if the data is recorded permanently, it is normally distributed N (µM, σM). And if it is recorded occasionally, then it follows a binomial distribution B (nM, pM).

Accuracy here refers to the accuracy of data and information. It is the distance between the measurement and the real world, and it is also the opposite of the error (detailed further in Sect. 6.3.1).

Moreover, when, for example, the “structure modifies its properties”, the accuracy is still the distance between the new measurement and the new real world.

Data analysis depends on the data and is specific to each case study. Thus, for future case studies that consider, for example, distinguishing the failure of a sensor from damage to the structure, the indicators and metrics will have to be adapted to the specificity of the application.

5.2.3 Precision

Precision is the degree to which the measured values are close to one another. Thus, the metric of precision is the standard deviation σd of the distribution of the measurement.

For permanent data, X is a continuous random variable with density f(x), and its expected value (the average) µ is defined as:

$$\mu ={\int }_{-\infty }^{+\infty }xf\left(x\right)dx$$

And the standard deviation σ of X is defined as:

$$\sigma =\sqrt{{\int }_{-\infty }^{+\infty }{\left(x-\mu \right)}^{2}f\left(x\right)dx}$$

For occasional data, the random variable X takes values in a finite data set x1, …, xN with equal probabilities, and µ is defined as:

$$\mu =\frac{1}{N}\sum\nolimits_{i=1}^{N}{x}_{i}$$

And the standard deviation σ of X is defined as:

$$\sigma =\sqrt{\frac{1}{N}\sum\nolimits_{i=1}^{N}{\left({x}_{i}-\mu \right)}^{2}}$$
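For occasional data, the two formulas above correspond to the sample mean and the population standard deviation (division by N). A minimal Python sketch follows; the frequency values are hypothetical.

```python
from statistics import fmean, pstdev

# Hypothetical fundamental frequencies (Hz) from occasional campaigns.
frequencies_hz = [2.98, 3.01, 3.00, 2.97, 3.04]

mu = fmean(frequencies_hz)             # (1/N) * sum(x_i)
sigma = pstdev(frequencies_hz, mu=mu)  # sqrt((1/N) * sum((x_i - mu)^2))

print(f"mean = {mu:.3f} Hz, precision (sigma) = {sigma:.4f} Hz")
```

`pstdev` is used rather than `stdev` because the formula in the text divides by N, not by N − 1.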

5.2.4 Consistency

The SHM consistency metric herein is inspired by Heinrich, Klier, et al. (2018), modified to serve the SHM context. Heinrich, Klier, et al. (2018) looked for consistency in rare extreme values, i.e., values that are equal to or more extreme than the observed value (v), with the successes represented by p(X(r) ≥ v).

In this article, the consistency metric aims to represent the number of successes for the selected rule, i.e., the largest number of consistent values for the chosen rule. Consistency is therefore defined as the probability p that the uncertain rule R is fulfilled. As explained (Sect. 4.2), for one record it is expressed as a Bernoulli trial Be(p) with parameter p. When the rule R is applied to all measurements tj in the database (of n measurements), the consistency of the database follows a binomial distribution B(n, p).

However, SHM metrics depend on the SHM data flow (Sect. 5.2.1). Therefore, for occasional measurements and a certain rule r, the consistency is given by:

$$Co \sim B\left({n}_{c},{p}_{c}\right)$$

Where nc is the number of successful measurements and pc is the probability of success.

For permanent SHM measurements, the binomial distribution tends to a normal distribution, and thus the consistency is given by:

$$Co \sim N\left({\mu }_{c},{\sigma }_{c}\right)$$

Where µc is the mean and σc is the standard deviation.

5.2.5 Timeliness

The SHM timeliness metric is represented by two very distinct cases (Sect. 5.2.1) that depend on the flow of the data, i.e., permanent or occasional measurements.

In the case of occasional monitoring, where the data is rare and represented by a discrete variable, the formula suggested by Heinrich et al. (2007b) can be used:

$${Q}_{Time}\left(T\right)={e}^{-\lambda \bullet T}$$

Where λ represents the number of times the measurement values become outdated on average over a period of time.

In the case of permanent data, QTime = 1. Here the shelf-life tends towards zero (T→0) because the data is continuously updated, and the decline rate tends to infinity (λ→∞). The data is therefore always up to date, and QTime = 1.

5.2.6 Redundancy, accessibility, relevancy, and completeness

The metrics redundancy (R), accessibility (Ac), relevancy (Re), and completeness (C) are assessed simply according to the number of successes of that DQ each time it is evaluated by the expert. The expert, once the data is available, can assess whether they possess the required quality (i.e., complete, relevant, redundant, and accessible).

There are only two possibilities (either the quality is fulfilled with a certain probability, or it is violated with the complementary probability). Thus, the quality can be modelled as a Bernoulli trial Be(p) with the parameter p representing the probability of success of this quality.

For all measurements, the quality of the database can be quantified by the sum of the qualities of the individual measurements. Being the sum of n independent Bernoulli distributed random variables, the quality of the database follows a binomial distribution B(n, p) with the expected value np and the standard deviation \(\sqrt{np(1-p)}\).

Moreover, two cases are also available for these metrics based on the data flow (i.e., occasional, or permanent).

In the case of occasional measurements, the data are discrete, and thus the binomial distribution is appropriate. In the case of permanent measurements, the data are continuous, and the binomial distribution tends to the normal distribution. Therefore, the quality of completeness, relevancy, redundancy, and accessibility is represented by a normal distribution.

Table 4 summarizes the suggested SHM probabilistic metrics.

Table 4 The probabilistic DQ metrics for permanent and occasional SHM

6 Generic example for the SHM data quality metrics

A five-span concrete bridge with permanent acceleration measurements and occasional ambient noise vibration measurements is considered. It is equipped with accelerometers that continuously record and transmit data wirelessly to the processing site. The accelerations are processed to compute the fundamental frequency, among other dynamic characteristics (mode shapes, etc.). Moreover, every five years an ambient vibration measurement campaign is carried out using seismometers to recheck the modal frequency extracted from the accelerations. The equipment and experiments are given in Table 5.

The reasons behind selecting these SHM experiments are: (1) the availability of redundant measurements to recheck the frequency calculations; and (2) the availability of different types of data (i.e., continuous for permanent and discrete for occasional ones).

Table 5 The SHM equipment and experiments

6.1 Decision Problem

To stress the importance of SHM DQ for the decision-making process, a decision problem is proposed. Suppose the bridge owner wants to decide whether to strengthen the bridge against a seismic threat. The decision depends on the data collected on the bridge and on their quality, so the DQ must be included in the assessment.

For example, in the case of the permanent measurement (Fig. 1), the frequency data should vary around the initial fundamental frequency of 3 Hz, according to the finite element model (FEM) and the initial frequency measurement. However, it was found to vary around 2 Hz.

Similarly, for the occasional ambient vibration test, the fundamental frequency is measured and computed at each campaign. In this case, the frequency should drop by at most about 5% (or 0.15 Hz) every 5 years relative to the initial frequency of the structure, according to the FEM and the initial frequency measurement (i.e., FEM deterioration modelling and experience). However, it was observed to have dropped to 2 Hz (Fig. 2, where fpi are the predicted frequencies and fmi are the measured frequencies).
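The comparison between predicted and measured frequencies for the occasional campaigns can be sketched in Python as follows. The sketch assumes that the expected drop accumulates linearly (5% of the initial 3 Hz per five-year campaign), which is one possible reading of the description above. The measured values are hypothetical except for the final 2 Hz.

```python
F0_HZ = 3.0                   # initial fundamental frequency (FEM and first measurement)
MAX_DROP_PER_CAMPAIGN = 0.05  # assumed fraction of F0_HZ lost per five-year campaign


def predicted_frequency(campaign_index: int) -> float:
    """Lowest expected frequency fp_i after a given number of campaigns."""
    return F0_HZ * (1.0 - MAX_DROP_PER_CAMPAIGN * campaign_index)


def exceeds_prediction(measured_hz: float, campaign_index: int) -> bool:
    """True when the measured frequency fm_i falls below the predicted bound."""
    return measured_hz < predicted_frequency(campaign_index)


# Hypothetical measured frequencies fm_1..fm_4 over a 20-year period.
measured = [2.86, 2.71, 2.58, 2.00]
flags = [exceeds_prediction(fm, i) for i, fm in enumerate(measured, start=1)]
print(flags)  # [False, False, False, True]
```

A flagged campaign does not by itself indicate damage. It marks the point at which the DQ of the measurement has to be assessed before any action is decided.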

Fig. 1 Frequency continuously computed from continuously recorded accelerations

Fig. 2 Frequencies computed from occasionally recorded velocities

In both cases, the bridge owner needs to know whether the frequency drop is due to damage to the structure or to the quality of the measurement. If a decision is taken without verifying the DQ, two scenarios are possible. In Scenario-1 (high cost, minimal risk), the owner intervenes and strengthens the structure while it is in a good state, and thus loses a large amount of money. In Scenario-2 (minimal immediate cost, high risk), the owner does not intervene while the structure is damaged. If the bridge later collapses, the owner will face a tremendous cost related to casualties, downtime, and replacement of the bridge.

6.2 Deterministic metrics calculations

For the bridge in question, the DQ metrics were assessed by an expert for both the permanent and occasional monitoring systems using the deterministic method proposed in Sect. 5.1. Both the discrete method (i.e., 1 is assigned when “yes”, the data have the quality, and 0 when “no”, they do not) and the continuous method (i.e., the percentage of data having the quality) were used.

With the continuous scale, a slight difference is noted between occasional and permanent monitoring. With the discrete scale, no difference is noted, as the discrete scale rates the metrics simply by 0 or 1. The continuous scale yields more refined values because a percentage provides a more precise value of the DQ metric.

Table 6 shows the values for the 4 cases (discrete and continuous scales and occasional and permanent monitoring).

The table suggested in Sect. 5.1.2 for the labels and thresholds of the global data quality metric is used. Since all the assessed DQTotal metrics are greater than 0.8, the data is considered to be of excellent quality.

Table 6 Discrete and continuous scale DQ metric values for the occasional and permanent monitoring

Thus, in both cases, i.e., when the frequency drops by 5% (no damage) and by 30% (severe damage), the aim is to assess the data quality before deciding whether to inspect the bridge on site, because funds for a site visit may be limited or the expert may not be able to access the bridge easily. The first step is therefore to assess the data and information more thoroughly and then act accordingly. Moreover, if several sensors exist on the bridge, one can first check whether one of the sensors is faulty. Finally, since the data are assessed to be of good quality, in the case of a 5% drop the expert does not need to go to the field, as there is likely no damage to the structure, whereas in the case of a 30% drop the expert needs to go to the field for further inspections, as damage to the structure is likely.

In some cases, experts conduct an extensive measurement campaign as a result of a data quality assessment. However, this is not the aim here: we do not aim to check the data quality by performing tests other than the scheduled ones. Such tests can be valuable but are not always feasible because of limited funds. We aim to assess the data quality of the existing monitoring systems and the already scheduled monitoring campaigns, using indicators and metrics.

6.3 Probabilistic metrics assigned probability distribution function

To apply the metrics, it is crucial to assign parameters for each of the suggested probability distribution functions and in both cases of permanent and occasional monitoring.

Three possibilities are available to determine the parameters for the probability distribution functions.

  1) Analysing a reference dataset. This is a promising option when a reference dataset of good quality is available for the current study, for example from a bridge of the same typology with a long history of monitoring inspections and DQ assessment, i.e., a bridge that has been inspected over an extended duration and for which the DQ of the SHM measurements has been assessed. The reference could also be the bridge itself, if it has a long history of monitoring and DQ assessment.

  2) Conducting a study. A series of on-site campaigns may take place on one or several bridges of a similar typology. The study can also be carried out on the bridge itself for some time, and the parameters are then deduced from it.

  3) Surveying experts (i.e., surveying qualified individuals).

The SHM DQ has not yet been assessed; thus, no reference dataset is yet available to the authors of this article. Moreover, a study cannot be conducted rapidly because it must be spread over a sufficient duration of several years. Therefore, the third, expert-based approach is used in this article, until future SHM DQ assessments provide robust experimental parameters for the probability distribution functions.

6.3.1 Accuracy

The metric for the measurement accuracy is given by A = R − M (Eq. 11), where R and M are the random variables of the real-world and measured fundamental frequency, respectively.

However, it is difficult to obtain the exact value of the real-world parameter unless, for example, a very high-quality digital twin model is available. Therefore, an additional simplification is introduced. Instead of assigning probability distributions to the real-world and measurement parameters, a distribution is assigned directly to the accuracy, as follows.

The measurement is expressed as the real-world parameter value plus the measurement error (M = R + ε). Thus, the accuracy is the opposite of the error (ε) and is expressed by:

$$A=R-M=-\varepsilon$$

The inspection error is assigned a normal distribution N (µε, σε) with mean µε and standard deviation σε. Finally, the accuracy is A ~ N (-µε, σε).

Based on Brincker and Ventura (2015), Zhang et al. (2022), Ali et al. (2019), and Peng et al. (2021), the expert suggested a value of 0 for the mean of the error and 0.5 for the standard deviation. Moreover, for simplicity, the normal distribution is attributed to the error regardless of the data flow. Thus, the accuracy can be considered the same for permanent and occasional monitoring.
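With the expert values above (µε = 0 and σε = 0.5), the accuracy distribution can be queried directly. The Python sketch below is a minimal illustration. The tolerance of 0.5 is a hypothetical value, expressed in the same units as the error.

```python
from statistics import NormalDist

MU_EPS = 0.0     # mean of the error suggested by the expert
SIGMA_EPS = 0.5  # standard deviation of the error suggested by the expert

# A = -eps, hence A ~ N(-mu_eps, sigma_eps).
accuracy = NormalDist(mu=-MU_EPS, sigma=SIGMA_EPS)


def prob_within(tolerance: float) -> float:
    """Probability that the accuracy A lies within +/- tolerance."""
    return accuracy.cdf(tolerance) - accuracy.cdf(-tolerance)


print(round(prob_within(0.5), 3))  # one standard deviation, about 0.683
```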

6.3.2 Precision

The precision metric is the standard deviation of the distribution of the measurement. Therefore, based on Peng et al. (2021), the expert suggested a value of 0.03 for the standard deviation σd.

6.3.3 Consistency

In this example, DB is a database containing a set of n SHM measurement records T = {t1, t2, …, tn} of different lengths, all sampled at 100 Hz, and a = {a1, a2, …, an} is a set of attributes (i.e., duration of the record, number of points of the record, maximum amplitude of the record, signal-to-noise ratio (S/N), etc.). Consider the rule “records longer than 10 s have a maximum amplitude of 0.1 g”.

For the consistency metric, the expert estimated that the SHM DQ was assessed 5 times. The data were considered inconsistent only once, with one record having a maximum amplitude greater than 0.1 g. Thus, for the occasional measurement, n = 4 and p = 0.8, so Co ~ B (4, 0.8). For the permanent measurement, the assessment gave µc of 0.8 and σc of 0.39, so Co ~ N (0.8, 0.39).
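The rule check can be sketched in Python as follows. The `Record` class and the five records are hypothetical, chosen so that one record violates the rule, as in the expert's assessment. The rule is read here as "records longer than 10 s have a maximum amplitude of at most 0.1 g", which is an assumption about its intended meaning.

```python
from dataclasses import dataclass


@dataclass
class Record:
    duration_s: float       # length of the record in seconds
    max_amplitude_g: float  # maximum amplitude of the record in g


def satisfies_rule(rec: Record, min_duration_s: float = 10.0,
                   amplitude_limit_g: float = 0.1) -> bool:
    """The rule constrains long records only; shorter ones pass trivially."""
    if rec.duration_s <= min_duration_s:
        return True
    return rec.max_amplitude_g <= amplitude_limit_g


# Five hypothetical records; the third one violates the rule.
records = [Record(12.0, 0.05), Record(30.0, 0.08), Record(15.0, 0.12),
           Record(20.0, 0.03), Record(11.0, 0.09)]

successes = sum(satisfies_rule(r) for r in records)
p_hat = successes / len(records)
print(successes, p_hat)  # 4 0.8
```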

6.3.4 Timeliness

For occasional monitoring over the 20-year monitored period, the expert considered a decline rate of λ = 0.2, i.e., on average 20% (1/5) of the SHM data loses its validity over a given period of time.

$${Q}_{Time}\left(T\right)={e}^{-0.2\bullet T}$$

For permanent monitoring, QTime = 1.
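Both timeliness cases can be expressed in a few lines of Python, as sketched below. The function name and the `permanent` flag are hypothetical. The age T is assumed to be expressed in the same time unit as the decline rate λ; years are assumed in the example.

```python
import math


def timeliness(age: float, decline_rate: float = 0.2,
               permanent: bool = False) -> float:
    """Q_Time(T) = exp(-lambda * T) for occasional data, 1 for permanent data."""
    if permanent:
        return 1.0
    return math.exp(-decline_rate * age)


for age_years in (0.0, 1.0, 5.0):
    print(age_years, round(timeliness(age_years), 3))

print(timeliness(5.0, permanent=True))  # 1.0
```

With λ = 0.2, a measurement that is five time units old has a timeliness of e⁻¹, i.e., roughly 0.37.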

6.3.5 Redundancy, accessibility, relevancy, and completeness

For the metrics redundancy, accessibility, completeness, and relevancy the expert estimated that the SHM DQ was assessed 5 times.

Only once was the data found to be non-redundant because of the lack of occasional measurements. Thus, for the occasional measurement, it has n = 4 and p = 0.8, and thus R ~ B (4, 0.8). Whereas for the permanent measurement, the assessment offered µr of 0.8 and σr of 0.39, thus R ~ N (0.8, 0.39).

For accessibility, the expert found the data to be inaccessible twice due to impediments to reaching the site. Thus, for the occasional measurement, n = 3 and p = 0.6, so Ac ~ B (3, 0.6). For the permanent measurement, the assessment gave a mean of 0.6 and a standard deviation of 0.46, so Ac ~ N (0.6, 0.46).

For completeness, the expert found the data to be incomplete only once due to the absence of some measurements. Thus, for the occasional measurement, it has n = 4 and p = 0.8, and thus C ~ B (4, 0.8). Whereas for the permanent measurement, the assessment offered µc of 0.8 and σc of 0.39, thus C ~ N (0.8, 0.39).

For relevancy, the expert found the data to be irrelevant to the decision problem at hand three times. Thus, for the occasional measurement, n = 2 and p = 0.4, so Re ~ B (2, 0.4). For the permanent measurement, the assessment gave a mean of 0.4 and a standard deviation of 0.41, so Re ~ N (0.4, 0.41). Table 7 summarizes the SHM probability distributions assigned to the metrics with their parameters.

Table 7 Probabilistic DQ metrics for permanent and occasional SHM

7 Real case example of the Z24-bridge

The bridge, located in Switzerland, was equipped with an environmental monitoring system and accelerometers, and measurements were recorded for nearly one year, including the last month, when the bridge was intentionally damaged. The experiments are detailed extensively in Peeters and De Roeck (2001), Reynders and Roeck (2008), Reynders et al. (2012), Langone et al. (2017), and Maeck et al. (2001), and the bridge is presented in Fig. 3. Additional details of the KU Leuven Z24 Project are found at (Leuven, n.d.).

Fig. 4 shows the first three extracted frequencies, distinguishing the undamaged data, the undamaged data recorded at low temperature (and thus not very accurate), and the damaged data (i.e., the data recorded when the bridge had deteriorated). Further details are offered in Omori Yano et al. (2022) and Santos et al. (2017).

Fig. 3 The longitudinal section of the Z24-bridge and its top view (Peeters and De Roeck 2001)

Fig. 4 The frequencies for modes 1, 2, and 3 for the one-year observations

For this experiment, the data is considered fully accessible, interoperable, traceable, and relevant. As the experiment was recorded permanently, the data is also timely.

For the accelerometer recordings, the data is considered redundant, as there are many sensors for the single experimental set-up, and the security is rated lowest since the data was not secured.

The cold period caused an operational variation of the frequencies, which affected 909 of the 5012 observations. However, the accuracy and precision were preserved, as the system was able to detect it.

For consistency, the rule “for T > 0 the standard deviation of the frequency modes is < 0.15” was verified based on the results of Peeters and De Roeck (2001). All the data satisfied this rule; thus, the consistency with respect to this rule is taken as 1.

Finally, the recordings were interrupted from day 166 to day 200 out of the total 304 days of recordings. For the total duration of 304 days, the completeness is 88% (in this case the total data quality is 0.967). However, if one is considering only the 270 days of the tests, the completeness is 100% (in this case the total data quality is 0.982).
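The completeness figure for the full period follows from simple arithmetic, sketched below in Python. The function name is hypothetical, and the interruption is counted as 200 − 166 = 34 days, which is consistent with the 270 recorded days stated above.

```python
def completeness(total_days: int, gap_start_day: int, gap_end_day: int) -> float:
    """Fraction of the monitoring period that was actually recorded."""
    missing_days = gap_end_day - gap_start_day
    return (total_days - missing_days) / total_days


c_full_period = completeness(total_days=304, gap_start_day=166, gap_end_day=200)
print(f"{c_full_period:.3f}")  # 0.888, reported as 88% in the text
```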

Those values are reported in the DQu column of Table 8. The indicator scales from 0 to 1 were again assigned for this specific test and are reported in the Su column of Table 8.

Finally, with computations similar to those of the previous example, the total data quality is 0.982. Based on Table 3 (Sect. 5.1.2), since the assessed DQTotal metric is greater than 0.8, the data is considered to be of excellent quality.

The assessed values also helped to provide the parameter values of the distributions for the probabilistic method, as shown in Table 9. The error was considered to have a mean of zero and a standard deviation of 1, and the precision for the computed first-mode frequency was taken as 0.1.

Table 8 Continuous scale DQ metric values for the permanent monitoring of the Z24 bridge
Table 9 Probabilistic DQ metrics for permanent SHM of the Z24-bridge

8 Results and conclusions

This article investigates data quality and addresses the gap in indicators and metrics in the SHM field.

To this end, the article extensively reviews DQ indicators and their metrics. Six SHM DQ indicators are selected for the data management phases and are assigned sub-indicators as needed to capture different aspects. Appropriate definitions are then suggested for them.

The article proposes deterministic metrics in the form of discrete and continuous scales. It then offers probabilistic metrics to account for uncertainties. Additionally, different probability distribution functions for permanent and occasional SHM are offered. This provides a better understanding of the data stream influence on the selected probability distribution functions.

A generic example of SHM DQ metrics is provided, and the results are presented for both deterministic and probabilistic metrics. For the deterministic metrics, the continuous scale was found to offer more refined results than the discrete scale. For the probabilistic metrics, the parameters of the probability distribution functions were assigned. Based on the DQ assessment, the data is found to be of good quality, and therefore the bridge owner can make a decision without further investigations. Furthermore, a real case study of the assessment of the Z24 bridge was presented.

Finally, this article is the first step in a series of studies aimed at improving decision-making for structural integrity management by incorporating SHM DQ. A variety of case studies and a large amount of data will need to be considered in the future to tackle all the different aspects of the indicators and metrics.

In future studies, indicators and metrics might be offered to account for the reliability of the monitoring system and SHM system (Etebu and Shafiee 2018; Shamstabar et al. 2021). Similarly, indicators and metrics might be suggested for the environmental effects on the dynamic characteristics (Worden and Cross 2018; Cross et al. 2011; Brownjohn et al. 2009). Moreover, indicators and metrics can be assessed not only for the data and information variables suggested here but also for other variables such as damping, the date of the measurement, etc., regardless of their utility for decision-making. The choice will be made for each case study and application based on the decision-making situation and context.

Availability of data and materials

The data will not be shared as it was instructed for me to work on it for this article.




I thank Politecnico di Milano for funding this research through the Seal of Excellence project and research program CYBERES. I thank the KU Leuven researchers, specifically Prof. Guido De Roeck and his research team for their excellent work and data provided for helping the scientific community. I thank Prof. Eloi Figueiredo and Marcus Omori Yano for sharing the data on the Z24 bridge.


This work was funded by Politecnico di Milano through the Seal of Excellence project.

Author information

Authors and Affiliations



The author read and approved the final manuscript.

Corresponding author

Correspondence to Nisrine Makhoul.

Ethics declarations

Competing interests

I declare that there is no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit


About this article


Cite this article

Makhoul, N. Review of data quality indicators and metrics, and suggestions for indicators and metrics for structural health monitoring. ABEN 3, 17 (2022).
