Structural Health Monitoring


Structural Health Monitoring http://shm.sagepub.com/

Robust dimensionality reduction and damage detection approaches in structural health monitoring Nguyen LD Khoa, Bang Zhang, Yang Wang, Fang Chen and Samir Mustapha Structural Health Monitoring 2014 13: 406 originally published online 8 May 2014 DOI: 10.1177/1475921714532989 The online version of this article can be found at: http://shm.sagepub.com/content/13/4/406

Published by: http://www.sagepublications.com


Version of Record - Jul 7, 2014; OnlineFirst Version of Record - May 8, 2014

Downloaded from shm.sagepub.com by guest on July 7, 2014

Special Issue Article

Robust dimensionality reduction and damage detection approaches in structural health monitoring

Structural Health Monitoring 2014, Vol. 13(4) 406–417 © The Author(s) 2014 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav DOI: 10.1177/1475921714532989 shm.sagepub.com

Nguyen LD Khoa1, Bang Zhang1, Yang Wang1, Fang Chen1 and Samir Mustapha2

Abstract
Structural health monitoring has been increasingly used due to advances in sensing technology and data analysis, facilitating the shift from time-based to condition-based maintenance. This work is part of an effort to apply structural health monitoring to the Sydney Harbour Bridge, one of Australia’s iconic structures. It combines dimensionality reduction and pattern recognition techniques to accurately and efficiently distinguish faulty components from well-functioning ones. Specifically, random projection is used for dimensionality reduction on the vibration feature data. Then, healthy and damaged patterns of bridge components are learned in the lower dimensional projected space using supervised and unsupervised machine learning methods, namely, support vector machine and one-class support vector machine. Experimental results using data from a laboratory-based building structure and the Sydney Harbour Bridge showed high feasibility of applying machine learning techniques to dimensionality reduction and damage detection in structural health monitoring. Random projection combined with support vector machine significantly reduces the computational time while maintaining the detection accuracy. The proposed method also outperformed popular dimensionality reduction techniques. With random projection, the method can run more than 200 times faster than without dimensionality reduction while still achieving similar detection accuracy.
Keywords
Structural health monitoring, dimensionality reduction, random projection, damage detection, support vector machine

Introduction
Ageing or damage caused by operation and environment is an inevitable process for any civil structure such as bridges and buildings. Early identification of damage in a structure is important to avoid further risks, both to life safety and in economic losses. Structural health monitoring (SHM) has been increasingly used due to advances in sensing technology and data analysis, facilitating the shift from time-based to condition-based maintenance. Many articles have been published on this subject recently, following either model-driven or data-driven approaches.1–4 A typical model-driven approach in SHM adopts a numerical model of the structure, usually based on finite element analysis, and relates differences between measured data and the data generated by the model to the damage identification.5 However, a numerical model may not always be available in practice and does not always correctly capture the exact

behaviour of the real structure. On the other hand, a data-driven approach establishes a model by learning from measured data and then makes a comparison between the model and measured responses to detect damage. This approach uses techniques in pattern recognition, or more broadly, in machine learning.4 Support vector machine (SVM) is a supervised learning technique with strong theoretical foundations based on the Vapnik–Chervonenkis theory.6 It has a strong regularization property which is the ability to generalize the model to new data. These characteristics help it overcome overfitting, which is a common issue for

1 Machine Learning Research Group, National ICT Australia
2 Networks Research Group, National ICT Australia

Corresponding author: Nguyen LD Khoa, National ICT Australia, Level 5, 13 Garden Street, Eveleigh, NSW 2015, Australia. Email: [email protected]


neural networks. Furthermore, SVM can unify different types of discriminant functions such as linear, non-linear and radial basis functions in the same framework. In practice, it is common that no damage data are available for supervised learning. Unsupervised methods train the model without using class information, and the classification problem becomes an anomaly detection problem. Data objects which deviate significantly from the normal behaviour of the trained model are considered anomalies or damage. One-class SVM, proposed by Schölkopf et al.,7 is a robust technique for this purpose. The authors extended the idea of SVM to handle training data without class information. Moreover, the data in SHM often have high dimension with high redundancy and correlation. This can slow the detection process and make the underlying patterns more complicated to determine. Also, computation on high-dimensional data can be costly due to more data transmission and higher computing requirements of the sensor network. Principal component analysis (PCA) is a popular technique for dimensionality reduction in SHM.8–10 However, PCA involves the eigendecomposition of the data covariance matrix with $O(d^3)$ complexity, where $d$ is the data dimension. This computation is expensive when $d$ is extremely large, a common issue in SHM sensing data. Although the computation can be reduced by computing only the first few eigenvectors, it is still not practical for extremely high dimensional data. Random projection (RP) is an alternative and less expensive method to reduce the dimensionality of extremely high dimensional data.11 Using RP, the dimension of the projected space only depends on the number of data points, no matter how high the original dimension of the data is.
It is an effective and efficient dimensionality reduction method for high-dimensional data.12 This work is part of an effort to apply SHM to the Sydney Harbour Bridge and will complement the current biennial routine bridge inspection. A machine learning approach is used to detect damage in structures. It combines dimensionality reduction and pattern recognition techniques to accurately and efficiently distinguish faulty components from well-functioning ones. Specifically, RP is utilized for dimensionality reduction on the vibration feature data. Then, anomalies and failure patterns of bridge components are learned in the lower dimensional projected space using SVM and one-class SVM. The learned models will be used to generate real-time health scores for bridge components. The approach avoids the time and cost of creating a numerical model, avoids expensive computation on high-dimensional sensing data and provides the flexibility of model updating.

Related work
Worden et al.13 used the Mahalanobis distance to find anomalies in data, which are likely to result from damaged events. Data from damaged events are those whose Mahalanobis distances to the training events are greater than a particular threshold. The data are assumed to follow a normal distribution, which does not always hold in practice. Neural networks14 are popular techniques for damage detection in SHM. Zang and Imregun,15 Lee et al.16,17 and Zang et al.18 used neural networks to detect structural damage in a supervised setting, where data from both healthy and damaged states were required. Chan et al.19 used auto-associative neural networks, which were proposed by Kramer,20 for damage detection in several cable-supported bridges in Hong Kong, including the Tsing Ma Bridge, Kap Shui Mun Bridge and Ting Kau Bridge. The networks were used as a novelty filter to detect the occurrence of damage in an unsupervised manner. An auto-associative network is a multilayer feedforward neural network with a bottleneck layer in the middle. Its key feature is that the patterns at the input layer are reproduced at the output layer. The bottleneck layer filters out redundant information and noise while retaining the essential information for the output layer. A novelty index is then defined as the difference between the target output and the output estimated by the network. If the bridge experiences abnormal structural conditions, the novelty index is expected to increase significantly. However, overfitting is a major limitation of neural networks. Many studies have used PCA for dimensionality reduction in SHM. Sohn et al.8 used PCA with statistical process control for damage detection. Also, Worden and Manson9 applied linear and non-linear PCA to reduce the data dimension. Then, the Mahalanobis distance was used to find anomalies in the reduced dimensional data.
In addition, Zang and Imregun15 presented PCA to compress frequency response function (FRF) data for damage detection. However, the computation of PCA is expensive when the data dimension is extremely high and its performance is sensitive to the number of selected components. Zang et al.18 studied independent component analysis (ICA) to reduce the dimension in time domain signals, followed by neural networks for damage detection. ICA is a technique to decompose signals into uncorrelated independent components. Like PCA, the performance of this method is highly dependent on the number of selected components.


Dimensionality reduction
In this section, three methods for dimensionality reduction are presented. The data matrix is denoted as $X \in \mathbb{R}^{n \times d}$, where $n$ is the number of data instances and $d$ is the data dimension. Since $d$ is large, we aim to reduce the data to $Y \in \mathbb{R}^{n \times k}$ where $k \ll d$.

PCA
PCA is a vector space transformation from a higher dimensional space to a lower dimensional space.21 PCA involves a calculation of the eigenvectors of the data covariance matrix, after removing the mean of the data for each feature. It is a simple and widely used method for dimensionality reduction. PCA maps the original features of the data onto a new set of axes, which are called the principal components of the transformation. Each principal component is an eigenvector of the covariance matrix of the original dataset. Typically, the first few principal components account for most of the variance in the data, so that the remaining principal components can be skipped with a minimum loss of information. That is, the purpose of PCA here is to reduce the data dimension. The transformation works as follows. Suppose that each column of $X$ has zero mean; the covariance matrix of $X$ is then $C = \frac{1}{n-1} X^T X$. A matrix $V$ whose columns are the $d$ eigenvectors of the covariance matrix $C$ forms a set of $d$ principal components. It is computed from the decomposition $C = V D V^T$, where $D$ is the diagonal matrix whose $i$th diagonal element is the eigenvalue $\lambda_i$. By convention, the eigenvectors in $V$ have unit length and are sorted by their eigenvalues, from large to small. Suppose that $k$ is the number of leading eigenvectors accounting for most of the variance of the data ($k < d$); then $P$ ($d \times k$) is the matrix formed by the first $k$ eigenvectors as its columns. The projection of the data matrix $X$ onto the subspace spanned by the eigenvectors in $P$,

$$Y = XP \tag{1}$$

represents the dimensionality reduction of $X$.

Piecewise aggregate approximation
Keogh et al.22 proposed a technique called piecewise aggregate approximation (PAA) for dimensionality reduction of large time series data. This method is fast, simple and allows flexible distance measures for fast similarity search in large time series. The original signal (each row $x_i$ of $X$) is divided into $k$ equal-sized segments. The value of each element in the reduced data is the mean value of the corresponding segment in the original data. Specifically, each element $j$ of each row $y_i$ of $Y$ is computed as

$$y_{ij} = \frac{k}{d} \sum_{l = \frac{d}{k}(j-1)+1}^{\frac{d}{k} j} x_{il} \tag{2}$$

RP
PCA has a complexity of $O(d^3)$ due to the eigendecomposition of the data covariance matrix. Even if the computation is reduced by computing only the first few eigenvectors, it is still not practical for extremely high dimensional data. RP is an alternative and less expensive way to reduce the dimensionality of very high dimensional data. According to the Johnson–Lindenstrauss lemma, the pairwise Euclidean distances between data points are approximately preserved if we randomly project the data onto a subspace spanned by $O(\log n)$ columns.23 Therefore, the dimension of the projected space only depends on the number of data points, no matter how high the original dimension of the data is. Achlioptas11 later showed that RP can be applied using the following lemma.

Lemma 1. Given $X \in \mathbb{R}^{n \times d}$, $\varepsilon > 0$ and $k = O(\log n / \varepsilon^2)$. Let $R \in \mathbb{R}^{d \times k}$ be a random matrix where each entry $r_{ij}$ can be drawn from the following probability distribution

$$r_{ij} = \begin{cases} +1 & \text{with probability } \frac{1}{2s} \\ 0 & \text{with probability } 1 - \frac{1}{s} \\ -1 & \text{with probability } \frac{1}{2s} \end{cases} \tag{3}$$

where $s$ represents the projection sparsity. With probability at least $1 - 1/n$, the projection $Y = XR$ approximately preserves the pairwise Euclidean distances for all data points in $X$. More precisely

$$(1 - \varepsilon) \|u - v\|^2 \le \|u' - v'\|^2 \le (1 + \varepsilon) \|u - v\|^2 \tag{4}$$

for all data points $u, v \in X$ and their projections $u', v' \in Y$. RP reduces the data dimension from $d$ to $k = O(\log n)$. In practice, $k$ can be small and still preserve the pairwise distances in the projected space.
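The two cheaper reductions can be sketched in a few lines of standard-library Python. This is only an illustrative sketch, not the authors' implementation: the function names are hypothetical, the PAA helper assumes that $k$ divides $d$ exactly, and the projection rescales by $\sqrt{s/k}$, a normalisation that is not stated in the text above but is the usual way to keep expected squared lengths unchanged.

```python
import math
import random


def paa(row, k):
    """Piecewise aggregate approximation of one signal, as in equation (2)."""
    d = len(row)
    if d % k != 0:
        raise ValueError("this sketch assumes that k divides d")
    seg = d // k  # number of samples per segment
    return [sum(row[j * seg:(j + 1) * seg]) / seg for j in range(k)]


def sparse_random_matrix(d, k, s=1.0, rng=None):
    """Draw a d x k matrix with entries +1, 0, -1 following equation (3)."""
    rng = rng or random.Random(0)
    p = 1.0 / (2.0 * s)  # probability of +1, and also of -1
    matrix = []
    for _ in range(d):
        row = []
        for _ in range(k):
            u = rng.random()
            if u < p:
                row.append(1.0)
            elif u < 2.0 * p:
                row.append(-1.0)
            else:
                row.append(0.0)  # happens with probability 1 - 1/s
        matrix.append(row)
    return matrix


def project(X, R, s=1.0):
    """Compute Y = X R, rescaled by sqrt(s / k) (an assumed normalisation)."""
    k = len(R[0])
    scale = math.sqrt(s / k)
    return [
        [scale * sum(x_l * r_l[j] for x_l, r_l in zip(x, R)) for j in range(k)]
        for x in X
    ]
```

With `s = 1` every entry is +1 or -1, which is the setting reported for the experiments later in the article; larger `s` makes the matrix sparser and the product cheaper.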

Damage detection
SVM
SVM is a robust supervised learning technique with a strong regularization property.6 A feature vector in the reduced dimensional space obtained by dimensionality reduction is denoted as $x$, and $y \in \{-1, +1\}$ is the label of $x$, where $y = -1$ means that $x$ is recorded from a damaged bridge component and $y = +1$ otherwise. We want to find a hyperplane with maximum margin that separates the points having $y = +1$ from those having $y = -1$. The classification model is a function $f: \mathbb{R}^d \to \{-1, +1\}$ of the form $f(x) = \mathrm{sgn}(w \cdot x - b)$, where ‘$\cdot$’ is the dot product, $\mathrm{sgn}(x) = +1$ if $x > 0$ and $\mathrm{sgn}(x) = -1$ otherwise, and $w$ and $b$ are parameters of the model that are learned in a training process. Given a set of $n$ training samples $\{(x_i, y_i)\}_{i=1}^{n}$, the training process determines the model parameters $w$ and $b$ by minimizing the classification error of the model on the training set while still maximizing the margin. Mathematically, the training process is equivalent to the following minimization problem

$$\min_{w, \xi, b} \; \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{n} \xi_i \quad \text{s.t.} \quad y_i (w \cdot x_i - b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \ldots, n \tag{5}$$

where $\xi_i$ and $C$ are intermediate parameters in the training process; $\xi_i$ is a slack variable controlling how much training error is allowed and $C$ is a variable controlling the balance between $\xi_i$ (the training error) and $w$ (the margin). This minimization problem can be solved using Lagrange multipliers and quadratic programming. The problem is transformed to the dual form

$$\max_{\alpha_1, \ldots, \alpha_n} \; \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \, x_i \cdot x_j \quad \text{s.t.} \quad \sum_{i} \alpha_i y_i = 0, \quad 0 \le \alpha_i \le C, \quad i, j = 1, \ldots, n \tag{6}$$

Once the classification model $f(x) = \mathrm{sgn}(w \cdot x - b) = \mathrm{sgn}\left(\sum_i \alpha_i y_i \, x_i \cdot x - b\right)$ is learned, a health score for a new vibration record, denoted as $x_{new}$, can be generated as $\sum_i \alpha_i y_i \, x_i \cdot x_{new} - b$.

Non-linear classification
The supervised algorithm described above is a linear classifier. To capture non-linearity, the kernel trick can be used by replacing the dot product with a non-linear kernel function.24 A kernel function is a function that corresponds to an inner product in a feature space. This allows the algorithm to fit the hyperplane in a transformed high-dimensional feature space. Some common kernels are linear, polynomial, radial basis function (RBF) and hyperbolic tangent.

One-class SVM
We use one-class SVM7 as an unsupervised approach for anomaly detection. It is assumed that all positive examples share some common properties and form one class, while negative examples can have very different properties with nothing in common. This fits damage detection in SHM, since there may exist many failure patterns and one-class SVM can detect all of them as anomalies. The algorithm used by Schölkopf et al.7 finds a small region containing most of the data points and treats points elsewhere as anomalies. It does so by mapping the data into a feature space using a kernel and then separating them from the origin with maximum margin. Following the settings of supervised SVM learning, the unsupervised learning process can be regarded as the following optimization problem

$$\min_{w, \xi, \rho} \; \frac{1}{2} \|w\|^2 + \frac{1}{\nu n} \sum_{i=1}^{n} \xi_i - \rho \quad \text{s.t.} \quad w \cdot x_i \ge \rho - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \ldots, n \tag{7}$$

where $\nu$ has a similar function to $C$ in supervised SVM and controls the rate of anomalies in the data, and $n$ represents the number of training examples. It is worth noting that the training dataset $\{x_i\}_{i=1}^{n}$ in this case only contains feature vectors and no label information is provided. Once the model is obtained, a health score can be generated in the same way as in supervised learning, as $f(x) = \sum_i \alpha_i \, x_i \cdot x_{new} - \rho$.

Damage detection approach

This section describes our damage detection approach based on dimensionality reduction techniques and SVM. Figure 1 shows a machine learning flowchart, which has been used as a damage detection approach for our work on the Sydney Harbour Bridge. This approach can be generic and may be applied to other types of civil structures. First, excitation events are measured using accelerometers or other possible kinds of sensors, transferred and stored via a data acquisition system. A training set is selected and then features are extracted from the raw acceleration data in the time domain. Next, the feature dimension is reduced using dimensionality reduction techniques. After that, a model is trained using SVM or one-class SVM, depending on the availability of the data for the damaged states. When a new event arrives, feature extraction and dimensionality reduction steps are applied as in the training phase. Then, the event with reduced features is fed to the trained model, and it generates a structural


Figure 1. Machine learning flowchart for damage detection.

health score, which is an SVM decision value described in sections ‘SVM’ and ‘One-class SVM’. A negative value indicates an abnormal condition of the structure. The absolute value of a negative health score represents the severity level of the damage: the more negative the score, the more severe the damage. Since data are obtained from a known location (e.g. a joint in the Sydney Harbour Bridge), this approach can provide information about the change in the state of the structure, as well as the location and severity of the damage.
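How a trained model turns a new event into a health score can be sketched as follows, assuming the model is already available in dual form with an RBF kernel replacing the dot product. The function names, the toy support vectors, the coefficients and the value of `gamma` are all hypothetical and serve only to show the sign convention; in practice they would come from the training step.

```python
import math


def rbf_kernel(u, v, gamma):
    """RBF kernel exp(-gamma * ||u - v||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)


def health_score(x_new, support_vectors, alphas, labels, b, gamma):
    """Decision value sum_i alpha_i * y_i * k(x_i, x_new) - b."""
    total = sum(
        a * y * rbf_kernel(sv, x_new, gamma)
        for sv, a, y in zip(support_vectors, alphas, labels)
    )
    return total - b


def assess(score):
    """Map a health score to a state and a severity (0 when healthy)."""
    if score > 0:
        return ("healthy", 0.0)
    # A non-positive score is treated as abnormal; its magnitude is the severity.
    return ("damaged", abs(score))


# Toy model (made-up values): one healthy and one damaged support vector.
svs = [[0.0, 0.0], [3.0, 3.0]]
alphas = [1.0, 1.0]
labels = [+1, -1]
print(assess(health_score([0.1, 0.0], svs, alphas, labels, b=0.0, gamma=0.5)))
```

For the one-class case the same computation applies with all labels set to +1 and the offset $\rho$ in place of $b$.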

Experimental results Experiments were conducted using two datasets collected from a laboratory-based building structure and from the Sydney Harbour Bridge. After preprocessing and extracting features from the raw data, three dimensionality reduction techniques (namely PCA, PAA and RP) described earlier in section ‘Dimensionality reduction’ were used in order to reduce the feature dimension. Then, damage was detected using supervised SVM and unsupervised one-class SVM in the reduced feature space.

Datasets, preprocessing and feature extraction
Building data. A dataset called bookshelf was obtained from Los Alamos National Laboratory.25 The data are from a three-story building structure constructed of Unistrut columns and aluminium floor plates. The dimensions of the structure and the floor layout are presented in Figure 2. A shaker was used to generate the excitation. It was attached at corner D so that both translational and torsional motions could be excited. There were 24 piezoelectric single-axis accelerometers mounted on the joints of the structure. As shown in Figure 2, two accelerometers were attached to each joint, resulting in eight accelerometers on each floor. A total of 270 vibration events were generated. Each event consisted of 8192 points sampled at 1600 Hz. Among those events, 150 healthy events were created using different shaker input levels and bandwidths to represent different environmental and operational conditions. In the remaining 120 events, there were 30 events with damage in location 1A (i.e. corner A at level 1), 60 events with damage in location 3C and 30 events with damage at both locations (i.e. 1A and 3C). The damage was introduced by loosening the bolts and then hand tightening them, or by removing bolts and brackets at the joints, allowing the plate to move freely relative to the column. For every vibration event, the data from each accelerometer were normalized by subtracting the mean and then dividing by the standard deviation. This gives every event zero mean and unit standard deviation. Then, the data were converted to the frequency domain using the Fourier transform. The difference


Figure 2. Three-story building and floor layout (image is obtained from the data description25).

between two sensors mounted on one joint (there are 12 joints in total across the three stories) in the frequency domain was taken as a feature vector. Therefore, a total of 270 × 12 = 3240 events were obtained from all locations, each with a feature vector of 8192 elements.
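The preprocessing for one joint can be sketched as below. Several details are assumptions, since the text does not specify them: the population standard deviation is used for normalization, the "difference in frequency domain" is taken between magnitude spectra, and the full two-sided spectrum is kept so that an 8192-point signal yields 8192 feature elements. The naive $O(n^2)$ transform is written out only to keep the sketch free of dependencies; an FFT routine would be used on real data.

```python
import cmath
import statistics


def zscore(signal):
    """Normalize a signal to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(signal)
    std = statistics.pstdev(signal)  # assumed non-zero, i.e. not a constant signal
    return [(v - mean) / std for v in signal]


def dft_magnitude(signal):
    """Magnitude of the discrete Fourier transform (naive, for illustration)."""
    n = len(signal)
    return [
        abs(sum(v * cmath.exp(-2j * cmath.pi * f * t / n)
                for t, v in enumerate(signal)))
        for f in range(n)
    ]


def joint_feature(sensor_a, sensor_b):
    """Feature vector of one joint: difference of the two normalized spectra."""
    spec_a = dft_magnitude(zscore(sensor_a))
    spec_b = dft_magnitude(zscore(sensor_b))
    return [a - b for a, b in zip(spec_a, spec_b)]
```

Two sensors that record the same motion up to gain and offset produce a feature vector close to zero, which is the behaviour expected at a healthy joint.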

Bridge data. The Sydney Harbour Bridge, opened in 1932, is one of the major bridges in Australia. As the bridge is ageing, it is critical to ensure it stays structurally healthy. There are 800 jack arches on the underside of the deck of the bus lane (lane 7) that need to be monitored, as shown in Figure 3(a). The arches are made of steel-reinforced concrete; cracks may therefore initiate along the joints of the arches due to the ageing of the structure and the heavy loading of buses on the deck. It is critical to detect any deterioration in the state of the arches as soon as it occurs in order to schedule the required inspection and repair. Vibration data caused by passing vehicles were recorded by three-axis accelerometers installed under the deck of lane 7. Each joint was instrumented with a sensor node connected to three accelerometers mounted on the joint in left, middle and right positions, as shown in Figure 3(b). For this case study, only six instrumented joints were considered (named 1 to 6), as illustrated in Figure 4. The data were obtained in the period from early August until late October 2012. A known crack existed in joint 4, while the other joints were in good condition.

An event is defined as the time period during which a motor vehicle drives across an instrumented joint. An event is normally triggered when the acceleration value exceeds a pre-set threshold. After triggering, the node records for a period of 1.5 s at a sampling rate of 400 Hz. Each event contains 100 samples from before the trigger and 500 samples collected during and after the event, i.e. 600 samples in total. The instantaneous acceleration at the $i$th sample is denoted as $A_i$, and the rest vector $A_r$ is the average of the three-axis readings $(x, y, z)$ over the first 100 samples. Two metrics are extracted from the three-axis readings: $V_1 = |A_i| - |A_r|$ and $V_2 = |A_i - A_r|$. They are independent of the accelerometer orientations and the magnitude of the rest vectors. Figure 5(a) and (b) show plots of the normalized vectors $V_1$ obtained from the middle accelerometers of the healthy joint number 3 and the damaged joint number 4, respectively. Note that not all the events had the waveform shown in Figure 5. Some were triggered within the first 100 samples. Some had many peaks because vehicles were close to each other when passing the joints. Such events have irregular patterns and can prevent the learner from recognizing the correct patterns for healthy and damaged events. Therefore, all those events were filtered out by detecting and counting the number of peaks within the signals. Signals with no peak greater than 10, with peaks in the first 100 samples or with more than four peaks were removed. Finally, 6370 events were used for the analysis.
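The two metrics can be computed per sample as in the following sketch. The helper names are hypothetical, and each event is assumed to be a list of `(x, y, z)` tuples whose first `n_rest` entries were recorded before the trigger.

```python
import math


def norm(vec):
    """Euclidean length of a three-axis reading."""
    return math.sqrt(sum(c * c for c in vec))


def rest_vector(samples, n_rest=100):
    """Average (x, y, z) reading over the first n_rest samples."""
    rest = samples[:n_rest]
    return tuple(sum(s[axis] for s in rest) / len(rest) for axis in range(3))


def event_metrics(samples, n_rest=100):
    """Return the series V1 = |A_i| - |A_r| and V2 = |A_i - A_r|."""
    a_r = rest_vector(samples, n_rest)
    rest_norm = norm(a_r)
    v1 = [norm(a) - rest_norm for a in samples]
    v2 = [norm(tuple(c - cr for c, cr in zip(a, a_r))) for a in samples]
    return v1, v2
```

Both series depend only on vector lengths, so applying the same rotation to every reading, which is what a differently oriented accelerometer amounts to, leaves them unchanged.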


Figure 3. Sensor nodes installed on the Sydney Harbour Bridge: (a) lane 7 – the second lane from left and (b) sensor node in a joint with three accelerometers.

Figure 4. Schematic of the evaluated joints with cracking.

Figure 5. Normalized vibration signals (V1) of (a) healthy and (b) damaged events selected from the bridge data.


Figure 6. RP of building dataset: (a) computational time and (b) accuracy. RP: random projection.

Similar to the feature extraction for the building data, the data of every vibration event from each accelerometer were normalized to have zero mean and unit standard deviation and then converted to the frequency domain using the Fourier transform. For each event, the difference between $V_1$ in the frequency domain was taken for each pair of the three accelerometers. Since there are three pairs of accelerometers for each joint and each signal has 600 samples, there were $600 \times 3 = 1800$ feature elements. The same procedure was applied to $V_2$. In total, the feature vector of each event from each joint had 3600 elements. Intuitively, if the joint is healthy, the three accelerometers are expected to move together. If the joint is damaged (e.g. there is a crack), the three accelerometers would move differently. These behaviours are reflected in the differences between the signals. Also, taking the difference can eliminate the effect of operational and environmental conditions on the vibration signals. One of the main challenges in SHM is to extract features which are sensitive to damage but not to operational and environmental conditions.1 For supervised learning and the evaluation of the results, events from joint 4 were labelled as damaged and events from all the other joints were labelled as healthy.
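The assembly of the 3600-element vector can be sketched as follows, assuming the three per-accelerometer spectra of $V_1$ and of $V_2$ have already been computed and each has 600 elements. The function name and the pair ordering (left-middle, left-right, middle-right) are hypothetical choices; the text does not state an order.

```python
from itertools import combinations


def joint_feature_vector(v1_spectra, v2_spectra):
    """Concatenate pairwise spectrum differences for V1, then for V2.

    Each argument holds three equal-length spectra, one per accelerometer
    (left, middle, right).
    """
    features = []
    for spectra in (v1_spectra, v2_spectra):
        # Three accelerometers give three unordered pairs.
        for spec_a, spec_b in combinations(spectra, 2):
            features.extend(a - b for a, b in zip(spec_a, spec_b))
    return features
```

With 600-element spectra this yields 3 × 600 elements for each metric and 3600 in total, matching the count given above.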

Effectiveness of RP
Experiments were conducted to validate the effectiveness of RP in maintaining the pairwise distances. Figures 6 and 7 show, for different values of $k$, the computational time required to compute the average pairwise Euclidean distance of all data points in the two datasets described in section ‘Datasets, preprocessing and feature extraction’, together with the corresponding errors between the average pairwise distances in the original data space and in the projected space. For the building dataset, $k$ was selected as (2, 4, 8, 16, 32, 64, 128, 256, 512, 1024), while for the bridge dataset $k$ was chosen to be (2, 5, 10, 20, 40, 50, 100, 200, 400, 600). The sparsity value $s$ in Lemma 1 was set to 1 in the experiment. The value noted in Figures 6(a) and 7(a) is the computational time in the original space, which is considerably longer than that using RP. The results show that RP can significantly reduce the computational time while maintaining the accuracy of the distance values. For $k$ as small as 8 and 10 in the building and bridge datasets, respectively, the errors were less than 5%, while the computation was 103 and 49 times faster, respectively, than without RP. The effectiveness of RP in maintaining pairwise distances in the reduced dimensional space follows from Lemma 1, which guarantees the bound of the approximation. Furthermore, the results show that $k$ can be quite small, since the error curve decreases only slightly once $k$ reaches a certain value.
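A small-scale version of this check can be sketched as follows. It is not the authors' experiment: the data are synthetic Gaussian vectors, the function names are hypothetical, and the projection is rescaled by $1/\sqrt{k}$, an assumed normalisation that makes projected and original distances directly comparable when $s = 1$.

```python
import math
import random


def avg_pairwise_distance(X):
    """Average Euclidean distance over all unordered pairs of rows."""
    n = len(X)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += math.dist(X[i], X[j])
    return total / (n * (n - 1) / 2)


def random_projection(X, k, rng):
    """Project onto k dimensions with +1/-1 entries (s = 1), scaled by 1/sqrt(k)."""
    d = len(X[0])
    R = [[rng.choice((-1.0, 1.0)) for _ in range(k)] for _ in range(d)]
    scale = 1.0 / math.sqrt(k)
    return [
        [scale * sum(x[l] * R[l][j] for l in range(d)) for j in range(k)]
        for x in X
    ]


def relative_error(X, k, rng):
    """Relative error of the average pairwise distance after projection."""
    original = avg_pairwise_distance(X)
    projected = avg_pairwise_distance(random_projection(X, k, rng))
    return abs(projected - original) / original


rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(256)] for _ in range(20)]
for k in (8, 32, 128):
    print(k, round(relative_error(X, k, rng), 3))
```

The cost of each distance drops from $O(d)$ to $O(k)$ after projection, which is where the reported speed-up in the distance computation comes from.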

Effectiveness of dimensionality reduction and SVM In this experiment, the approach described in section ‘Damage detection approach’ was used for dimensionality reduction and damage detection. Three different dimensionality reduction techniques described in section ‘Dimensionality reduction’ (namely, PCA, PAA, and RP) were used. Then, SVM and one-class SVM were used as supervised and unsupervised techniques


Figure 7. RP of bridge dataset: (a) computational time and (b) accuracy. RP: random projection.

Figure 8. (a) Computational time and (b) detection accuracy of supervised SVM of building data. RP: random projection; SVM: support vector machine; PCA: principal component analysis; PAA: piecewise aggregate approximation.

for damage detection. Both used the RBF kernel $k(u, v) = \exp(-\gamma \|u - v\|^2)$. Test events with positive decision values indicate a healthy status, and those with negative values represent a damaged condition of the structure. For simplicity, each combination of a dimensionality reduction technique and SVM is referred to by the name of the dimensionality reduction technique. Performance in terms of computational time and detection accuracy is reported and compared. Different values of $k$ were used, as in the previous section. All reported results are averages over fivefold cross-validation. Accuracy, which ranges between 0 and 1, is the ratio of the number of test events that were predicted correctly to the total number of test events.
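The evaluation protocol can be sketched as below. The text does not say how events were assigned to folds, so the seeded shuffle is an assumption, and `fit` and `predict` are hypothetical callables standing in for whichever classifier is being evaluated.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of test events whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def kfold_indices(n, n_folds=5, seed=0):
    """Yield (train, test) index lists for n_folds-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    for f in range(n_folds):
        train = [i for g in range(n_folds) if g != f for i in folds[g]]
        yield train, folds[f]


def cross_validated_accuracy(fit, predict, X, y, n_folds=5):
    """Average test accuracy over the folds."""
    scores = []
    for train, test in kfold_indices(len(X), n_folds):
        model = fit([X[i] for i in train], [y[i] for i in train])
        preds = [predict(model, X[i]) for i in test]
        scores.append(accuracy([y[i] for i in test], preds))
    return sum(scores) / len(scores)
```

Each event is used for testing exactly once, so the averaged accuracy reflects performance on events the model has not seen during training.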

Supervised SVM. Figures 8 and 9 show the computational time and detection accuracy using the supervised SVM in the original feature space and in the reduced dimensional spaces based on the three methods for both datasets. The numbers noted in the figures are the time and detection accuracy of the method without using dimensionality reduction techniques. SVM combined with RP achieved the fastest time in most of the cases. On


Figure 9. (a) Computational time and (b) detection accuracy of supervised SVM of bridge data. RP: random projection; SVM: support vector machine; PCA: principal component analysis; PAA: piecewise aggregate approximation.

Figure 10. (a) Computational time and (b) detection accuracy of unsupervised SVM of building data. RP: random projection; SVM: support vector machine; PCA: principal component analysis; PAA: piecewise aggregate approximation.

the other hand, PCA was the slowest among the three methods. The use of RP and PCA resulted in very high accuracies even with a small value of $k$ for both datasets. For the building data, PAA required a larger reduced dimension to maintain the accuracy level of the method in the original space. Also, Figure 8(b) shows that the detection results in the reduced dimensional space can improve on those in the original space. This is a reasonable result, since the original data may contain noise which was reduced when the dimensionality reduction techniques were applied. For the building data with $k = 32$, the RP method achieved an accuracy of 0.97 (compared with 0.93 without dimensionality reduction) and was 207 times faster. For the bridge data with $k = 100$, RP obtained a similar level of accuracy while being 36 times faster.

Unsupervised SVM. Figures 10 and 11 show the computational time and detection accuracy of the unsupervised SVM (one-class SVM) in the original feature space and in the reduced dimensional spaces computed using the three methods for both datasets. SVM combined with RP and SVM combined with PAA had similar running times and were the fastest. PCA was the slowest of the three methods. The detection accuracies were lower than those of the supervised SVM, which is reasonable for unsupervised learning since no label information is used in training. As in the supervised case, RP and PCA achieved high accuracies with a small value of k, whereas PAA required a larger value of k to maintain an accuracy similar to that in the original space. For the building data with k = 32, the RP method achieved an accuracy of 0.71 (compared with 0.70 without dimensionality reduction) while being 175 times faster. For the bridge data with k = 100, RP obtained similar accuracy while being 29 times faster.

Figure 11. (a) Computational time and (b) detection accuracy of unsupervised SVM of bridge data. RP: random projection; SVM: support vector machine; PCA: principal component analysis; PAA: piecewise aggregate approximation.

Overall, the experimental results for supervised and unsupervised learning show that RP strikes a balance between the two goals: it significantly reduces the computational time while maintaining the detection accuracy. PCA achieved a similar level of accuracy but with a longer running time. PAA, on the other hand, had a running time comparable to that of RP but a lower detection accuracy. Moreover, RP is theoretically supported by Lemma 1 in maintaining pairwise distances. Because the SVM in this study uses an RBF kernel, which relies on the distance between data points, this explains the effectiveness of combining RP with SVM.
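The distance argument can be checked numerically. The sketch below, in plain Python, draws a Gaussian random projection matrix (an assumed construction; the matrix used in the study may differ), projects synthetic points from d = 400 to k = 200 dimensions, and compares squared pairwise distances and RBF kernel values before and after projection. All names, sizes, the seed, and the choice gamma = 1/d are hypothetical and chosen only for illustration.

```python
import math
import random


def random_projection_matrix(d, k, rng):
    """k x d matrix with i.i.d. N(0, 1/k) entries (one common construction)."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def project(matrix, x):
    """Map a d-dimensional point to k dimensions via matrix-vector product."""
    return [sum(r * v for r, v in zip(row, x)) for row in matrix]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def rbf(a, b, gamma):
    """RBF kernel, which depends on the points only through their distance."""
    return math.exp(-gamma * sq_dist(a, b))


rng = random.Random(0)            # fixed seed, for repeatability only
d, k, n = 400, 200, 6             # hypothetical sizes, not the paper's
points = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
R = random_projection_matrix(d, k, rng)
projected = [project(R, x) for x in points]

pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]

# Ratio of projected to original squared distance for every pair.
ratios = [sq_dist(projected[i], projected[j]) / sq_dist(points[i], points[j])
          for i, j in pairs]

# Largest change in the RBF kernel value caused by the projection.
gamma = 1.0 / d
max_kernel_gap = max(
    abs(rbf(points[i], points[j], gamma) - rbf(projected[i], projected[j], gamma))
    for i, j in pairs
)

print("distance ratio range:", min(ratios), max(ratios))
print("largest kernel change:", max_kernel_gap)
```

Under these assumptions the distance ratios should cluster around 1 and the kernel values should change correspondingly little, which is consistent with the explanation given for why RP pairs well with an RBF-kernel SVM.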

Conclusion

This work presented a robust dimensionality reduction and damage detection methodology using a machine learning approach. Both supervised and unsupervised methods, using SVM and one-class SVM, were studied for damage detection. RP, PCA and PAA were used as different methods of dimensionality reduction for high-dimensional sensor vibration data of civil structures. It was found that combining RP with SVM considerably reduced the computational time while maintaining the detection accuracy. This approach combines the strong regularization of SVM in classification with the efficiency and effectiveness of RP in computing the reduced dimensional space and maintaining pairwise distances. Experimental results using data from a laboratory-based building structure and the Sydney Harbour Bridge showed that RP combined with SVM outperformed the popular dimensionality reduction techniques PCA and PAA. With RP and SVM, the computation was more than 200 times faster than without dimensionality reduction, while similar detection accuracies were still obtained.

Acknowledgements

The authors would like to thank Mr Peter Runcie from NICTA for his valuable support and suggestions for this work.

Declaration of conflicting interests

The authors declare that there is no conflict of interest.

Funding

This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.

