Deep neural networks: A promising tool for fault characteristic mining and intelligent diagnosis of rotating machinery with massive data

Feng Jia, Yaguo Lei*, Jing Lin, Xin Zhou, Na Lu

State Key Laboratory for Manufacturing Systems Engineering, Xi'an Jiaotong University, No. 28 Xianning West Road, Xi'an 710049, China

Article info

Article history:
Received 28 January 2015
Received in revised form 5 September 2015
Accepted 26 October 2015
Available online 18 November 2015

Keywords:

Deep learning

Deep neural networks

Intelligent fault diagnosis

Rotating machinery

Massive data

Abstract

To analyze massive data and automatically provide accurate diagnosis results, numerous studies have been conducted on intelligent fault diagnosis of rotating machinery. Among these studies, methods based on artificial neural networks (ANNs) are commonly used; they employ signal processing techniques to extract features and then input the features to ANNs to classify faults. Although these methods work in intelligent fault diagnosis of rotating machinery, they still have two deficiencies. (1) The features are extracted manually, depending on much prior knowledge about signal processing techniques and diagnostic expertise. In addition, these manual features are extracted according to a specific diagnosis issue and are probably unsuitable for other issues. (2) The ANNs adopted in these methods have shallow architectures, which limits their capacity to learn the complex non-linear relationships in fault diagnosis issues. As a breakthrough in artificial intelligence, deep learning holds the potential to overcome the aforementioned deficiencies. Through deep learning, deep neural networks (DNNs) with deep architectures, instead of shallow ones, could be established to mine the useful information from raw data and approximate complex non-linear functions. Based on DNNs, a novel method is proposed in this paper to overcome the deficiencies of the aforementioned intelligent diagnosis methods. The effectiveness of the proposed method is validated using datasets from rolling element bearings and planetary gearboxes. These datasets contain massive measured signals involving different health conditions under various operating conditions. The diagnosis results show that the proposed method is able not only to adaptively mine available fault characteristics from the measured signals but also to obtain superior diagnosis accuracy compared with the existing methods.

Mechanical Systems and Signal Processing 72-73 (2016) 303–315
Contents lists available at ScienceDirect
journal homepage: www.elsevier/locate/ymssp
/10.ssp.2015.10.025
* Corresponding author. E-mail address: yaguolei@mail.xjtu.edu (Y. Lei).

1. Introduction

To fully inspect the health conditions of rotating machinery, condition monitoring systems are used to collect real-time data, and massive data are therefore acquired after long-term operation of the machines [1]. As the data are generally collected faster than diagnosticians can analyze them [2], there is an urgent need for diagnosis methods that can effectively analyze massive data and automatically provide accurate diagnosis results. Such methods are called intelligent fault diagnosis methods, in which artificial intelligence techniques, such as artificial neural networks (ANNs), support vector machines (SVM) and fuzzy inference, are used to distinguish machinery health conditions [3–5]. Based on the results produced by the intelligent diagnosis methods, it is possible to take appropriate maintenance actions and ensure healthy operation of the machines [6]. Accordingly, intelligent fault diagnosis methods have been widely investigated and applied in the field of fault diagnosis of rotating machinery [7]. Samanta [8] extracted time-domain features and employed three optimized neural networks to detect pump faults. In addition, Samanta et al. [9] utilized time-domain features to characterize bearing health conditions and employed ANNs and SVM to diagnose bearing faults. Tran et al. [10] extracted statistical features to represent the health conditions of induction motors and then used a decision tree and an adaptive neuro-fuzzy inference system (ANFIS) to distinguish the faults. Moreover, Tran et al. [11] calculated features from thermal imaging based on bi-dimensional empirical mode decomposition and then input selected features into a relevance vector machine (RVM) for fault classification. Lei et al. [12] proposed two features to characterize the health conditions of planetary gearboxes and applied ANFIS to recognize these health conditions. Widodo et al. [13] calculated statistical features from the measured signals and used RVM and SVM to diagnose bearing faults. Lai et al. [14] introduced cumulants as input features and used a radial basis function network as the fault classifier. Bin et al. [15] presented a method that uses wavelet packets-empirical mode decomposition for feature extraction and a multi-layer perceptron network for fault classification.

Through the literature review, we notice that ANNs are among the most commonly used classifiers in intelligent fault diagnosis methods, which generally include two main steps: fault feature extraction using signal processing techniques and fault classification using ANN classifiers. Feature extraction maps the measured signals onto representative features that characterize the health conditions of machinery, and fault classification distinguishes the health conditions based on the extracted features. Thanks to the representative features from the measured signals and the adaptive learning capability of ANNs, the ANN-based methods are supposed to replace diagnosticians in making decisions and to work well in intelligent fault diagnosis [7]. The ANN-based methods reported in the literature, however, have two obvious deficiencies:

(1) The features input into classifiers are extracted and selected by diagnosticians from the measured signals, largely depending on prior knowledge about signal processing techniques and diagnostic expertise. In addition, the features are selected according to a specific diagnosis issue and are probably unsuitable for other issues. Thus it is necessary to adaptively mine the characteristics hidden in the measured signals to reflect the different health conditions of machinery, instead of extracting and selecting features manually.

(2) The ANNs commonly adopted in intelligent fault diagnosis of rotating machinery have shallow architectures, which means that only one hidden layer is included in an ANN architecture, like the ANNs in Refs. [8,9,14,15]. Such simple architectures limit the capacity of ANNs to learn the complex non-linear relationships in fault diagnosis issues. Thus it is necessary to establish a deep architecture network for distinguishing the health conditions of machinery.

Deep learning [16] holds the potential to overcome the aforementioned deficiencies in current intelligent diagnosis methods. It refers to a class of machine learning techniques in which many layers of information processing stages in deep architectures are exploited for pattern classification and other tasks [17]. Using deep learning, deep neural networks (DNNs) with deep architectures can be established. Owing to their deep architectures, DNNs are able to adaptively capture the representative information from raw data through multiple non-linear transformations and to approximate complex non-linear functions with a small error. Since the idea of deep learning appeared in Science, it has attracted a lot of attention from researchers in different fields [18]. Dahl et al. [19] proposed a pre-trained deep neural network hidden Markov model for large-vocabulary speech recognition and obtained an accuracy improvement compared with traditional models. Krizhevsky et al. [20] developed a DNN-based method for a large-scale visual recognition challenge involving millions of labeled images and obtained the best result. Baldi et al. [21] utilized deep learning methods to search for exotic particles in high-energy physics, and the results demonstrated that the methods can improve the searching ability of colliders. The aforementioned applications show that deep learning is a promising tool for dealing with massive data. However, it has attracted little attention in the field of fault diagnosis. Based on the Teager–Kaiser energy operator and a deep belief network trained by deep learning, Tran et al. [22] proposed a new method for diagnosing faults of reciprocating compressor valves. In this method, they treated the deep belief network as a classifier and still manually extracted the features input to the classifier, which ignored the ability of the network to mine fault characteristics.

Based on DNNs trained through deep learning, this paper proposes a novel intelligent diagnosis method to overcome the two deficiencies of the ANN-based methods in fault diagnosis of rotating machinery. In this method, DNNs are utilized to implement both fault feature extraction and intelligent diagnosis. The DNNs are first pre-trained by unsupervised layer-by-layer learning and then fine-tuned with a supervised algorithm, where the unsupervised process helps mine the fault characteristics and the supervised process helps construct the discriminative fault characteristics for classification [23]. The merits of the proposed method are summarized as follows. (1) It is able to adaptively mine fault characteristics from the measured signals for various diagnosis issues. (2) The method is good at establishing the non-linear mapping relationship between the different health conditions of machinery and the corresponding measured signals. Therefore, the proposed method is expected to obtain higher diagnosis accuracy than the methods based on shallow ANNs.

The rest of this paper is organized as follows. Section 2 briefly introduces the theoretical background of DNNs. Section 3 is dedicated to a description of the proposed intelligent diagnosis method. In Section 4, the effectiveness of the proposed method is validated using four rolling element bearing datasets and a planetary gearbox dataset. The bearing datasets contain thousands of signals with different fault categories and severities under various operating loads, and the gearbox dataset includes tens of thousands of signals with different fault modes and locations under various operating conditions, such as different rotating speeds and loads. In addition, the proposed method is compared with several intelligent methods using the same bearing datasets in this section. Conclusions are drawn in Section 5.

2. A brief introduction to DNNs

DNNs have deep architectures containing multiple hidden layers, and each hidden layer conducts a non-linear transformation from the previous layer to the next one [18,24]. Through deep learning addressed by Hinton et al. [16], DNNs are trained according to the following two main procedures: (1) Pre-train the DNNs layer by layer with unsupervised techniques, like autoencoders. (2) Further fine-tune the DNNs with the back propagation (BP) algorithm for classification.

2.1. Autoencoders

An autoencoder is a type of unsupervised neural network with three layers [24,25], and the output target of the autoencoder is the input data. As depicted in Fig. 1, the autoencoder comprises two parts, an encoder network and a decoder network. The encoder network transforms the input data from a high-dimensional space into codes in a low-dimensional space, and the decoder network reconstructs the inputs from the corresponding codes.

The encoder network is explicitly defined as an encoding function denoted by $f_\theta$ [24]. This function is called the encoder. For each measured signal $x^m$ from a dataset $\{x^m\}_{m=1}^{M}$ of rotating machinery, we define

$$h^m = f_\theta(x^m) \tag{1}$$

where $h^m$ is the encode vector obtained from $x^m$. The decoder network is defined as a reconstruction function denoted by $g_{\theta'}$, namely the decoder. It maps $h^m$ from the low-dimensional space back into the high-dimensional space, producing a reconstruction

$$\hat{x}^m = g_{\theta'}(h^m) \tag{2}$$

The parameter sets of the encoder and decoder are learned simultaneously on the task of reconstructing the original input as well as possible, attempting to incur the lowest possible reconstruction error $L(x, \hat{x})$ over the $M$ training examples, where $L(x, \hat{x})$ is a loss function that measures the discrepancy between $x$ and $\hat{x}$ [24]. In summary, the autoencoder training aims to find the parameter sets $\theta$ and $\theta'$ minimizing the reconstruction error:

$$\phi_{AE}(\theta, \theta') = \frac{1}{M} \sum_{m=1}^{M} L\big(x^m, g_{\theta'}(f_\theta(x^m))\big) \tag{3}$$

The commonly used forms for the encoder and decoder are affine mappings [26], optionally followed by a non-linearity:

$$f_\theta(x) = s_f(Wx + b) \tag{4}$$

$$g_{\theta'}(x) = s_g(W^T x + d) \tag{5}$$

$$L(x, \hat{x}) = \lVert x - \hat{x} \rVert^2 \tag{6}$$

where $s_f$ and $s_g$ are the encoder and decoder activation functions, respectively. Thus, the parameter sets of the autoencoder are $\theta = \{W, b\}$ and $\theta' = \{W^T, d\}$, where $b$ and $d$ are bias vectors, and $W$ and $W^T$ are the weight matrices.
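As a rough illustration of Eqs. (3)–(6), the following Python sketch evaluates the encoder, the decoder and the mean reconstruction error of a tied-weight autoencoder. The sigmoid is assumed here for both $s_f$ and $s_g$, and the layer sizes, random weights and random data are placeholders chosen only for illustration; none of these choices is taken from the paper.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def encode(W, b, x):
    """Eq. (4): h = s_f(W x + b), with a sigmoid assumed for s_f."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bj)
            for row, bj in zip(W, b)]


def decode(W, d, h):
    """Eq. (5): x_hat = s_g(W^T h + d), reusing the transposed encoder weights."""
    n_in = len(W[0])
    return [sigmoid(sum(W[j][i] * h[j] for j in range(len(h))) + d[i])
            for i in range(n_in)]


def reconstruction_loss(x, x_hat):
    """Eq. (6): squared Euclidean distance between x and its reconstruction."""
    return sum((a - c) ** 2 for a, c in zip(x, x_hat))


def autoencoder_objective(W, b, d, data):
    """Eq. (3): mean reconstruction loss over the M training examples."""
    return sum(reconstruction_loss(x, decode(W, d, encode(W, b, x)))
               for x in data) / len(data)


# Illustrative sizes and random values (assumptions, not the paper's settings).
rng = random.Random(0)
n_in, n_hidden = 6, 3
W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
b = [0.0] * n_hidden
d = [0.0] * n_in
data = [[rng.random() for _ in range(n_in)] for _ in range(4)]
print(autoencoder_objective(W, b, d, data))
```

Training then amounts to adjusting `W`, `b` and `d` so that the printed objective decreases.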

Fig. 1. Architectural graph of an autoencoder (input data, encoder, decoder and input data reconstruction).

2.2. Pre-training and fine-tuning

N autoencoders could be stacked to pre-train an N-hidden-layer DNN. Given an input signal $x^m$, the input layer and the first hidden layer of the DNN are regarded as the encoder network of the first autoencoder. After the first autoencoder is trained by minimizing the reconstruction error in Eq. (3), the trained parameter set $\theta_1$ of the encoder network is used to initialize the first hidden layer of the DNN, and the first encode vector $h_1^m$ of $x^m$ is calculated as follows:

$$h_1^m = f_{\theta_1}(x^m) \tag{7}$$

Then, with the encode vector $h_1^m$ as the input data, the first hidden layer and the second hidden layer of the DNN are regarded as the encoder network of the second autoencoder. Correspondingly, the second hidden layer of the DNN is initialized by the second trained autoencoder. The process is conducted in sequence until the $N$th autoencoder is trained for initializing the final hidden layer of the DNN, and the $N$th encode vector $h_N^m$ of $x^m$ is calculated as

$$h_N^m = f_{\theta_N}(h_{N-1}^m) \tag{8}$$

where $\theta_N$ is the parameter set of the $N$th autoencoder.

In this way, through training N stacked autoencoders, all the hidden layers of the DNN are pre-trained. This pre-training process has been shown to yield significantly better local minima than random initialization of the DNN and helps achieve better generalization in classification tasks [26,27], as well as in fault diagnosis of rotating machinery.
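The layer-by-layer procedure of Eqs. (7) and (8) might be sketched as follows. This is a minimal sketch, assuming sigmoid activations, tied weights, plain per-sample gradient descent and arbitrary layer sizes; the function names `train_autoencoder` and `pretrain_stack`, the hyperparameters and the random data are hypothetical and are not the settings used in this paper.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def encode(W, b, x):
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bj)
            for row, bj in zip(W, b)]


def train_autoencoder(data, n_hidden, epochs=200, lr=0.5, seed=0):
    """Fit one tied-weight sigmoid autoencoder by per-sample gradient descent."""
    rng = random.Random(seed)
    n_in = len(data[0])
    W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
    b, d = [0.0] * n_hidden, [0.0] * n_in
    for _ in range(epochs):
        for x in data:
            h = encode(W, b, x)
            x_hat = [sigmoid(sum(W[j][i] * h[j] for j in range(n_hidden)) + d[i])
                     for i in range(n_in)]
            # Backpropagate the squared reconstruction error of Eq. (6).
            dz = [2.0 * (x_hat[i] - x[i]) * x_hat[i] * (1.0 - x_hat[i])
                  for i in range(n_in)]
            da = [sum(W[j][i] * dz[i] for i in range(n_in)) * h[j] * (1.0 - h[j])
                  for j in range(n_hidden)]
            for j in range(n_hidden):
                for i in range(n_in):
                    # Tied weights: W collects encoder and decoder gradients.
                    W[j][i] -= lr * (da[j] * x[i] + h[j] * dz[i])
                b[j] -= lr * da[j]
            for i in range(n_in):
                d[i] -= lr * dz[i]
    return W, b


def pretrain_stack(data, hidden_sizes):
    """Greedy pre-training: autoencoder k is trained on the codes of layer k-1."""
    params, inputs = [], data
    for n_hidden in hidden_sizes:
        W, b = train_autoencoder(inputs, n_hidden)
        params.append((W, b))                        # initializes hidden layer k
        inputs = [encode(W, b, x) for x in inputs]   # Eqs. (7) and (8)
    return params, inputs


# Random stand-in "spectra" (assumption): 10 samples with 8 values each.
rng = random.Random(1)
spectra = [[rng.random() for _ in range(8)] for _ in range(10)]
params, codes = pretrain_stack(spectra, hidden_sizes=[5, 3])
print(len(params), len(codes[0]))
```

Each entry of `params` would initialize one hidden layer of the DNN before supervised fine-tuning.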

After the DNN is pre-trained, the fine-tuning process is utilized in the next step of the DNN training. The output layer of the DNN is employed to contain the output targets for classification tasks. The output of the DNN calculated from the input signal $x^m$ is

$$y^m = f_{\theta_{N+1}}(h_N^m) \tag{9}$$

where $\theta_{N+1}$ is the parameter set of the output layer. In order to approximate the output target properly, the BP algorithm is utilized to minimize the error of the output by adjusting the parameters in the DNN backwards [28]. Supposing that the output target of $x^m$ is $d^m$, the error criterion is described as

$$\phi_{DNN}(\Theta) = \frac{1}{M} \sum_{m=1}^{M} L(y^m, d^m) \tag{10}$$

where $\Theta = \{\theta_1, \theta_2, \cdots, \theta_{N+1}\}$. The parameter set $\Theta$ can be updated as follows:

$$\Theta = \Theta - \eta \frac{\partial \phi_{DNN}(\Theta)}{\partial \Theta} \tag{11}$$

where $\eta$ is the learning rate of the fine-tuning process, which is introduced to guarantee convergence in the update procedure [29].

Fig. 2. Flowchart of the proposed method.
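The update in Eq. (11) can be illustrated on the output layer alone. The sketch below assumes a sigmoid output layer, a squared-error loss for $L$, fixed codes $h_N$ standing in for pre-trained hidden layers, and made-up toy labels; full fine-tuning would propagate the same gradient step back through every hidden layer.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def output_layer(V, c, h):
    """Eq. (9) for a sigmoid output layer: y = s(V h + c)."""
    return [sigmoid(sum(v * hi for v, hi in zip(row, h)) + ck)
            for row, ck in zip(V, c)]


def mean_error(V, c, codes, targets):
    """Eq. (10) with the assumed squared-error loss L(y, d) = ||y - d||^2."""
    total = 0.0
    for h, d in zip(codes, targets):
        y = output_layer(V, c, h)
        total += sum((yk - dk) ** 2 for yk, dk in zip(y, d))
    return total / len(codes)


def gradient_step(V, c, codes, targets, eta):
    """Eq. (11): move the parameters against the gradient of Eq. (10)."""
    M = len(codes)
    grad_V = [[0.0] * len(V[0]) for _ in V]
    grad_c = [0.0] * len(c)
    for h, d in zip(codes, targets):
        y = output_layer(V, c, h)
        for k in range(len(c)):
            delta = 2.0 * (y[k] - d[k]) * y[k] * (1.0 - y[k]) / M
            grad_c[k] += delta
            for j in range(len(h)):
                grad_V[k][j] += delta * h[j]
    for k in range(len(c)):
        c[k] -= eta * grad_c[k]
        for j in range(len(V[0])):
            V[k][j] -= eta * grad_V[k][j]


# Toy codes and one-hot labels (assumptions made only for this illustration).
rng = random.Random(0)
codes = [[rng.random() for _ in range(3)] for _ in range(6)]
targets = [[1.0, 0.0] if h[0] > 0.5 else [0.0, 1.0] for h in codes]
V = [[rng.uniform(-0.1, 0.1) for _ in range(3)] for _ in range(2)]
c = [0.0, 0.0]
before = mean_error(V, c, codes, targets)
for _ in range(200):
    gradient_step(V, c, codes, targets, eta=0.5)
after = mean_error(V, c, codes, targets)
print(before, after)
```

The learning rate `eta` plays the role of $\eta$: too large a value can make the error oscillate, while a small value slows convergence.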

3. DNN-based intelligent diagnosis method

Based on DNNs, this study proposes a novel intelligent fault diagnosis method that adaptively mines the fault characteristics from raw signals of rotating machinery and automatically classifies machinery health conditions with these fault characteristics. The raw signals refer to the measured signals in the frequency domain, i.e., frequency spectra. The main reason for using frequency spectra is that the frequency spectra of rotating machinery show how their constitutive components are distributed over discrete frequencies and may provide clear information about the health conditions of rotating machinery [30].
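One way to obtain such a frequency spectrum from a sampled vibration signal is sketched below. The transform settings are not specified in this passage, so the sampling rate, the signal length and the 30 Hz test tone are assumptions, and a direct discrete Fourier transform is used only to keep the example dependency-free; an FFT routine would normally be preferred for long signals.

```python
import cmath
import math


def amplitude_spectrum(signal):
    """Single-sided amplitude spectrum of a real signal via a direct DFT."""
    n = len(signal)
    half = n // 2
    spectrum = []
    for k in range(half):
        coeff = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        spectrum.append(abs(coeff) / n)
    return spectrum


fs = 256   # assumed sampling rate in Hz
n = 256    # assumed number of samples (one second of data)
signal = [math.sin(2 * math.pi * 30 * t / fs) for t in range(n)]
spectrum = amplitude_spectrum(signal)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
print(peak_bin * fs / n)   # frequency of the largest spectral line in Hz
```

A vector such as `spectrum` would then serve as one input sample of the DNN, with the number of input units equal to its length.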

As shown in Fig. 2, the proposed method includes the following four procedures:

(1) Obtain the frequency spectra of rotating machinery under different health conditions. These spectra comprise the training set $\{x^i, d^i\}_{i=1}^{M}$, where $x^i$ is the $i$th frequency spectrum for training, $d^i$ is the health condition label of $x^i$ and $M$ is the number of frequency spectra.

(2) Build a DNN with multiple hidden layers, in which the number of input units is the dimension of the frequency spectrum $x^i$. Then utilize the unlabeled training set $x = \{x^i\}_{i=1}^{M}$ to pre-train the DNN layer by layer with a stack of autoencoders, where the number of autoencoders equals the number of hidden layers in the DNN. The process is displayed in Fig. 3. First, regard the first hidden layer of the DNN as the hidden layer of the first autoencoder and use the unlabeled training set $x$ as both input data and output target to train the first autoencoder, as shown in Fig. 3(a). The trained parameters $\{W_1, b_1\}$ of the autoencoder are used to initialize the parameters of the first hidden layer of the DNN, and $h_1$ is the encode vector computed from the frequency spectra of rotating machinery by the first autoencoder. Then, use $h_1$ as the inputs and outputs to train the second autoencoder for initializing the parameters of the second hidden layer of the DNN, and obtain $h_2$ in Fig. 3(b). Finally, continue the training steps in sequence until the $N$th autoencoder is trained and the frequency spectra are coded into $h_N$ in Fig. 3(c). In this way, all of the hidden layers of the DNN are pre-trained.

(3) Determine the dimension of the output layer according to the number of machinery health conditions, and implement the BP algorithm to fine-tune the parameters of the DNN by minimizing the error between the …

Fig. 3. Diagram illustrating the pre-training process (HL is short for hidden layer): (a) train the first autoencoder of the DNN, (b) train the second autoencoder and (c) train the Nth autoencoder.
