<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article  PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="research article"><front><journal-meta><journal-id journal-id-type="publisher-id">ENG</journal-id><journal-title-group><journal-title>Engineering</journal-title></journal-title-group><issn pub-type="epub">1947-3931</issn><publisher><publisher-name>Scientific Research Publishing</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.4236/eng.2013.51007</article-id><article-id pub-id-type="publisher-id">ENG-26533</article-id><article-categories><subj-group subj-group-type="heading"><subject>Articles</subject></subj-group><subj-group subj-group-type="Discipline-v2"><subject>Engineering</subject></subj-group></article-categories><title-group><article-title>
 
 
  An Improved Kernel K-Mean Cluster Method and Its Application in Fault Diagnosis of Roller Bearing
 
</article-title></title-group><contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Ling-Li</surname><given-names>Jiang</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Yu-Xiang</surname><given-names>Cao</given-names></name><xref ref-type="aff" rid="aff2"><sup>2</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Hua-Kui</surname><given-names>Yin</given-names></name><xref ref-type="aff" rid="aff2"><sup>2</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Kong-Shu</surname><given-names>Deng</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref></contrib></contrib-group><aff id="aff1"><addr-line>Engineering Research Center of Advanced Mining Equipment, Ministry of Education,
Hunan University of Science and Technology, Xiangtan, China</addr-line></aff><aff id="aff2"><addr-line>Hunan Provincial Key Laboratory of Health Maintenance for Mechanical Equipment,
Hunan University of Science and Technology, Xiangtan, China</addr-line></aff><pub-date pub-type="epub"><day>24</day><month>01</month><year>2013</year></pub-date><volume>05</volume><issue>01</issue><fpage>44</fpage><lpage>49</lpage><history><date date-type="received"><day>12</day><month>11</month><year>2012</year></date><date date-type="rev-recd"><day>11</day><month>12</month><year>2012</year></date><date date-type="accepted"><day>20</day><month>12</month><year>2012</year></date></history><permissions><copyright-statement>&#169; Copyright  2014 by authors and Scientific Research Publishing Inc. </copyright-statement><copyright-year>2014</copyright-year><license><license-p>This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/</license-p></license></permissions><abstract><p>
 
 
   Because the kernel K-mean cluster method runs in an implicit feature space, the initial and iterative cluster centers cannot be defined explicitly. Existing methods select the initial cluster centers arbitrarily in the original space. To address this deficiency, this paper proposes a new method for determining the cluster centers: virtual cluster centers are defined in the feature space by an initial classification and serve as the initial cluster centers, and the iterative cluster centers are determined by each subsequent virtual classification. The improved method is applied to fault diagnosis of roller bearings and achieves good clustering and diagnosis results, which demonstrates the effectiveness of the proposed method.
     
 
</p></abstract><kwd-group><kwd>Improved Kernel K-Mean Cluster; Fault Diagnosis; Roller Bearing</kwd></kwd-group></article-meta></front><body><sec id="s1"><title>1. Introduction</title><p>The kernel K-mean cluster method combines the kernel method with the K-mean method: the original items are first embedded into a vector space called the feature space, and K-mean clustering is then performed in the feature space [1-2]. The initial cluster centers have an important influence on the cluster result. For example, different initial cluster centers lead to different cluster results, and an improper choice of initial cluster centers may yield a local optimum rather than the global optimum [3,4]. Because the kernel K-mean cluster method runs in an implicit feature space, the initial and iterative cluster centers cannot be defined explicitly. To address this issue, L. Zhang [<xref ref-type="bibr" rid="scirp.26533-ref5">5</xref>] proposed the kernel clustering algorithm, in which the initial cluster centers are chosen freely in the original space and the iterative cluster centers are updated by transforming the kernel matrix. This process is complicated and difficult to understand. R. Kong [<xref ref-type="bibr" rid="scirp.26533-ref6">6</xref>] adopted kernel K-mean clustering in which the initial cluster centers are chosen arbitrarily in the original space and the iterative cluster centers are obtained from the previous classification result: every sample of each class is taken in turn as a candidate cluster center, the sum of intra-class distances from the other samples of that class to the candidate is calculated, and the sample with the minimum sum of distances becomes the new cluster center. 
This paper combines the two perspectives and puts forward a new way of executing the kernel K-mean clustering algorithm, which uses a virtual initial classification to determine the initial kernel cluster centers and each iterative classification to determine the new virtual kernel cluster centers, so as to ultimately achieve accurate clustering. The proposed method is applied to the fault diagnosis of rolling bearings to verify its effectiveness.</p></sec><sec id="s2"><title>2. Kernel K-Mean Clustering Method</title><p>K-mean clustering is an unsupervised learning algorithm proposed by J. MacQueen in 1967. Its core idea is to divide the sample set <img src="7-8101854\2a3cdd5d-2c6c-4132-9f18-849f3c1c00ee.jpg" /> into K clusters, such that samples belonging to the same class have higher similarity and samples belonging to different classes have lower similarity [<xref ref-type="bibr" rid="scirp.26533-ref7">7</xref>]. The detailed procedure is listed below:</p><p>1) Select k samples <img src="7-8101854\c64059a3-34eb-4588-9089-c36aea0bd490.jpg" /> arbitrarily from the n samples as the initial cluster centers;</p><p>2) Calculate the distance from each sample to each cluster center in turn, and assign the sample to the class with the minimum distance, that is, the class whose cluster center it is most similar to. 
Based on the degree of similarity between each cluster center and the sample, a sample x is classified as follows:</p><disp-formula id="scirp.26533-formula138849"><label>(1)</label><graphic position="anchor" xlink:href="7-8101854\01d1c12d-6665-4a94-8bfe-145c130efae1.jpg"  xlink:type="simple"/></disp-formula><p>The sample <img src="7-8101854\bd88c5a3-8b9d-4c17-8f28-7325ffd5c312.jpg" /> belongs to the cluster <img src="7-8101854\751fa587-856a-4a5f-815a-d8f0b827a12a.jpg" /> that corresponds to the center <img src="7-8101854\273e850c-a6a5-468f-b7ea-1c1e46a701a3.jpg" />;</p><p>3) After the initial classification, calculate the k iterative cluster centers <img src="7-8101854\68c1df9d-8047-4ea7-a483-2d2940668f8d.jpg" />. They differ from the previous cluster centers and satisfy the error sum of squares criterion:</p><disp-formula id="scirp.26533-formula138850"><label>(2)</label><graphic position="anchor" xlink:href="7-8101854\19776a80-e21f-453d-ba8a-657d143aea3a.jpg"  xlink:type="simple"/></disp-formula><p>Minimizing J gives:</p><disp-formula id="scirp.26533-formula138851"><label>(3)</label><graphic position="anchor" xlink:href="7-8101854\87d5b5e7-102c-4127-9c54-a1471ab75b16.jpg"  xlink:type="simple"/></disp-formula><p>where <img src="7-8101854\ca1fd76f-0f14-4cac-9fdd-2d78f834ddd3.jpg" /> is the number of samples in the class <img src="7-8101854\ed64aa03-8ab7-405c-ad4a-60256b3103ad.jpg" />;</p><p>4) Compare with the cluster centers of the previous iteration: if <img src="7-8101854\9009cf16-e3b7-41c9-91aa-cb32d0468600.jpg" />, return to step 2); otherwise go to step 5);</p><p>5) Output the clustering results.</p><p>Girolami first put forward the idea of combining the kernel method with the clustering method. He held that the nonlinear mapping is smooth and continuous. 
In the high-dimensional space the samples keep the same topological structure as in the original space, so the kernel-based clustering algorithm remains valid even when the class distribution is not hyperspherical or ellipsoidal.</p><p>Assume a nonlinear mapping <img src="7-8101854\5704b8cc-727f-4b9d-8762-35a030b31bbb.jpg" /> that maps the samples <img src="7-8101854\563bcf32-7ee6-46a6-9f55-ef41fa7abaea.jpg" /> of the original space <img src="7-8101854\a2a9345f-068f-4fca-baf6-2e505deaa6be.jpg" /> to the high-dimensional feature space F; in the feature space the samples to be classified become <img src="7-8101854\b02a826b-2eb1-4e57-9a54-94cc47a15c1a.jpg" />. The error sum of squares criterion function of formula (2) then becomes [<xref ref-type="bibr" rid="scirp.26533-ref8">8</xref>]:</p><disp-formula id="scirp.26533-formula138852"><label>(4)</label><graphic position="anchor" xlink:href="7-8101854\52b566a9-95a2-456b-91ef-d0a98bfc381f.jpg"  xlink:type="simple"/></disp-formula><p>where the mean value is <img src="7-8101854\08533626-d9c2-4399-9d3e-6cfd57794358.jpg" />, and <img src="7-8101854\d3deb3f9-9af5-47a0-aa46-ae76e061eb4d.jpg" /> is the number of samples in the jth class <img src="7-8101854\6790fa9b-57ad-45aa-8c2a-eb3031af91e5.jpg" />, so:</p><disp-formula id="scirp.26533-formula138853"><label>(5)</label><graphic position="anchor" xlink:href="7-8101854\1d05360c-b7d0-43ef-aec5-04249aaad001.jpg"  xlink:type="simple"/></disp-formula><p>where k(x,x) is the kernel function.</p><p>The distance between any two points in the feature space is:</p><disp-formula id="scirp.26533-formula138854"><label>(6)</label><graphic position="anchor" xlink:href="7-8101854\9ffbbdbd-9f16-4e4d-b350-1072464c259d.jpg"  xlink:type="simple"/></disp-formula></sec><sec id="s3"><title>3. 
The Improved Kernel K-Mean Clustering Method</title><p>It can be seen from formula (5) that, to calculate the distance between an arbitrary point in the feature space and a kernel cluster center, one only needs to know the original-space coordinates of all the samples that the kernel cluster center represents. The cluster center coordinates themselves always remain implicit, in the original space as well as in the feature space; they are merely an intermediate variable. Determining the initial cluster centers is therefore equivalent to determining an initial classification, and determining the next cluster centers is equivalent to determining a new classification, so kernel clustering can be realized by the following process:</p><p>1) Determine the number of cluster categories k <img src="7-8101854\edf1e0aa-8dc6-484a-8e67-c6f1ed09020f.jpg" />;</p><p>2) Separate the sample set into k classes randomly;</p><p>3) Calculate the distance in the feature space between each sample and each cluster center according to formula (5), and obtain a new classification by assigning each sample to the class with the minimum distance;</p><p>4) Calculate the error sum of squares criterion function according to formula (4) and judge whether it has converged; if it has not converged, repeat step 3), otherwise go to step 5);</p><p>5) Output the clustering results.</p><p>The process flow above is shown in <xref ref-type="fig" rid="fig1">Figure 1</xref>.</p></sec><sec id="s4"><title>4. Case Study</title><p>In order to verify the effectiveness of the improved kernel cluster method, a fault diagnosis test on roller bearings is carried out.</p><sec id="s4_1"><title>4.1. The Vibration Collecting Experiments</title><p>The vibration collecting experiments were performed on the Machinery Fault Simulator (MFS) from Spectra Quest, Inc. shown in <xref ref-type="fig" rid="fig2">Figure 2</xref>. 
It can simulate most of the faults that commonly occur in rotating machinery, such as misalignment, unbalance, resonance, rolling bearing faults and gearbox faults. The simulator operates at speeds up to 6000 rpm. In this work, the simulator is composed of a motor, a coupling, a test rolling bearing fitted on the left of the shaft near the motor, a working rolling bearing on the other side, a bearing load and a shaft. The MFS provides a bearing fault kit consisting of one bearing with an inner race defect, one with an outer race defect, one with a ball defect, and one with a combination of defects for performing experiments and studying bearing fault diagnosis.</p><p>The shaft rotating speed was obtained by a laser speedometer. Acceleration signals were measured using a Dewetron 16-channel data acquisition system and IMI 603C01 accelerometers at a sampling frequency of 10 kHz. The data were stored in .mat format for further processing in Matlab.</p></sec><sec id="s4_2"><title>4.2. Feature Extraction</title><p>Feature database 1: Time-domain statistical features of rolling bearing.</p><p>Vibration signals of the rolling bearing in four modes (normal, inner race defect, outer race defect and ball defect) are taken for analysis. A total of 12 time-domain statistical features (shown in <xref ref-type="table" rid="table1">Table 1</xref>) are extracted from each vibration signal to constitute a fault sample. One hundred and ten fault samples from each mode, four hundred and forty samples in total, constitute feature library 1.</p><p>Feature database 2: Energy features of rolling bearing in different frequency bands based on 3-layer wavelet packet decomposition.</p><p>The 3-layer wavelet packet decomposition is performed on the vibration signals of the roller bearing in the four modes. 
Each reconstructed wavelet packet signal contains a different frequency band, and the reconstructed signals differ from one fault mode of the roller bearing to another. The result of the decomposition is shown in <xref ref-type="fig" rid="fig3">Figure 3</xref>. The energy feature in each frequency branch is calculated to constitute the fault sample. One hundred and ten fault samples from each mode, four hundred and forty samples in total, constitute feature library 2.</p><p><xref ref-type="table" rid="table1">Table 1</xref>. The time-domain statistical features.</p><p>Feature database 3: The first eight intrinsic mode functions (IMFs) energy features of rolling bearing based on empirical mode decomposition (EMD).</p><p>The empirical mode decomposition is performed on the vibration signals of the roller bearing in the four modes. Each intrinsic mode function contains a different frequency band, and the decomposed signals differ from one fault mode of the roller bearing to another. The result of the decomposition is shown in <xref ref-type="fig" rid="fig4">Figure 4</xref>. The energy feature in each frequency branch is calculated to constitute the fault sample. One hundred and ten fault samples from each mode, four hundred and forty samples in total, constitute feature library 3.</p></sec><sec id="s4_3"><title>4.3. State Recognition and Fault Diagnosis</title><p>The improved kernel K-mean cluster analysis is conducted on feature database 1 (time-domain statistical features of rolling bearing), feature database 2 (energy features of rolling bearing in different frequency bands based on 3-layer wavelet packet decomposition) and feature database 3 (the first eight IMFs energy features of rolling bearing based on EMD). A Gaussian kernel function is chosen and the kernel parameter σ is 100. 
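</p><p>The clustering procedure applied to the feature databases can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the Gaussian kernel has the form k(x, y) = exp(-||x - y||^2 / (2σ^2)) and that formula (5) expands to the usual kernel form k(x, x) - (2/n_j) Σ k(x, x_i) + (1/n_j^2) Σ Σ k(x_i, x_l), with the sums taken over the samples of class j. The names gaussian_kernel, center_distances and improved_kernel_kmeans, the parameters init_labels, max_iter, tol and seed, and the balanced form of the random initial partition are all hypothetical choices made for this sketch.</p><preformat>
```python
import math
import random


def gaussian_kernel(x, y, sigma=100.0):
    """Assumed form: k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def center_distances(K, labels, k):
    """Squared feature-space distance from each sample to each virtual center.

    Only kernel values and class labels are used; the centers themselves
    are never formed explicitly (assumed expansion of formula (5)).
    """
    n = len(K)
    members = [[i for i in range(n) if labels[i] == j] for j in range(k)]
    dist = [[math.inf] * k for _ in range(n)]
    for j, idx in enumerate(members):
        if not idx:
            continue  # an empty class keeps an infinite distance
        # Term that depends only on the class: mean of all pairwise kernels.
        within = sum(K[a][b] for a in idx for b in idx) / len(idx) ** 2
        for i in range(n):
            cross = sum(K[i][a] for a in idx) / len(idx)
            dist[i][j] = K[i][i] - 2.0 * cross + within
    return dist


def improved_kernel_kmeans(samples, k, sigma=100.0, init_labels=None,
                           max_iter=100, tol=1e-12, seed=0):
    """Kernel K-mean clustering driven by class labels instead of centers."""
    n = len(samples)
    K = [[gaussian_kernel(a, b, sigma) for b in samples] for a in samples]
    if init_labels is None:
        # Step 2: random initial partition. Keeping it balanced so that no
        # class starts empty is an assumption of this sketch.
        labels = [i % k for i in range(n)]
        random.Random(seed).shuffle(labels)
    else:
        labels = list(init_labels)
    prev_j = math.inf
    for _ in range(max_iter):
        dist = center_distances(K, labels, k)
        # Step 4: error sum of squares of the current partition.
        j_value = sum(dist[i][labels[i]] for i in range(n))
        if tol >= abs(prev_j - j_value):
            break  # the criterion function has converged
        prev_j = j_value
        # Step 3: move every sample to its nearest virtual center.
        labels = [min(range(k), key=row.__getitem__) for row in dist]
    return labels
```
</preformat><p>For example, calling improved_kernel_kmeans(samples, 4) on a list of feature vectors would partition them into four classes. Because the virtual cluster centers are represented only by the current class labels, the loop needs nothing beyond the kernel matrix and those labels.</p><p>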
A total of 10 experiments are conducted and the results are averaged. The averaged results are shown in <xref ref-type="table" rid="table2">Table 2</xref>. The table shows that the improved kernel K-mean clustering algorithm outperforms the kernel K-mean clustering algorithm in both the number of iterations and the accuracy rate.</p><p><xref ref-type="table" rid="table2">Table 2</xref>. A comparison of different K-mean cluster methods.</p><p><img src="7-8101854\96503901-96b1-418f-8090-0434c9504fb1.jpg" /></p></sec></sec><sec id="s5"><title>5. Conclusion</title><p>This paper studies kernel clustering methods. Aiming at the deficiency of existing kernel K-mean clustering methods, this paper presents an execution procedure for the kernel clustering method. The key points of the study are how to determine the initial kernel cluster centers and the iterative kernel cluster centers. The roller bearing fault diagnosis case indicates that the improved kernel K-mean clustering algorithm outperforms the kernel K-mean clustering algorithm in both the number of iterations and the accuracy rate. This study provides guidance and reference value for the field of fault diagnosis.</p></sec><sec id="s6"><title>6. Acknowledgements</title><p>This work is supported by the National Natural Science Foundation of China (51105138), the National High Technology Research and Development Program of China (2012AA041805), the Hunan Province Science and Technology Plan Project (2011GK3161), the Scientific Research Fund of Hunan Provincial Education Department (11C0530) and the Aid Program for Science and Technology Innovative Research Team in Higher Educational Institutions of Hunan Province.</p></sec><sec id="s7"><title>REFERENCES</title></sec></body><back><ref-list><title>References</title><ref id="scirp.26533-ref1"><label>1</label><mixed-citation publication-type="other" xlink:type="simple">K. Yuchi, E. Yasunori and M. Sadaaki, “Indefinite Kernel Fuzzy c-Means Clustering Algorithms,” Lecture Notes in Computer Science, Vol. 
6408, 2010, pp. 116-128. 
doi:10.1007/978-3-642-16292-3_13</mixed-citation></ref><ref id="scirp.26533-ref2"><label>2</label><mixed-citation publication-type="other" xlink:type="simple">G. Daniel, P. Witold, “Kernel-Based Fuzzy Clustering and Fuzzy Clustering: A Comparative Experimental Study,” Fuzzy Sets and Systems, Vol. 161, No. 4, 2010, pp. 522-543. doi:10.1016/j.fss.2009.10.021</mixed-citation></ref><ref id="scirp.26533-ref3"><label>3</label><mixed-citation publication-type="other" xlink:type="simple">C. Gu, S. Zhang, K. Liu and H. Huang, “Fuzzy Kernel K-Means Clustering Method Based on Immune Genetic Algorithm,” Journal of Computational Information Systems, Vol. 7, No. 1, 2011, pp. 221-231.</mixed-citation></ref><ref id="scirp.26533-ref4"><label>4</label><mixed-citation publication-type="other" xlink:type="simple">G. Wang, X. Li and K. He, “Kernel Local Fuzzy Clustering Margin Fisher Discriminant Method Faced on Fault Diagnosis,” Journal of Software, Vol. 6, No. 10, 2011, pp. 1993-2000.</mixed-citation></ref><ref id="scirp.26533-ref5"><label>5</label><mixed-citation publication-type="other" xlink:type="simple">L. Zhang, W. Zhou and L. Jiao, “Kernel Cluster Algorithm,” Chinese Journal of Computers, Vol. 25, No. 6, 2002, pp. 587-590.</mixed-citation></ref><ref id="scirp.26533-ref6"><label>6</label><mixed-citation publication-type="other" xlink:type="simple">R. Kong, G. Zhang, Z. Shi and G. Li, “Kernel-Based K-Mean Clustering,” Computer Engineer, Vol. 30, No. 11, 2004, pp. 12-13,80.</mixed-citation></ref><ref id="scirp.26533-ref7"><label>7</label><mixed-citation publication-type="other" xlink:type="simple">L. Kaufman and P. Rousseeuw, “Finding Groups in Data: An Introduction to Cluster Analysis,” Wiley Blackwell, Hoboken, 2005.</mixed-citation></ref><ref id="scirp.26533-ref8"><label>8</label><mixed-citation publication-type="other" xlink:type="simple">Z. Wu, W. Xie and J. 
Yu, “Fuzzy c-Means Clustering Algorithm Based on Kernel Method,” Proceedings of International Conference on Computational Intelligence and Multimedia Applications, 27-30 September 2003, pp. 49-54.</mixed-citation></ref></ref-list></back></article>