<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article  PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="research article"><front><journal-meta><journal-id journal-id-type="publisher-id">WJET</journal-id><journal-title-group><journal-title>World Journal of Engineering and Technology</journal-title></journal-title-group><issn pub-type="epub">2331-4222</issn><publisher><publisher-name>Scientific Research Publishing</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.4236/wjet.2016.43C013</article-id><article-id pub-id-type="publisher-id">WJET-71299</article-id><article-categories><subj-group subj-group-type="heading"><subject>Articles</subject></subj-group><subj-group subj-group-type="Discipline-v2"><subject>Chemistry&amp;Materials Science</subject><subject> Engineering</subject></subj-group></article-categories><title-group><article-title>
 
 
  2D Part-Based Visual Tracking of Hydraulic Excavators
 
</article-title></title-group><contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Xiao</surname><given-names>Bo</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Chen</surname><given-names>Ruiqi</given-names></name><xref ref-type="aff" rid="aff2"><sup>2</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Zhu</surname><given-names>Zhenhua</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref></contrib></contrib-group><aff id="aff1"><addr-line>Department of Building, Civil, and Environmental Engineering, Concordia University, Montreal, Canada</addr-line></aff><aff id="aff2"><addr-line>School of Information Science, University of Tampere, Tampere, Finland</addr-line></aff><pub-date pub-type="epub"><day>22</day><month>09</month><year>2016</year></pub-date><volume>04</volume><issue>03</issue><fpage>101</fpage><lpage>111</lpage><history><date date-type="received"><day>May</day> <month>3,</month> <year>2016</year></date><date date-type="rev-recd"><day>Accepted:</day> <month>September</month> <year>25,</year> </date><date date-type="accepted"><day>September</day> <month>28,</month> <year>2016</year></date></history><permissions><copyright-statement>&#169; Copyright 2014 by authors and Scientific Research Publishing Inc. </copyright-statement><copyright-year>2014</copyright-year><license><license-p>This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/</license-p></license></permissions><abstract><p>
 
 
   
   Visual tracking has been widely applied in the construction industry and has recently attracted significant interest. Many research studies have adopted visual tracking techniques for the surveillance of the construction workforce, project productivity, and construction safety. Until now, visual tracking algorithms have achieved promising performance when tracking un-articulated equipment on construction sites. However, state-of-the-art tracking algorithms do not perform reliably when tracking articulated equipment, such as backhoes and excavators. The stretching buckets and booms are the main obstacles to successfully tracking articulated equipment. To fill this knowledge gap, part-based tracking algorithms are introduced in this paper for tracking articulated equipment on construction sites. Part-based tracking can track different parts of the target equipment, using multiple tracking algorithms on the same sequence. Several existing tracking methods were chosen for their outstanding performance in the computer vision community. The part-based algorithms were then built on the selected visual tracking methods and tested on real construction sequences. The tracking performance was evaluated in terms of effectiveness and robustness. The quantitative analysis shows that the part-based tracking algorithms substantially improved the tracking performance for articulated equipment.
  
 
</p></abstract><kwd-group><kwd>Visual Tracking</kwd><kwd>Hydraulic Excavators</kwd><kwd>Construction Safety</kwd><kwd>Part-Based Tracking</kwd></kwd-group></article-meta></front><body><sec id="s1"><title>1. Introduction</title><p>Visual tracking is one of the most popular research fields in vision-based technologies and has made great progress in recent decades. In 1981, B. D. Lucas and T. Kanade first adopted holistic templates for tracking [<xref ref-type="bibr" rid="scirp.71299-ref1">1</xref>]. Later, subspace-based tracking methods were widely used to better describe appearance changes [<xref ref-type="bibr" rid="scirp.71299-ref2">2</xref>] [<xref ref-type="bibr" rid="scirp.71299-ref3">3</xref>]. Many visual features, such as Haar-like features [<xref ref-type="bibr" rid="scirp.71299-ref4">4</xref>], histograms of oriented gradients (HOG) [<xref ref-type="bibr" rid="scirp.71299-ref5">5</xref>], and the covariance region descriptor [<xref ref-type="bibr" rid="scirp.71299-ref6">6</xref>], have since been developed for visual tracking. More recently, context information has been regarded as helpful in visual tracking when the objects are partly or fully occluded [<xref ref-type="bibr" rid="scirp.71299-ref7">7</xref>].</p><p>Visual tracking has a variety of practical applications, such as human-computer interaction, motion analysis, activity recognition, surveillance, and medical imaging [<xref ref-type="bibr" rid="scirp.71299-ref8">8</xref>] [<xref ref-type="bibr" rid="scirp.71299-ref9">9</xref>]. The technology has also been applied in the construction industry. For example, visual tracking can be used to manage construction resources [<xref ref-type="bibr" rid="scirp.71299-ref10">10</xref>] and to help managers learn how many resources have been wasted so that inefficiencies can be addressed [<xref ref-type="bibr" rid="scirp.71299-ref11">11</xref>]. 
Tracking moving objects on construction sites can also help prevent potential collisions [<xref ref-type="bibr" rid="scirp.71299-ref12">12</xref>] and fall accidents [<xref ref-type="bibr" rid="scirp.71299-ref7">7</xref>]. In particular, vision-based tracking has been widely used in earthmoving work to evaluate the productivity of equipment such as excavators, loaders, dozers, and backhoes [<xref ref-type="bibr" rid="scirp.71299-ref13">13</xref>].</p><p>Earthmoving work is an important factor affecting the quality and cost of a project. As one of the main pieces of earthmoving equipment, hydraulic excavators come in different sizes and can be used for digging foundations, drilling piles, and handling materials. Tracking excavators is therefore necessary for estimating working productivity. Although visual tracking algorithms have achieved promising performance when tracking un-articulated equipment such as dozers, loaders, and trucks, there is no mature algorithm for tracking hydraulic excavators. This is because the operation of excavators is complex and their range of motion is too wide to be predicted. Some researchers have made considerable efforts to track excavators. Sougho and Tomohiro [<xref ref-type="bibr" rid="scirp.71299-ref14">14</xref>] applied RFID (Radio Frequency Identification) technology to identify hydraulic excavators in order to prevent collision accidents, and Ehsan et al. [<xref ref-type="bibr" rid="scirp.71299-ref15">15</xref>] tracked excavators by painting markers on their arms. However, both techniques (markers and RFID sensors) are costly in time and money.</p><p>To address these issues, this study introduces part-based 2D tracking methods for tracking hydraulic excavators. 
First, three tracking algorithms, the SCM tracker [<xref ref-type="bibr" rid="scirp.71299-ref16">16</xref>], the KCF tracker [<xref ref-type="bibr" rid="scirp.71299-ref17">17</xref>], and the STC tracker [<xref ref-type="bibr" rid="scirp.71299-ref18">18</xref>], were selected for their strong performance in benchmark studies in the computer vision community. These trackers were tested on multiple videos captured on real construction sites. The KCF tracker was found to be the most accurate, while the STC tracker was found to be the most robust. These two trackers were used to create two multiple-object tracking methods (called M-KCF and M-STC) for part-based tracking of hydraulic excavators. To seek further improvement, a multiple-object tracking method (called M-K-S) that combines the KCF tracker and the STC tracker was also introduced. The M-KCF, M-STC, and M-K-S trackers were then compared and discussed. The results show that the part-based methods significantly increase the tracking performance for excavators.</p></sec><sec id="s2"><title>2. Related Work</title><p>This section first introduces recent research on 2D visual tracking methods. It then reviews state-of-the-art research on visual tracking in construction. Finally, it presents some widely accepted evaluation metrics used in benchmarks to assess tracker performance.</p><sec id="s2_1"><title>2.1. 2D Visual Tracking Methods</title><p>In 1981, B. D. Lucas and T. Kanade [<xref ref-type="bibr" rid="scirp.71299-ref1">1</xref>] first adopted holistic templates for tracking. 
In search of better templates, many visual features, such as histograms of oriented gradients (HOG) [<xref ref-type="bibr" rid="scirp.71299-ref4">4</xref>], Haar-like features [<xref ref-type="bibr" rid="scirp.71299-ref5">5</xref>], and the covariance region descriptor [<xref ref-type="bibr" rid="scirp.71299-ref6">6</xref>], have been used in tracking. Furthermore, subspace-based tracking methods have been widely employed to describe appearance changes. Meanwhile, the sparse-representation-based algorithms proposed by Ling and Mei [<xref ref-type="bibr" rid="scirp.71299-ref2">2</xref>] have been improved [<xref ref-type="bibr" rid="scirp.71299-ref3">3</xref>]. More recently, deep learning [<xref ref-type="bibr" rid="scirp.71299-ref19">19</xref>] and machine learning [<xref ref-type="bibr" rid="scirp.71299-ref20">20</xref>] have been widely studied and have achieved promising performance when tracking occluded objects.</p><p>Most short-term, single-object, model-free trackers can be described within a common framework that breaks a tracker into five components [<xref ref-type="bibr" rid="scirp.71299-ref21">21</xref>]: the motion model, the feature extractor, the observation model, the model updater, and the ensemble post-processor. A tracking system is initialized with the position of the target’s bounding box, and the motion model then generates many candidate regions for prediction. The feature extractor converts these candidate regions into features, and the observation model estimates the probability that each candidate region is the target. Finally, the model updater updates the observation model and provides the tracking result. When a tracking system contains more than one tracker, the ensemble post-processor combines the predictions of the individual trackers and provides the best estimate.</p></sec><sec id="s2_2"><title>2.2. 
Visual Tracking in Construction</title><p>Visual tracking technology has recently been applied in the construction industry to facilitate construction automation. For example, it has been used to assess pothole distress in pavement design [<xref ref-type="bibr" rid="scirp.71299-ref22">22</xref>], identify cumulative trauma disorders in construction [<xref ref-type="bibr" rid="scirp.71299-ref23">23</xref>], recognize dirt-loading cycles in excavation [<xref ref-type="bibr" rid="scirp.71299-ref13">13</xref>], and manage construction workforces in real time [<xref ref-type="bibr" rid="scirp.71299-ref10">10</xref>]. Another essential application of tracking in construction is safety monitoring. The likelihood of fatalities on construction sites is well known to be high relative to the size of the workforce and to other industries. Visual tracking technologies help project managers enhance the safety of workers working at height [<xref ref-type="bibr" rid="scirp.71299-ref7">7</xref>]. It is also feasible to locate workers and equipment in order to protect workers from potential collisions [<xref ref-type="bibr" rid="scirp.71299-ref12">12</xref>].</p><p>As an important type of construction equipment, hydraulic excavators have attracted considerable interest in visual tracking. Some researchers have used RFID to track excavators [<xref ref-type="bibr" rid="scirp.71299-ref14">14</xref>]. An RFID system consists of a reader and a tag. The battery-powered RFID tag, which has a unique ID, periodically makes the object identifiable, and the RFID reader receives this ID from the tag. Excavators can therefore be tracked by attaching a tag to them. Marker-based methods have also been proposed to detect excavators in harsh construction environments [<xref ref-type="bibr" rid="scirp.71299-ref15">15</xref>]. This technique requires painting different markers on the arms of the excavator. 
Algorithms can then detect and precisely estimate the arm poses by detecting the boundaries of the markers. Many effective libraries have been developed in the marker-based research field, such as ARToolKit [<xref ref-type="bibr" rid="scirp.71299-ref24">24</xref>] and ARTag [<xref ref-type="bibr" rid="scirp.71299-ref25">25</xref>]. However, both installing RFID tags and painting markers are costly, and many construction sites cannot adopt these technologies because of the inconvenience. It is therefore important to develop a visual tracking method that tracks excavators with high performance in real time.</p></sec><sec id="s2_3"><title>2.3. Evaluation Criteria</title><p>How to evaluate a tracker’s performance fairly remains an open task in visual tracking. A reasonable evaluation system helps researchers grasp a tracker’s strengths and weaknesses. Popular evaluation metrics adopted by many benchmarks are introduced as follows:</p><p>・ The region overlap score [<xref ref-type="bibr" rid="scirp.71299-ref26">26</xref>] measures the overlap between the region predicted by the tracker and the ground-truth region, relative to the total area covered by the two regions. It is defined as <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/71299x2.png" xlink:type="simple"/></inline-formula>, where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/71299x3.png" xlink:type="simple"/></inline-formula> and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/71299x4.png" xlink:type="simple"/></inline-formula> denote the intersection and union of the two areas, respectively.</p><p>・ The center error [<xref ref-type="bibr" rid="scirp.71299-ref27">27</xref>] measures the Euclidean distance between the center locations of the manual ground truth and the tracked result. 
The average center location error over all frames is usually adopted to evaluate the performance of trackers.</p><p>・ The tracking length [<xref ref-type="bibr" rid="scirp.71299-ref28">28</xref>] reflects the robustness of a tracker. It is calculated as the number of frames from the first frame to the frame at which the tracker first fails.</p><p>・ The failure rate [<xref ref-type="bibr" rid="scirp.71299-ref29">29</xref>] is calculated as the probability of a tracking failure per image over the whole sequence. This metric requires manual re-initialization whenever a tracker fails to track its target.</p><p>A single evaluation metric can hardly reflect robustness and effectiveness at the same time, so these metrics are usually combined. Nawaz and Cavallaro [<xref ref-type="bibr" rid="scirp.71299-ref30">30</xref>] proposed the Combined Tracking Performance Score (CoTPS) to obtain a comprehensive evaluation. In CoTPS, the accuracy score is calculated from the number of successfully tracked frames, while the failure information is calculated on the basis of the tracking length. Matej et al. [<xref ref-type="bibr" rid="scirp.71299-ref31">31</xref>] likewise combined accuracy and robustness in one graph in order to decide which tracker performs better in both respects. Accuracy is reflected by the overlap score, while robustness is measured by the number of times the tracker fails to track the object during tracking.</p></sec></sec><sec id="s3"><title>3. Methodology</title><sec id="s3_1"><title>3.1. Trackers Selection</title><p>In this paper, the authors selected three trackers (KCF, SCM, and STC) from the computer vision literature as the experimental trackers. Many popular benchmark studies provided guidance for this selection. 
The KCF tracker and the SCM tracker were selected because they have shown promising performance in existing visual tracking benchmarks. In Wu’s benchmark [<xref ref-type="bibr" rid="scirp.71299-ref20">20</xref>], the Sparsity-based Collaborative Model (SCM) tracker [<xref ref-type="bibr" rid="scirp.71299-ref16">16</xref>] was ranked first under occlusion, illumination, and background clutter conditions and second under scale variation. In the comparison by Matej et al. [<xref ref-type="bibr" rid="scirp.71299-ref31">31</xref>], trackers were evaluated through the accuracy-robustness graph. In this benchmark, the Kernelized Correlation Filter (KCF) tracker showed the best accuracy and the second-best overall performance. In addition, a very fast algorithm that employs spatio-temporal context (STC) information [<xref ref-type="bibr" rid="scirp.71299-ref18">18</xref>] was used. The STC tracker builds a spatial context model between the object and its nearby background in one scene. This model is then used to update a spatio-temporal context model for the next frame, and the best result is predicted by maximizing the confidence map.</p><p>To assess the strengths and weaknesses of these single-object tracking algorithms in construction scenarios, the trackers were tested on construction sequences that include excavators, backhoes, trucks, and workers. The trackers were evaluated in terms of accuracy and robustness. For accuracy, the average overlap score and the center location error were employed, because these two metrics are considered the easiest to compute and interpret and they describe the entire sequence. For robustness, the failure rate was employed because of its minimal annotation requirement. The failure rate also describes the overall robustness of a tracker better than the tracking length does. 
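</p><p>The three metrics used for this comparison can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the authors’ Matlab implementation: boxes are assumed to be (x, y, width, height) tuples, the function names are hypothetical, and a frame is assumed to count as a failure when its overlap with the ground truth is zero. The failure rate here simply counts failing frames and does not model manual re-initialization.</p><preformat preformat-type="code"><![CDATA[
```python
import math


def overlap_score(box_a, box_b):
    """Intersection area divided by union area of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def center_error(box_a, box_b):
    """Euclidean distance in pixels between the centers of two boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    return math.hypot((ax + aw / 2) - (bx + bw / 2),
                      (ay + ah / 2) - (by + bh / 2))


def failure_rate(tracked, truth):
    """Share of frames with zero overlap (assumed failure rule)."""
    failures = sum(1 for t, g in zip(tracked, truth)
                   if overlap_score(t, g) == 0.0)
    return failures / len(truth)
```
]]></preformat><p>Averaging overlap_score and center_error over all frames of a sequence would give per-video values of the kind reported in the tables.</p><p>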
Part of the comparison results is shown in <xref ref-type="table" rid="table1">Table 1</xref>. According to this comparison, the KCF tracker is the most accurate, with a better overlap score and a lower center error, while the STC tracker is the most robust, with the lowest failure rate.</p></sec><sec id="s3_2"><title>3.2. Part-Based Tracking</title><p>The comparison shows that single-object tracking algorithms do not perform reliably when tracking excavators, especially during dirt-loading activities. This is because excavator buckets rotate and move quickly during operation. Generally, an excavator has four main components to track: the boom, the dipper, the bucket, and the “house” (driving cab). An excavator model that illustrates each component is shown in <xref ref-type="fig" rid="fig1">Figure 1</xref>. Single-object tracking algorithms usually focus on the house of the excavator because this component has the largest area and moves slowly. Because the bucket moves fast, the ground-truth tracking box changes quickly and is hard to predict. Therefore, two initial tracking boxes were adopted in this study, as shown in <xref ref-type="fig" rid="fig2">Figure 2</xref>. The first part is the “house” and grab rails, and the second part is the bucket and dipper. We found that the two tracking boxes together can always reflect the tracking box of the whole excavator.</p><p>Based on the STC algorithm of Zhang et al. [<xref ref-type="bibr" rid="scirp.71299-ref18">18</xref>], one more rectangle was added at the beginning of the algorithm to represent the second target. Two sets of confidence maps and context prior models can therefore be produced at the same time, and two spatial context models are learned, one for each target. The maximum points of the two confidence maps give the locations of the two targets. This two-object algorithm is called M-STC. 
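</p><p>Because the two part boxes together reflect the box of the whole excavator, they can be merged into one enclosing rectangle for comparison with a whole-excavator ground truth. The following is a minimal Python sketch of that merging step, assuming (x, y, width, height) boxes; the function name and the example coordinates are hypothetical and are not taken from the authors’ code or data.</p><preformat preformat-type="code"><![CDATA[
```python
def enclosing_box(box_a, box_b):
    """Smallest axis-aligned (x, y, w, h) rectangle containing both boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    left = min(ax, bx)
    top = min(ay, by)
    right = max(ax + aw, bx + bw)
    bottom = max(ay + ah, by + bh)
    return (left, top, right - left, bottom - top)


# Hypothetical part boxes for one frame: "house" and bucket/dipper.
house = (100, 200, 180, 120)
bucket = (320, 120, 90, 100)
excavator = enclosing_box(house, bucket)
```
]]></preformat><p>The same function applies unchanged whichever tracker produced each part box, so it could serve any two-part tracker.</p><p>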
Adopting a similar concept to M-STC, the M-KCF tracker was created based on the KCF tracker. At the beginning of the KCF algorithm, two initial targets are defined in the first frame. Then, for every frame, dense features are extracted from the image in order to train the Gaussian kernel model. The target’s location in the next frame is automatically stored and visualized. The STC tracker was assessed as the more robust tracker in the comparison. It may therefore be better to use STC to track the bucket and dipper part, which is hard to track because of its high moving speed. Accordingly, the STC and KCF trackers were combined to track the two targets, one each; this method is named M-K-S. For all three multiple-object trackers, each algorithm computes the coordinates of the two targets’ tracking boxes. Based on these coordinates, extra code plots a large rectangle that contains the two targets. The performance of the multiple-object trackers can then be compared with manually annotated ground truths.</p><table-wrap id="table1"><label><xref ref-type="table" rid="table1">Table 1</xref></label><caption><title>Partial comparison of trackers on construction sites</title></caption><table><thead>
<tr><th align="center" valign="middle" colspan="2">Videos</th><th align="center" valign="middle">KCF</th><th align="center" valign="middle">SCM</th><th align="center" valign="middle">STC</th></tr>
</thead><tbody>
<tr><td align="center" valign="middle" rowspan="3">1</td><td align="center" valign="middle">Average overlap score</td><td align="center" valign="middle">0.71</td><td align="center" valign="middle">0.77</td><td align="center" valign="middle">0.40</td></tr>
<tr><td align="center" valign="middle">Center error (pixels)</td><td align="center" valign="middle">8.97</td><td align="center" valign="middle">12.45</td><td align="center" valign="middle">14.06</td></tr>
<tr><td align="center" valign="middle">Failure rate (350 frames)</td><td align="center" valign="middle">0.29%</td><td align="center" valign="middle">0.57%</td><td align="center" valign="middle">0</td></tr>
<tr><td align="center" valign="middle" rowspan="3">2</td><td align="center" valign="middle">Average overlap score</td><td align="center" valign="middle">0.78</td><td align="center" valign="middle">0.85</td><td align="center" valign="middle">0.61</td></tr>
<tr><td align="center" valign="middle">Center error (pixels)</td><td align="center" valign="middle">78.78</td><td align="center" valign="middle">58.32</td><td align="center" valign="middle">61.48</td></tr>
<tr><td align="center" valign="middle">Failure rate (500 frames)</td><td align="center" valign="middle">0.20%</td><td align="center" valign="middle">0.40%</td><td align="center" valign="middle">0.20%</td></tr>
<tr><td align="center" valign="middle" rowspan="3">3</td><td align="center" valign="middle">Average overlap score</td><td align="center" valign="middle">0.79</td><td align="center" valign="middle">0.84</td><td align="center" valign="middle">0.73</td></tr>
<tr><td align="center" valign="middle">Center error (pixels)</td><td align="center" valign="middle">79.96</td><td align="center" valign="middle">29.65</td><td align="center" valign="middle">17.04</td></tr>
<tr><td align="center" valign="middle">Failure rate (500 frames)</td><td align="center" valign="middle">0</td><td align="center" valign="middle">0</td><td align="center" valign="middle">0</td></tr>
<tr><td align="center" valign="middle" rowspan="3">4</td><td align="center" valign="middle">Average overlap score</td><td align="center" valign="middle">0.85</td><td align="center" valign="middle">0.75</td><td align="center" valign="middle">0.74</td></tr>
<tr><td align="center" valign="middle">Center error (pixels)</td><td align="center" valign="middle">7.60</td><td align="center" valign="middle">17.16</td><td align="center" valign="middle">10.20</td></tr>
<tr><td align="center" valign="middle">Failure rate (400 frames)</td><td align="center" valign="middle">0.25%</td><td align="center" valign="middle">0</td><td align="center" valign="middle">0</td></tr>
<tr><td align="center" valign="middle" rowspan="3">5</td><td align="center" valign="middle">Average overlap score</td><td align="center" valign="middle">0.82</td><td align="center" valign="middle">0.81</td><td align="center" valign="middle">0.50</td></tr>
<tr><td align="center" valign="middle">Center error (pixels)</td><td align="center" valign="middle">47.05</td><td align="center" valign="middle">51.13</td><td align="center" valign="middle">47.92</td></tr>
<tr><td align="center" valign="middle">Failure rate (500 frames)</td><td align="center" valign="middle">0</td><td align="center" valign="middle">0</td><td align="center" valign="middle">0</td></tr>
</tbody></table></table-wrap><fig id="fig1" position="float"><label><xref ref-type="fig" rid="fig1">Figure 1</xref></label><caption><title>Model of the excavator structure (CAT@5100B)</title></caption><graphic mimetype="image" position="float" xlink:type="simple" xlink:href="http://html.scirp.org/file/71299x5.png"/></fig><fig id="fig2" position="float"><label><xref ref-type="fig" rid="fig2">Figure 2</xref></label><caption><title>Example of initial positions of tracking boxes</title></caption><graphic mimetype="image" position="float" xlink:type="simple" xlink:href="http://html.scirp.org/file/71299x6.png"/></fig></sec></sec><sec id="s4"><title>4. Experiment Results</title><p>The experiments were run in Matlab R2014b on a 64-bit Microsoft Windows 7 Enterprise operating system. The hardware configuration included an Intel&#174; i7-4720HQ CPU (central processing unit) @ 2.60 GHz, 16 gigabytes of memory, and an NVIDIA&#174; GeForce&#174; GTX 965M GPU (graphics processing unit) with 2 GB of GDDR5 memory. Three sequences were used in this study, all showing dirt loading at night. 
Night-time dirt loading means that the tracking conditions, such as motion blur, low resolution, and background clutter, are tough. The average overlap score and the center location error were used to evaluate the performance of the three single-object algorithms and the three multiple-object algorithms. The overlap score reflects the accuracy of a tracker, and the center error measures how closely the tracking boxes follow the ground-truth boxes. Example tracking results are shown in <xref ref-type="fig" rid="fig3">Figure 3</xref>, and the tracking performance is summarized in <xref ref-type="table" rid="table2">Table 2</xref>.</p><fig-group id="fig3"><label><xref ref-type="fig" rid="fig3">Figure 3</xref></label><caption><title>Examples of tracking results of the tested trackers in Frame 300. (a) KCF tracking result; (b) SCM tracking result; (c) STC tracking result; (d) M-STC tracking result; (e) M-KCF tracking result; (f) M-K-S tracking result.</title></caption><fig id="fig3_1"><label>(a)</label><graphic mimetype="image" position="float" xlink:type="simple" xlink:href="http://html.scirp.org/file/71299x7.png"/></fig><fig id="fig3_2"><label>(b)</label><graphic mimetype="image" position="float" xlink:type="simple" xlink:href="http://html.scirp.org/file/71299x8.png"/></fig><fig id="fig3_3"><label>(c)</label><graphic mimetype="image" position="float" xlink:type="simple" xlink:href="http://html.scirp.org/file/71299x9.png"/></fig><fig id="fig3_4"><label>(d)</label><graphic mimetype="image" position="float" xlink:type="simple" xlink:href="http://html.scirp.org/file/71299x10.png"/></fig><fig id="fig3_5"><label>(e)</label><graphic mimetype="image" position="float" xlink:type="simple" xlink:href="http://html.scirp.org/file/71299x11.png"/></fig><fig id="fig3_6"><label>(f)</label><graphic mimetype="image" position="float" xlink:type="simple" xlink:href="http://html.scirp.org/file/71299x12.png"/></fig></fig-group><table-wrap id="table2"><label><xref ref-type="table" rid="table2">Table 2</xref></label><caption><title>Tracking performance of the experimental trackers</title></caption><table><thead>
<tr><th align="center" valign="middle" colspan="2" rowspan="2">Video</th><th align="center" valign="middle" colspan="3">Single-object Trackers</th><th align="center" valign="middle" colspan="3">Part-based Trackers</th></tr>
<tr><th align="center" valign="middle">KCF</th><th align="center" valign="middle">SCM</th><th align="center" valign="middle">STC</th><th align="center" valign="middle">M-STC</th><th align="center" valign="middle">M-KCF</th><th align="center" valign="middle">M-K-S</th></tr>
</thead><tbody>
<tr><td align="center" valign="middle" rowspan="2">1</td><td align="center" valign="middle">Average overlap score</td><td align="center" valign="middle">0.64</td><td align="center" valign="middle">0.63</td><td align="center" valign="middle">0.44</td><td align="center" valign="middle">0.78</td><td align="center" valign="middle">0.87</td><td align="center" valign="middle">0.93</td></tr>
<tr><td align="center" valign="middle">Center error (pixels)</td><td align="center" valign="middle">86.01</td><td align="center" valign="middle">94.00</td><td align="center" valign="middle">89.96</td><td align="center" valign="middle">13.58</td><td align="center" valign="middle">18.89</td><td align="center" valign="middle">13.94</td></tr>
<tr><td align="center" valign="middle" rowspan="2">2</td><td align="center" valign="middle">Average overlap score</td><td align="center" valign="middle">0.68</td><td align="center" valign="middle">0.68</td><td align="center" valign="middle">0.54</td><td align="center" valign="middle">0.71</td><td align="center" valign="middle">0.87</td><td align="center" valign="middle">0.75</td></tr>
<tr><td align="center" valign="middle">Center error (pixels)</td><td align="center" valign="middle">116.44</td><td align="center" valign="middle">120.77</td><td align="center" valign="middle">118.23</td><td align="center" valign="middle">36.18</td><td align="center" valign="middle">25.53</td><td align="center" valign="middle">44.90</td></tr>
</tbody></table></table-wrap></sec><sec id="s5"><title>5. Conclusions</title><p>The part-based algorithms are clearly more accurate and effective than the three single-object algorithms. For Video 1 in Table 2, the mean of the average overlap scores of M-STC, M-KCF, and M-K-S is 0.86, while the mean for the three single-object trackers is 0.57. The part-based algorithms also perform remarkably well in center error on that video, with 15.47 pixels on average, while the single-object algorithms have an average center error of 89.99 pixels. This indicates that dividing the excavator into two parts and tracking them separately at the same time substantially improves the tracking results. In this study, the M-K-S tracker was created by combining STC and KCF. This tracker uses STC to track the bucket part, which moves at high speed, and the accurate KCF to track the “house” part. 
The M-K-S tracker achieved the highest average overlap score of the six trackers on Video 1 (0.93), although on Video 2 M-KCF scored higher (0.87 versus 0.75).</p><p>In this study, part-based 2D tracking methods were introduced to track hydraulic excavators. Three tracking algorithms, SCM, KCF, and STC, were selected based on their strong performance in benchmark studies. These trackers were tested and compared on construction videos. The KCF and STC trackers were then used to build part-based trackers for hydraulic excavators. Finally, all six trackers were tested on excavator videos, and the part-based methods outperformed the single-object algorithms.</p><p>This concept could also be applied to tracking other equipment. The two-part scheme can be extended to three, four, or more parts in order to track more complex equipment and activities in construction. In addition, the single-object trackers used in this study can be replaced with better-performing trackers, which would be expected to yield better results. This study has certain limitations. Because of limited space, tracking time and tracker robustness, both of which are important in visual tracking, have not been considered. Tracking more objects takes more time. When the target is divided into parts, a fast-moving part is more easily lost, which reduces robustness. Moreover, the part-based algorithms may not improve tracking under occlusion, because they cannot exceed the ability of the underlying trackers.</p></sec><sec id="s6"><title>Cite this paper</title><p>Xiao, B., Chen, R.Q. and Zhu, Z.H. (2016) 2D Part-Based Visual Tracking of Hydraulic Excavators. World Journal of Engineering and Technology, 4, 101-111. http://dx.doi.org/10.4236/wjet.2016.43C013</p></sec></body><back><ref-list><title>References</title><ref id="scirp.71299-ref1"><label>1</label><mixed-citation publication-type="other" xlink:type="simple">Lucas, B.D. 
and Kanade, T. (1981) An Iterative Image Registration Technique with an Application to Stereo Vision. Proc. 7th International Joint Conference on Artificial Intelligence, 674-679.</mixed-citation></ref><ref id="scirp.71299-ref2"><label>2</label><mixed-citation publication-type="other" xlink:type="simple">Mei, X. and Ling, H. (2011) Robust Visual Tracking and Vehicle Classification via Sparse Representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33, 2259-2272. http://dx.doi.org/10.1109/TPAMI.2011.66</mixed-citation></ref><ref id="scirp.71299-ref3"><label>3</label><mixed-citation publication-type="other" xlink:type="simple">Jia, X., Lu, H. and Yang, M.-H. (2012) Visual Tracking via Adaptive Structural Local Sparse Appearance Model. Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1822-1829.</mixed-citation></ref><ref id="scirp.71299-ref4"><label>4</label><mixed-citation publication-type="other" xlink:type="simple">Dalal, N. and Triggs, B. (2005) Histograms of Oriented Gradients for Human Detection. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR’2005, 886-893. http://dx.doi.org/10.1109/cvpr.2005.177</mixed-citation></ref><ref id="scirp.71299-ref5"><label>5</label><mixed-citation publication-type="other" xlink:type="simple">Viola, P. and Jones, M.J. (2004) Robust Real-Time Face Detection. International Journal of Computer Vision, 57, 137-154. http://dx.doi.org/10.1023/B:VISI.0000013087.49260.fb</mixed-citation></ref><ref id="scirp.71299-ref6"><label>6</label><mixed-citation publication-type="other" xlink:type="simple">Tuzel, O., Porikli, F. and Meer, P. (2006) Region Covariance: A Fast Descriptor for Detection and Classification. European Conference on Computer Vision. Springer Berlin Heidelberg.</mixed-citation></ref><ref id="scirp.71299-ref7"><label>7</label><mixed-citation publication-type="other" xlink:type="simple">Gong, J. and Caldas, C.H. 
(2011) An Object Recognition, Tracking, and Contextual Reasoning-Based Video Interpretation Method for Rapid Productivity Analysis of Construction Operations. Automation in Construction, 20, 1211-1226.  
http://dx.doi.org/10.1016/j.autcon.2011.05.005</mixed-citation></ref><ref id="scirp.71299-ref8"><label>8</label><mixed-citation publication-type="other" xlink:type="simple">Cannons, K. (2008) A Review of Visual Tracking. York Univ., Ontario, Canada, Tech. Rep. CSE-2008-07.</mixed-citation></ref><ref id="scirp.71299-ref9"><label>9</label><mixed-citation publication-type="other" xlink:type="simple">Yilmaz, A., Javed, O. and Shah, M. (2006) Object Tracking: A Survey. ACM Computing Surveys, 38, 1-45. http://dx.doi.org/10.1145/1177352.1177355</mixed-citation></ref><ref id="scirp.71299-ref10"><label>10</label><mixed-citation publication-type="other" xlink:type="simple">Weerasinghe, I.P.T. and Ruwanpura, J.Y. (2009) Automated Data Acquisition System to Assess Construction Worker Performance. Proceedings of 2009 Construction Research Congress, ASCE, Reston, VA, 11-20. http://dx.doi.org/10.1061/41020(339)7</mixed-citation></ref><ref id="scirp.71299-ref11"><label>11</label><mixed-citation publication-type="other" xlink:type="simple">Park, M.W., Makhmalbaf, A. and Brilakis, I. (2011) Comparative Study of Vision Tracking Methods for Tracking of Construction Site Resources. Automation in Construction, 20, 905-915. http://dx.doi.org/10.1016/j.autcon.2011.03.007</mixed-citation></ref><ref id="scirp.71299-ref12"><label>12</label><mixed-citation publication-type="other" xlink:type="simple">Han, S. and Lee, S. (2013) A Vision-Based Motion Capture and Recognition Framework for Behavior-Based Safety Management. Automation in Construction, 35, 131-141.  
http://dx.doi.org/10.1016/j.autcon.2013.05.001</mixed-citation></ref><ref id="scirp.71299-ref13"><label>13</label><mixed-citation publication-type="other" xlink:type="simple">Rezazadeh Azar, E. and McCabe, B. (2012) Vision-Based Recognition of Dirt Loading Cycles in Construction Sites. Construction Research Congress, 1042-1051.</mixed-citation></ref><ref id="scirp.71299-ref14"><label>14</label><mixed-citation publication-type="other" xlink:type="simple">Chae, S. and Yoshida, T. (2010) Application of RFID Technology to Prevention of Collision Accident with Heavy Equipment. Automation in Construction, 19, 368-374.  
http://dx.doi.org/10.1016/j.autcon.2009.12.008</mixed-citation></ref><ref id="scirp.71299-ref15"><label>15</label><mixed-citation publication-type="other" xlink:type="simple">Azar, E.R., Feng, C. and Kamat, V.R. (2015) Feasibility of In-Plane and Articulation Monitoring of Excavators Arm Using Planar Marker. Journal of Information Technology in Construction (ITcon), 20, 213-229.</mixed-citation></ref><ref id="scirp.71299-ref16"><label>16</label><mixed-citation publication-type="other" xlink:type="simple">Zhong, W., Lu, H. and Yang, M.-H. (2014) Robust Object Tracking via Sparse Collaborative Appearance Model. IEEE Transactions on Image Processing, 23, 2356-2368.  
http://dx.doi.org/10.1109/TIP.2014.2313227</mixed-citation></ref><ref id="scirp.71299-ref17"><label>17</label><mixed-citation publication-type="other" xlink:type="simple">Henriques, J.F., Caseiro, R., Martins, P. and Batista, J. (2014) High-Speed Tracking with Kernelized Correlation Filters. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1, 125-141.</mixed-citation></ref><ref id="scirp.71299-ref18"><label>18</label><mixed-citation publication-type="other" xlink:type="simple">Zhang, K.H., Zhang, L., Yang, M.-H. and Zhang, D. (2013) Fast Tracking via Spatio-Temporal Context Learning. arXiv preprint arXiv:1311.1939.</mixed-citation></ref><ref id="scirp.71299-ref19"><label>19</label><mixed-citation publication-type="other" xlink:type="simple">Wang, N., Li, S., Gupta, A. and Yeung, D.Y. (2015) Transferring Rich Feature Hierarchies for Robust Visual Tracking. arXiv preprint arXiv:1501.04587.</mixed-citation></ref><ref id="scirp.71299-ref20"><label>20</label><mixed-citation publication-type="other" xlink:type="simple">Kristan, M., Pflugfelder, R., Leonardis, A., Matas, J., Cehovin, L., Nebehay, G., Vojir, T., Fernández, G., et al. (2014) The Visual Object Tracking VOT2014 Challenge Results. In: ECCV 2014 Workshops, Workshop on Visual Object Tracking Challenge, Volume 8926, 191-217.</mixed-citation></ref><ref id="scirp.71299-ref21"><label>21</label><mixed-citation publication-type="other" xlink:type="simple">Nawaz, T. and Cavallaro, A. (2012) A Protocol for Evaluating Video Trackers under Real-World Conditions. IEEE Transactions on Image Processing, 22, 1354-1361.</mixed-citation></ref><ref id="scirp.71299-ref22"><label>22</label><mixed-citation publication-type="other" xlink:type="simple">Khan, Z., Balch, T. and Dellaert, F. (2005) MCMC-Based Particle Filtering for Tracking a Variable Number of Interacting Targets. TPAMI, 27, 1805-1819.  
http://dx.doi.org/10.1109/TPAMI.2005.223</mixed-citation></ref><ref id="scirp.71299-ref23"><label>23</label><mixed-citation publication-type="other" xlink:type="simple">Kwon, J. and Lee, K.M. (2009) Tracking of a Non-Rigid Object via Patch-Based Dynamic Appearance Modeling and Adaptive Basin Hopping Monte Carlo Sampling. In: CVPR, 1208-1215.</mixed-citation></ref><ref id="scirp.71299-ref24"><label>24</label><mixed-citation publication-type="other" xlink:type="simple">Babenko, B., Yang, M.H. and Belongie, S. (2011) Robust Object Tracking with Online Multiple Instance Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33, 1619-1632. http://dx.doi.org/10.1109/TPAMI.2010.226</mixed-citation></ref><ref id="scirp.71299-ref25"><label>25</label><mixed-citation publication-type="other" xlink:type="simple">Smeulders, A.W.M., Chu, D.M., Cucchiara, R., Calderara, S., Dehghan, A. and Shah, M. (2013) Visual Tracking: An Experimental Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36, 1442-1468.</mixed-citation></ref><ref id="scirp.71299-ref26"><label>26</label><mixed-citation publication-type="other" xlink:type="simple">Fiala, M. (2005) ARTag, a Fiducial Marker System Using Digital Techniques. Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), San Diego, Vol. 2, 590-596.</mixed-citation></ref><ref id="scirp.71299-ref27"><label>27</label><mixed-citation publication-type="other" xlink:type="simple">Kato, H. and Billinghurst, M. (1999) Marker Tracking and HMD Calibration for a Video-Based Augmented Reality Conferencing System. Proceedings of the 2nd IEEE and ACM International Workshop on Augmented Reality, 85-94.  
http://dx.doi.org/10.1109/IWAR.1999.803809</mixed-citation></ref><ref id="scirp.71299-ref28"><label>28</label><mixed-citation publication-type="other" xlink:type="simple">Rempel, D.M., Harrison, R.J. and Barnhart, S. (1992) Work-Related Cumulative Trauma Disorders of the Upper Extremity. JAMA, 267, 838-842.  
http://dx.doi.org/10.1001/jama.1992.03480060084035</mixed-citation></ref><ref id="scirp.71299-ref29"><label>29</label><mixed-citation publication-type="other" xlink:type="simple">Koch, C., Jog, G. and Brilakis, I. (2013) Automated Pothole Distress Assessment Using Asphalt Pavement Video Data. Journal of Computing in Civil Engineering, 27, 370-378.  
http://dx.doi.org/10.1061/(asce)cp.1943-5487.0000232</mixed-citation></ref><ref id="scirp.71299-ref30"><label>30</label><mixed-citation publication-type="other" xlink:type="simple">Wang, N., Shi, J., Yeung, D.Y. and Jia, J. (2015) Understanding and Diagnosing Visual Tracking Systems. Proceedings of the IEEE International Conference on Computer Vision, 3101-3109. http://dx.doi.org/10.1109/iccv.2015.355</mixed-citation></ref><ref id="scirp.71299-ref31"><label>31</label><mixed-citation publication-type="other" xlink:type="simple">Wu, Y., Lim, J. and Yang, M.H. (2015) Object Tracking Benchmark. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37.</mixed-citation></ref></ref-list></back></article>