<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article  PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="research article"><front><journal-meta><journal-id journal-id-type="publisher-id">POS</journal-id><journal-title-group><journal-title>Positioning</journal-title></journal-title-group><issn pub-type="epub">2150-850X</issn><publisher><publisher-name>Scientific Research Publishing</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.4236/pos.2010.11004</article-id><article-id pub-id-type="publisher-id">POS-3161</article-id><article-categories><subj-group subj-group-type="heading"><subject>Articles</subject></subj-group><subj-group subj-group-type="Discipline-v2"><subject>Computer Science&amp;Communications</subject></subj-group></article-categories><title-group><article-title>
 
 
  A Yield Mapping Procedure Based on Robust Fitting Paraboloid Cones on Moving Elliptical Neighborhoods and the Determination of Their Size Using a Robust Variogram
 
</article-title></title-group><contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Bachmaier</surname><given-names>Martin</given-names></name><xref ref-type="aff" rid="aff1"><sub>1</sub></xref><xref ref-type="corresp" rid="cor1"><sup>*</sup></xref></contrib></contrib-group><aff id="aff1"><label>1</label><addr-line>Technische Universität München</addr-line></aff><author-notes><corresp id="cor1">* E-mail:<email>bachmai@wzw.tum.de</email></corresp></author-notes><pub-date pub-type="epub"><day>29</day><month>11</month><year>2010</year></pub-date><volume>01</volume><issue>01</issue><fpage>27</fpage><lpage>41</lpage><history><date date-type="received"><day>10</day><month>11</month><year>2009</year></date><date date-type="rev-recd"><day>21</day><month>08</month><year>2010</year></date><date date-type="accepted"><day>25</day><month>08</month><year>2010</year></date></history><permissions><copyright-statement>&#169; Copyright 2014 by authors and Scientific Research Publishing Inc. </copyright-statement><copyright-year>2014</copyright-year><license><license-p>This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/</license-p></license></permissions><abstract><p>
 
 
  The yield map is generated by fitting the yield surface shape of yield monitor data mainly using paraboloid cones on floating neighborhoods. Each yield map value is determined by the fit of such a cone on an elliptical neighborhood that is wider across the harvest tracks than it is along them. The coefficients of regression for modeling the paraboloid cones and the scale parameter are estimated using robust weighted M-estimators where the weights decrease quadratically from 1 in the middle to zero at the border of the selected neighborhood. The robust way of estimating the model parameters supersedes a procedure for detecting outliers. For a given neighborhood shape, this yield mapping method is implemented by the Fortran program paraboloidmapping.exe, which can be downloaded from the web. The size of the selected neighborhood is considered appropriate if the variance of the yield map values equals the variance of the true yields, which is the difference between the variance of the raw yield data and the error variance of the yield monitor. It is estimated using a robust variogram on data that have not had the trend removed.
 
</p></abstract><kwd-group><kwd>Precision Agriculture</kwd><kwd>Yield Mapping</kwd><kwd>GPS</kwd><kwd>Elliptical Neighborhood</kwd><kwd>Paraboloid</kwd><kwd>Weighted Regression</kwd><kwd>Redescending M-estimate</kwd><kwd>Robust Variogram</kwd></kwd-group></article-meta></front><body><sec id="s1"><title>1. Introduction</title><p>The yield mapping method I describe in detail here follows Bachmaier [<xref ref-type="bibr" rid="scirp.3161-ref1">1</xref>] in many respects. It does not use filtering techniques to remove outlying yield measurements that are caused mainly by the monitoring process. According to Simbahan et al. [<xref ref-type="bibr" rid="scirp.3161-ref2">2</xref>], such errors include grain flow and other sensor errors (moisture, speed, swath width), errors due to geo-referencing and combine movement, operator errors and data processing errors (Shearer et al. [<xref ref-type="bibr" rid="scirp.3161-ref3">3</xref>], Blackmore and Moore [<xref ref-type="bibr" rid="scirp.3161-ref4">4</xref>], Arslan and Colvin [<xref ref-type="bibr" rid="scirp.3161-ref5">5</xref>]). Steinmayr [<xref ref-type="bibr" rid="scirp.3161-ref6">6</xref>] gives a concise review of possible sources of error, their cause and impact and the corresponding filtering techniques (e.g. Thyl&#233;n et al. [7,8], Noack et al. [<xref ref-type="bibr" rid="scirp.3161-ref9">9</xref>]).</p><p>The yield map values determined by my method do not depend on the removal of outliers but result from fitting the yield surface of the raw data using paraboloid cones on floating neighborhoods. By making these neighborhoods wider across the tracks than along them, the yield map can be adapted better to changes in yield along the tracks than across them. This is an advantage over other methods that often do not smooth sufficiently across the tracks. The parameters for modeling a paraboloid cone are estimated by robust weighted M-estimators (cf. Hampel et al. 
[<xref ref-type="bibr" rid="scirp.3161-ref10">10</xref>], Section 6.3) so that the influence of outliers is automatically restricted or completely annihilated. Therefore, discarding or downweighting values that deviate too much from a paraboloid yield surface around a neighborhood, as done by Bachmaier and Auernhammer [<xref ref-type="bibr" rid="scirp.3161-ref11">11</xref>], is not necessary. However, measurements that are wrong for technical reasons, such as values where the combine enters the harvest tracks, are assigned zero weight, which corresponds to removing them. The use of weights has the additional advantage that, contrary to all other filtering techniques in the literature mentioned above, there is no need to decide whether a measurement should be removed (weight 0) or not (weight 1). A measurement can be assigned any weight between 0 and 1, so the weights indicate how likely it is that the measurement is correct. In particular, there is no need to define limits for speed, moisture or swath width outside which the corresponding yield measurements should be discarded. The influence of a measurement on the yield map is greater, the greater its weight is. Only a measurement with weight zero has no influence at all, which corresponds to its being canceled.</p><p>The example in this paper uses a data set where the yields have been converted to dry matter yields. Time and header status were not measured. Therefore for every measurement the distance to preceding harvest paths was calculated. A distance that is considerably smaller than the cutting width of the combine indicates that the measurement cannot have come from a full swath, so it might be incorrect. The distance to preceding paths also downweights measurements because of end-path delays if the combine driver did not lift the header after harvesting a path. The GPS-points of the following low yield measurements are usually close to other harvest paths around the boundary of the field. 
I applied a smooth weighting method to reduce the effect of such dubious measurements.</p></sec><sec id="s2"><title>2. The Yield Mapping Method</title><sec id="s2_1"><title>2.1. Modeling Planes and Paraboloid Cones</title><p>A paraboloid cone (<xref ref-type="fig" rid="fig1">Figure 1</xref>) is modeled for determining a yield map value at <img src="4-8500013\562c381b-a9f9-43f8-a055-13b9018101b6.jpg" /> on a neighborhood around <img src="4-8500013\2a0bf110-713f-42db-93ba-a98723a33ef3.jpg" /> where only a measurement at <img src="4-8500013\32e1d4d9-0a29-4d95-982e-80066ff839e0.jpg" /> can have full weight. The weights of other points within the selected neighborhood decrease smoothly to zero at the border of the selected neighborhood. These weights are called local weights, whereas those for downweighting dubious values are global weights.</p><p>Fitting paraboloid cones is only meaningful if one assumes that the true unknown yield surface is smooth. From a mathematical point of view, it should be a twice continuously differentiable function of the Gauss-Kr&#252;ger coordinates. Fields where any rectangles were cut out, as often occurs in agricultural trials, are to be excluded from consideration.</p><p>But what is the true yield at a point of the field? Imagine a point as a 1 cm <img src="4-8500013\51b043da-e994-4e8f-9b69-b25ca4a01ad0.jpg" /> 1 cm square. One could say that if there is a wheat ear at this point, the yield (in Mg ha<sup>-1</sup>) is tremendously large (as the area of 1 cm<sup>2</sup> is so small), whereas the yield is zero if there is no ear. Such a yield map would have a huge microscale variation, and that is not desirable. It is more sensible to let the true yield denote the true average yield of a rectangle around this point, namely the rectangle that is harvested to measure its yield in the monitoring process. Thus, the true yield surface is continuous. 
It has no microscale variation because points close to each other refer to almost the same area.</p><p>Every continuously differentiable function defined on a two-dimensional area can be approximated by a skewed plane if it is confined to a small environment. A paraboloid cone, however, approximates a twice continuously differentiable function better as it can also take account of hilltops or depressions. This allows a neighborhood that is considerably larger. Paraboloid cones arise from approximating smooth functions by a two-dimensional Taylor series that includes only linear and quadratic terms. <xref ref-type="fig" rid="fig1">Figure 1</xref> shows some examples of what a paraboloid cone can look like.</p><p>The two-dimensional quadratic model for fitting paraboloid cones is</p><p><img src="4-8500013\f61f77e0-796d-4895-abf5-dc2cd88422f3.jpg" /></p><disp-formula id="scirp.3161-formula102485"><label>(1)</label><graphic position="anchor" xlink:href="4-8500013\b494e1ca-9eac-40e4-9cc1-dadcaf01eac0.jpg"  xlink:type="simple"/></disp-formula><p>where z<sub>i</sub> is the measured yield at the i<sup>th</sup> site<img src="4-8500013\9f001588-19c7-459b-a473-8eee2695442e.jpg" />. For reasons of numeric stability, the coordinates x<sub>i</sub> and y<sub>i</sub> should be the mean adjusted Gauss-Kr&#252;ger coordinates</p><disp-formula id="scirp.3161-formula102486"><label>(2)</label><graphic position="anchor" xlink:href="4-8500013\7127f5db-8c98-4ded-92fe-febde0f76904.jpg"  xlink:type="simple"/></disp-formula><p>where <img src="4-8500013\1f9efeae-0616-472f-9800-f9ea6cea4a56.jpg" /> and <img src="4-8500013\2d47397f-4ffc-43b5-b9f2-ce17b1e3124b.jpg" /> are the means of the Gauss-Kr&#252;ger coordinates <img src="4-8500013\67f40c68-d711-4f5a-809c-aa1fbc6a81bd.jpg" /> and <img src="4-8500013\c65a62fc-998e-4955-9158-c3b667bc71e1.jpg" /> over all points used for fitting the cone. 
The random variables <img src="4-8500013\5aeed3cb-d415-4f4b-8086-d85e47b9ec04.jpg" /> in (1) denote the errors, which should be around zero.</p><p>In extrapolation or where there are few valid measurements around a point<img src="4-8500013\2ce9486c-33c6-4503-b098-8cb899d82b34.jpg" />, the yield value should be obtained using a skewed plane rather than a paraboloid cone because the latter can lead to exaggerated values. A skewed plane is modeled by</p><disp-formula id="scirp.3161-formula102487"><label>(3)</label><graphic position="anchor" xlink:href="4-8500013\b23a8b5b-2aa4-49ea-aa9b-1019d3122873.jpg"  xlink:type="simple"/></disp-formula><p>Such a plane is horizontal if<img src="4-8500013\0db9dc27-7bcb-4de1-a72d-b3dda938bfed.jpg" />.</p></sec><sec id="s2_2"><title>2.2. Estimating the Coefficients of Regression by Redescending M-estimates</title><p>The regression coefficients are not estimated by classical least-squares because this method is not robust against outliers, which often occur in raw yield data. Instead, M-estimates are used.</p><sec id="s2_2_1"><title>2.2.1 The Concept of an M-estimate</title><p>To understand the concept of an M-estimate, consider first the simplest case of estimating a location parameter. A property of the mean <img src="4-8500013\8d1cdbb6-2574-4f3d-beae-bfb8145a924a.jpg" /> is that the sum of deviations from it is zero:<img src="4-8500013\86d09952-efd8-4955-b344-bb9fd4071953.jpg" />. This is demonstrated in <xref ref-type="fig" rid="fig2">Figure 2</xref>, where the bisecting line was moved back and forth until its deviations from the data points summed up to zero. Its intersection with the abscissa is the mean. <xref ref-type="fig" rid="fig2">Figure 2</xref> shows that the outlier to the right causes the mean to move to the right.</p><p>This effect can be avoided by bounding the bisecting line, so that the influence of any outlier is also bounded. 
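</p><p>As a minimal sketch of this idea, the following Python snippet compares the mean with a location M-estimate whose ψ-function is the bisector clipped at ±<italic>c</italic>. This clipped function is only a stand-in and is not the ψ-function defined in (8); the clipping constant <italic>c</italic>, the function names, the sample values and the bisection solver are illustrative assumptions, and no scale estimate is involved at this stage.</p><preformat preformat-type="code"><![CDATA[
```python
def psi_clipped(u, c):
    """Bounded psi-function: the bisector u, clipped to the interval [-c, c]."""
    return max(-c, min(c, u))


def location_m_estimate(data, c, tol=1e-10):
    """Solve sum(psi(x_i - t)) = 0 for t by bisection.

    The sum is non-increasing in t, non-negative at min(data) and
    non-positive at max(data), so a root lies in between.
    """
    lo, hi = min(data), max(data)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sum(psi_clipped(x - mid, c) for x in data) > 0:
            lo = mid  # root lies to the right of mid
        else:
            hi = mid  # root lies at or to the left of mid
    return 0.5 * (lo + hi)


# Hypothetical values with one gross outlier (illustrative only).
yields = [1.0, 2.0, 3.0, 4.0, 100.0]
mean = sum(yields) / len(yields)              # pulled to 22.0 by the outlier
robust = location_m_estimate(yields, c=1.5)   # stays close to 3.0
```
]]></preformat><p>With these made-up values the single outlier pulls the mean to 22, whereas the root of the bounded estimating equation stays near 3, within the bulk of the data.</p><p>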
The bounding is illustrated in <xref ref-type="fig" rid="fig3">Figure 3</xref>, where the bisector is replaced by a bounded function, ψ.</p><p>The influence of the outlier is reduced when solving <img src="4-8500013\52c0b389-c98b-48f0-bd9f-9785845b8648.jpg" />. The solution of this equation, <img src="4-8500013\070a51a8-0d5a-4183-8e08-6259860f1a0f.jpg" />, which is the intersection of the ψ-function and the abscissa, is called an M-estimate. <xref ref-type="fig" rid="fig3">Figure 3</xref> shows that if the largest value were shifted further to the right, the shift would have no influence on the estimate, whereas the smallest value, which is not an outlier, retains its influence. The latter would not apply if one computed a trimmed mean. An M-estimate whose ψ-function redescends to zero is called a redescending M-estimate; it removes the effect of large outliers completely.</p><p>Since the dispersion of the data is usually unknown, an M-estimate should be scale-invariant, which is the case when a scale estimate is incorporated into the equation to be solved. The M-estimate for the location parameter is then defined by <img src="4-8500013\24dd161a-a4e3-4715-8b7d-cb4da18c65de.jpg" />. The scale estimate <img src="4-8500013\85bc3d1f-fdd9-44a2-817b-fa5560fddb1f.jpg" /> can also be determined as an M-estimate for scale if one simultaneously solves an additional equation for <img src="4-8500013\e64aa106-e4b6-44c7-a0ed-3c7883cc36f0.jpg" />. This will be done in the following, where M-estimates are extended to a regression model.</p></sec><sec id="s2_2_2"><title>2.2.2. Weighted M-estimates in the Regression Model</title><p>The objective is to estimate simultaneously the regression coefficients <img src="4-8500013\32702d9f-a3a1-4bd6-a432-61773aa0407e.jpg" /> and the scale parameter <img src="4-8500013\4eb556d6-1c55-45de-a0e0-1e42b1c60217.jpg" /> in (1) and (3) by robust weighted M-estimators. 
Thus, <img src="4-8500013\418f3aab-91a2-4f58-b826-4d0e15befb0a.jpg" />equations have to be solved, where <img src="4-8500013\5ceba9d4-2c19-4fdd-a134-e2b99ae2ed44.jpg" /> is the number of regression coefficients (<img src="4-8500013\c0ac9554-9a30-43de-b0b0-9368c07eec0d.jpg" />if a skewed plane is modeled, <img src="4-8500013\c4992326-6bc1-44c5-a754-8ff673d9caab.jpg" />if a paraboloid cone is modeled). The solutions<img src="4-8500013\77b1a2a0-70b9-43ca-b0b1-da777e83549a.jpg" />, <img src="4-8500013\81606420-7a78-459c-b0ec-ceb6760a6c3e.jpg" />, and <img src="4-8500013\0eb1ba81-3efe-440d-868e-38511eba3c9d.jpg" /> are the M-estimates for the unknown parameters <img src="4-8500013\0aa943cd-6e87-439b-8def-63bf8cbdf8f5.jpg" /> and<img src="4-8500013\6062eae9-ceb8-4da4-9030-02973dec852e.jpg" />. Following Huber [<xref ref-type="bibr" rid="scirp.3161-ref12">12</xref>] with slight changes for the scale estimate, the following system of equations is solved:</p><p><img src="4-8500013\b154ffa7-0155-4a1f-9ee7-20d1f04481ad.jpg" />(4)</p><p>where the three equations of the right column in (4) are omitted for skewed planes (<img src="4-8500013\13cb0f8a-03f5-420b-945f-21060fc992e8.jpg" />), and the residuals <img src="4-8500013\5599d573-12a0-4649-8e67-d37338e72fd1.jpg" /> result as:</p><p><img src="4-8500013\a159a18f-c0a0-465d-9821-bcc7b88a5829.jpg" />(5)</p><p>the weights <img src="4-8500013\c664de89-e6e5-48a3-84fe-7bea96e49345.jpg" /> will be defined in (23). The number</p><disp-formula id="scirp.3161-formula102488"><label>(6)</label><graphic position="anchor" xlink:href="4-8500013\1e913197-ac44-4709-9e1f-fed7e6e1826f.jpg"  xlink:type="simple"/></disp-formula><p>(Satterthwaite [<xref ref-type="bibr" rid="scirp.3161-ref13">13</xref>]) in the last line of (4) refers to them. I call it the effective number of data points. The idea behind this can simply be explained for the mean as a special case of linear regression. 
The variance of the mean of independent random variables <img src="4-8500013\14c53e3d-fe91-4e2d-a6a1-66f8b92d4215.jpg" /> with identical variance<img src="4-8500013\5e092e73-fa34-433a-9089-dda156d5ecff.jpg" />is given by<img src="4-8500013\ae3c1546-2bcb-4306-b540-7d7764c27c62.jpg" />, but the variance of a weighted mean,<img src="4-8500013\7c02831a-af9c-4b28-a8ce-e6f305ea11b3.jpg" /> , where</p><disp-formula id="scirp.3161-formula102489"><label>(7)</label><graphic position="anchor" xlink:href="4-8500013\0f5c79cc-8514-4933-a682-0a32e0992a6b.jpg"  xlink:type="simple"/></disp-formula><p>results in<img src="4-8500013\b1919bfe-9eaa-4484-965f-881b6cc57a1d.jpg" />. Thus, <img src="4-8500013\7d3bf6d6-23ee-4d92-8b01-0840e2c61e93.jpg" />data weighted according to <img src="4-8500013\a83f1935-bae5-4ca9-94b7-62b0b097f6c7.jpg" /> lead to the same variance reduction as <img src="4-8500013\0ef4f701-55c5-4a8d-8b97-b02072cf6587.jpg" /> unweighted data. If all weights were equal,<img src="4-8500013\6e454cbf-7485-4a87-97a1-ad39cc2c1afc.jpg" />. The effective number <img src="4-8500013\5ff1b197-1d31-41cb-b379-5a37df4ba068.jpg" /> is greater the more the weights <img src="4-8500013\75ba6138-b3ca-44e5-8060-2b1b02b2ff55.jpg" /> equal each other. It is never greater than<img src="4-8500013\05846a8e-595f-44e5-bbe7-503f3478fa58.jpg" />. Data points with a low weight, <img src="4-8500013\97405bd5-b5ad-4b13-bf6a-ecf6a59547f3.jpg" />, such as those close to the border of the selected neighborhood, barely enlarge<img src="4-8500013\87b9e179-944c-488a-874e-a895297f0d30.jpg" />, unless there are no data points close to the point to be mapped.</p><p>The weighted M-estimates used for the regression and</p><p>coefficients, <img src="4-8500013\fc0cef68-ffab-4b3b-9362-0592155e8119.jpg" />, are based on the <img src="4-8500013\f5b59073-9613-4eb1-bd13-fe3c9a1ba6ed.jpg" />-function in (8) and the <img src="4-8500013\1ed615a2-fa7d-494d-954f-ebf0636c6d2d.jpg" />-function in (9). 
These functions are illustrated in <xref ref-type="fig" rid="fig4">Figure 4</xref> and <xref ref-type="fig" rid="fig5">Figure 5</xref>. Both <img src="4-8500013\8ba164c2-2015-45ef-b3bd-ba235d666b0e.jpg" /> and <img src="4-8500013\a0e7397f-5aaf-4225-b517-1ab31caf4d8d.jpg" /> consist of lines and parabolae only. <img src="4-8500013\24ec85f2-9489-400f-8d3c-06fec2a0692f.jpg" /> is defined as</p><p><img src="4-8500013\f8ce6e7d-926a-4f18-9d05-ce541e1ef680.jpg" />(8)</p><p>and <img src="4-8500013\510eb0da-ea99-4f6e-b8c5-78f805309bfb.jpg" /> is defined as</p><disp-formula id="scirp.3161-formula102490"><label>(9)</label><graphic position="anchor" xlink:href="4-8500013\da9ec0ae-ce2f-41a5-82b2-20f43b581baf.jpg"  xlink:type="simple"/></disp-formula><p>where</p><p><img src="4-8500013\7ab74231-cc23-492e-9ea6-d5e8122f6716.jpg" />(10)</p><p><img src="4-8500013\29e1ac06-89a6-411d-b8e7-4103b8cbb91c.jpg" /> is the expected value of <img src="4-8500013\f7f7964c-d362-484a-88ec-a290bcae21d6.jpg" /> for an <img src="4-8500013\d6432789-d161-4c94-bded-dc66ca34be9a.jpg" /> distributed random variable <img src="4-8500013\3bb34d6b-6310-4fda-90e9-3d8379bea9a7.jpg" />, thus providing the scale parameter <img src="4-8500013\700a0fda-ed21-458d-a2b9-c682a4ca1918.jpg" /> for the standard normal distribution.</p></sec><sec id="s2_2_3"><title>2.2.3 An Iterative Procedure to Obtain Regression Coefficients and Scale Estimate</title><p>To obtain the regression coefficients <img src="4-8500013\edcdca96-da88-4954-846c-8d561dc18517.jpg" />, <img src="4-8500013\6f94c849-e117-4214-a500-c66c0a64e998.jpg" />, and the scale estimate <img src="4-8500013\f445566d-3174-4a31-b227-1f75fc8702b0.jpg" />, the equation system in (4) is solved in the following three successive steps:</p></sec><sec id="s2_2_4"><title>First step: starting values</title><p>The starting value for the column vector of regression coefficients is obtained with the method of weighted least-squares:</p><disp-formula 
id="scirp.3161-formula102491"><label>(11)</label><graphic position="anchor" xlink:href="4-8500013\811e6d31-fcd4-4243-bdb6-d77527dbfb1b.jpg"  xlink:type="simple"/></disp-formula><p>where the <img src="4-8500013\3026da17-4157-436e-a835-d8f2c7a05682.jpg" />-matrix</p><disp-formula id="scirp.3161-formula102492"><label>(12)</label><graphic position="anchor" xlink:href="4-8500013\5f165a80-cdfc-422b-9b09-8943a6b8d233.jpg"  xlink:type="simple"/></disp-formula><p>is the design matrix whose i<sup>th</sup> line (<img src="4-8500013\84d4258f-e211-44e7-ba2e-e1ba3b6711db.jpg" />) consists of the line vector.</p><p><img src="4-8500013\578126c8-742d-4089-b584-d1991f6b223c.jpg" />(13)</p><p>which is based on the mean adjusted Gauss-Kr&#252;ger coordinates x<sub>i</sub> and y<sub>i</sub> in (2). The weight matrix W = diag(w) is a diagonal matrix, i.e. <img src="4-8500013\7c1dd8d6-cfe0-42aa-b531-31e10093042a.jpg" />and <img src="4-8500013\69f16334-a6d5-4262-baf7-cd638d5b252a.jpg" /> for<img src="4-8500013\890b81a7-a89e-493b-b756-7fb577ef317b.jpg" />.</p><p>The starting value for the scale solution is</p><disp-formula id="scirp.3161-formula102493"><label>(14)</label><graphic position="anchor" xlink:href="4-8500013\49d914f5-aa5a-4a1f-9f5c-7eb64e1ede99.jpg"  xlink:type="simple"/></disp-formula><p>with <img src="4-8500013\ada39d6f-01f1-4ea8-888b-5453389520fd.jpg" /> according to (7).<img src="4-8500013\b998df39-4888-42ea-a37a-bcc20bc85043.jpg" />, which has been defined in (6), should be considerably greater than<img src="4-8500013\8bb76241-8590-4e6f-ba79-e6b9dceb1d18.jpg" />. Since <img src="4-8500013\905c1ae6-7f4c-4cb5-98b3-5d4ae0676bf7.jpg" /> is based on squared deviations, it is subject to outliers, so that the robustly defined <img src="4-8500013\9cb9454d-56ec-4722-aab6-8f659c795129.jpg" /> might be overestimated. 
However, large starting values for scale should be preferred as they counteract the danger of scale implosion (Huber [<xref ref-type="bibr" rid="scirp.3161-ref14">14</xref>]).</p></sec><sec id="s2_2_5"><title>Second step: iteration based on <img src="4-8500013\1a958e9e-7a8c-47e9-b62a-0ef586ab4533.jpg" /> and <img src="4-8500013\598f67b2-60fb-4bb3-9601-47b0dc3f7c01.jpg" /></title><p>The classical starting values obtained in the first step, <img src="4-8500013\ec017166-e3e5-4ef6-9e45-4cb6ba926b47.jpg" />in (11) and <img src="4-8500013\01e912cf-67d5-4a6c-8270-927502e2d6c4.jpg" /> in (14), serve as the results of the iteration <img src="4-8500013\bef09696-5219-4a63-a319-81918705fafe.jpg" /> in the second step.</p><p>To avoid solutions that could abduct us from the bulk of the data, redescending M-estimates are not yet applied in the second step when running the iteration procedure, which will be described soon. Instead, the <img src="4-8500013\b28e3ad9-6340-41a4-8daa-2e3cf53cc109.jpg" />-function <img src="4-8500013\d585568f-7015-4599-bfd1-8cefa74cc177.jpg" /> in (8) is made monotone by replacing its redescending parts by horizontal lines:</p><disp-formula id="scirp.3161-formula102494"><label>(15)</label><graphic position="anchor" xlink:href="4-8500013\006c9707-6f3b-4b05-9e69-b58d07d2cb8d.jpg"  xlink:type="simple"/></disp-formula><p>The<img src="4-8500013\93d79d46-06e8-4df0-895b-4c63b47b5ef2.jpg" />-function remains unchanged, as it is already monotone in<img src="4-8500013\68d1ae3a-5376-4c3d-a1ab-fc7d79a26b8f.jpg" />.</p></sec><sec id="s2_2_6"><title>Third step: iteration based on <img src="4-8500013\dc4dd389-caab-4a5c-870c-05fd97a03d31.jpg" /> and <img src="4-8500013\b33bfd94-9be0-493d-9160-03b78ecbe016.jpg" /></title><p>Finally, the same iteration procedure is run again, using the results of the second step as starting values (iteration<img src="4-8500013\072e84f5-6820-4340-8793-e39b2bb560c6.jpg" />) and applying the redescending <img 
src="4-8500013\e294111a-afb9-4f39-a080-57cfaba6189e.jpg" />-function <img src="4-8500013\6afb4f30-a3b7-4a8f-a451-70c89f527550.jpg" /> in (8) instead of the monotone one in (15). The results of the last iteration, <img src="4-8500013\f082fa11-a51d-4c2e-b39b-3eb3ab10bb81.jpg" />and<img src="4-8500013\2666b98b-5d7b-4ba6-9fbd-6e40ee0d43f1.jpg" />, are the numerical solutions,<img src="4-8500013\09343cf5-16a5-4b3d-bc9d-87030d7d2f2c.jpg" /> and<img src="4-8500013\ec27350e-9d7b-454a-8d5a-72d57d96dbff.jpg" />, of the equation system in (4).</p></sec><sec id="s2_2_7"><title>The iteration procedure</title><p>The iteration procedure follows Huber [<xref ref-type="bibr" rid="scirp.3161-ref14">14</xref>] or Huber [<xref ref-type="bibr" rid="scirp.3161-ref12">12</xref>], Section 7.8. However, contrary to Huber, the M-estimates<img src="4-8500013\5dfc3026-b94c-49d8-843c-5fbf30375cf5.jpg" />,<img src="4-8500013\9d25d384-87ec-43b2-ad21-a0ad9bd6c9ce.jpg" /> , and <img src="4-8500013\55460b9e-b72a-464e-9ced-293850d9f4a1.jpg" /> in (4) are weighted, and the definition of <img src="4-8500013\383abc71-3bc6-4c74-9ef2-eed7f30dab01.jpg" /> slightly differs from that of Huber. 
Therefore, the procedure needs to be changed accordingly.</p><p>It suffices to describe the (m + 1)<sup>th</sup> iteration on the basis of the results of the m<sup>th</sup> iteration (<img src="4-8500013\dd2a498d-f7ac-4b0e-bf9b-7823ad2d4f17.jpg" />):</p><p>Based on the vector of regression coefficients <img src="4-8500013\18c947b5-fa12-4d60-9bf1-f2d4f14e0b32.jpg" /> of the m<sup>th</sup> iteration, the residuals are computed first:</p><disp-formula id="scirp.3161-formula102495"><label>(16)</label><graphic position="anchor" xlink:href="4-8500013\5b228f6a-a1a9-4f6e-896b-234977d7ad44.jpg"  xlink:type="simple"/></disp-formula><p>Then the <img src="4-8500013\e8c9bebe-dde9-4d5d-bd36-c5bd0bfc6548.jpg" />-estimate of iteration <img src="4-8500013\b3b05eb1-effe-4e2f-a41f-b670016795c3.jpg" /> is calculated:</p><p><img src="4-8500013\30e0c2ac-075e-43e2-bd93-35f566123f68.jpg" />(17)</p><p>with <img src="4-8500013\414f1f0f-25f7-42c1-9853-316dc57cd6b2.jpg" /> according to (7). Now, the residuals <img src="4-8500013\e7eb7049-2d4d-46ae-a1d0-b2e214ad66da.jpg" /> are used to compute the so-called Winsorized residuals:</p><disp-formula id="scirp.3161-formula102496"><label>(18)</label><graphic position="anchor" xlink:href="4-8500013\ee247670-842a-48d7-b273-0a91533c1121.jpg"  xlink:type="simple"/></disp-formula><p>where <img src="4-8500013\28fa92a1-2ea7-4b10-a200-36a0a8daa64c.jpg" /> in (15) if the iteration refers to the second step, and <img src="4-8500013\e6d7ee9f-1ee4-439d-b3e1-bfc13716cc39.jpg" /> in (8) if it concerns the third step.</p><p>Using the vector of these Winsorized residuals, <img src="4-8500013\5635441f-d784-4e23-a7a0-ccfd95351928.jpg" />, the difference vector</p><disp-formula id="scirp.3161-formula102497"><label>(19)</label><graphic position="anchor" xlink:href="4-8500013\54c4670f-66a4-4d8c-bc2f-9ff6bf307a14.jpg"  xlink:type="simple"/></disp-formula><p>is calculated by weighted least-squares and used to compute the new vector of coefficients of 
regression:</p><disp-formula id="scirp.3161-formula102498"><label>(20)</label><graphic position="anchor" xlink:href="4-8500013\9679d76d-5fbf-4642-b975-e8fbb82debde.jpg"  xlink:type="simple"/></disp-formula><p>According to Huber [<xref ref-type="bibr" rid="scirp.3161-ref12">12</xref>], p. 183, theoretical considerations suggest<img src="4-8500013\9b469ed1-3cd2-4df7-84ac-d08eab5a7cdb.jpg" />. For<img src="4-8500013\38b8169a-f094-4ccf-832a-3847be12b7d4.jpg" /> distributed<img src="4-8500013\7fa66bf3-515d-4df8-96d0-5fbb38577b31.jpg" />, this yields <img src="4-8500013\ecfdcbb2-affc-483a-a87d-908c311b98e7.jpg" /> if<img src="4-8500013\f496690f-fcdc-48b5-bb81-53f27bc17ec8.jpg" />, and <img src="4-8500013\1a2e4f27-0987-4ee6-958a-6493e6fb5c56.jpg" /> if<img src="4-8500013\cdb0caf7-2d57-4f2b-b94f-0710baaa87e3.jpg" />. Empirical results showed that the factor <img src="4-8500013\814c2842-1e20-4ae2-a2ac-fb926e4d1dca.jpg" /> could be relaxed, but only because the <img src="4-8500013\f2869cec-8ec5-4c2a-bcee-386c5fbc6c87.jpg" />-functions used are normed such that <img src="4-8500013\c759627e-0b3d-4c29-be81-9ebf1dccc677.jpg" /> on a large area around 0.</p><p>The iteration procedure ends if</p><p><img src="4-8500013\e4a48e51-fea8-44ca-97e1-c96ff364c701.jpg" /></p><disp-formula id="scirp.3161-formula102499"><label>(21)</label><graphic position="anchor" xlink:href="4-8500013\41eb6f23-cf77-4f2d-8376-402c22444260.jpg"  xlink:type="simple"/></disp-formula><p>for a small<img src="4-8500013\57b7c79b-db32-418c-9c00-1921b4eda8e8.jpg" />, e.g., <img src="4-8500013\b5727efc-33f9-452f-80fb-3d6d0c5ec9ac.jpg" />in the second and <img src="4-8500013\fddebf03-ac2f-43bf-98cf-7c72b0ce47dc.jpg" /> in the third step; <img src="4-8500013\860ec059-06e9-4e8a-b4db-8561d52a34ab.jpg" />is the j <sup>th</sup> diagonal element of<img src="4-8500013\618bd61b-05fe-4b62-82ab-a2876723244e.jpg" />.</p></sec></sec><sec id="s2_3"><title>2.3. Weights</title><sec id="s2_3_1"><title>2.3.1. 
Global weights</title><p>The most important reason for omitting yield monitor data concerns start-path delays (see e.g. Thyl&#233;n et al. [<xref ref-type="bibr" rid="scirp.3161-ref7">7</xref>], Simbahan et al. [<xref ref-type="bibr" rid="scirp.3161-ref2">2</xref>]). The first <img src="4-8500013\07c1c229-8367-477d-9a6c-5d0ef9e1042b.jpg" /> measurements are given the global weight zero, which discards them; the (m + 1)<sup>th</sup> measured value is downweighted by a factor of 0.5 because of uncertainty as to whether this value is correct. The other values are then weighted fully in the context of delays. The size of <img src="4-8500013\6aa8c259-3a26-4e2c-a0c7-ddde637ae24b.jpg" /> depends on the yield monitor. For the Claas Agrocom Quantimeter, which was used in the trial described here, <img src="4-8500013\5f3fd3e8-b946-4747-8d11-5b49e75c3870.jpg" />proved adequate.</p><p>Another reason for downweighting values is if the GPS points of the current swath are close to a preceding neighboring swath. Such a small distance might arise from GPS errors or inaccuracies, but it can also be the consequence of a smaller current swath width. If the minimum distance to the preceding harvest tracks,<img src="4-8500013\6ca88d23-b717-4f20-87a8-664c9bf90e35.jpg" /> , is small, it is less likely that the measured yield comes from a full swath width. At present, a raw yield value is weighted fully (weight 1) only if<img src="4-8500013\a77b91d6-bfaa-426b-8869-4677d724204d.jpg" />, whereby sw means the “full” swath width, which is somewhat smaller than the combine’s cutting width, cw, when the combine is steered by a man. If<img src="4-8500013\22f9882d-1dda-4b3b-8a3c-cd1a00f040ed.jpg" />, the raw yield value gets weight zero. The weights in between are obtained by linear interpolation. 
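</p><p>The two weight factors described so far can be sketched in Python as follows. The function and parameter names are hypothetical, and the end points of the interpolation interval, <monospace>d_zero</monospace> and <monospace>d_full</monospace>, are left as arguments because this sketch does not fix their values.</p><preformat preformat-type="code"><![CDATA[
```python
def start_delay_factor(position, m):
    """Weight factor for start-path delays.

    position is the 1-based index of a measurement within its harvest
    track; the first m values get weight 0 and the (m + 1)th gets 0.5.
    """
    if position <= m:
        return 0.0
    if position == m + 1:
        return 0.5
    return 1.0


def swath_distance_factor(d_min, d_zero, d_full):
    """Weight factor from the minimum distance to preceding tracks.

    Weight 0 up to d_zero, weight 1 from d_full on, and linear
    interpolation in between. Both thresholds are placeholders.
    """
    if d_full <= d_zero:
        raise ValueError("d_full must be greater than d_zero")
    if d_min <= d_zero:
        return 0.0
    if d_min >= d_full:
        return 1.0
    return (d_min - d_zero) / (d_full - d_zero)
```
]]></preformat><p>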
This weighting is illustrated in <xref ref-type="fig" rid="fig6">Figure 6</xref>.</p><p>In Bachmaier [<xref ref-type="bibr" rid="scirp.3161-ref1">1</xref>], the interpolation interval was <img src="4-8500013\86f69c2a-f89e-4d2e-a3ce-94b3ca6815bf.jpg" /> instead of <img src="4-8500013\2d873a9b-e9b8-461f-9cfb-1257155fd714.jpg" />, a choice influenced by researchers who discarded yield measurements with <img src="4-8500013\53bdbb09-a03d-4746-b4b1-6ee18dae4fa7.jpg" />. I no longer defend this because the positioning system’s accuracy, which is the main source of variation of <img src="4-8500013\b146967d-75f0-45fa-a4fa-74943598435a.jpg" />, does not depend on the combine’s cutting width, cw, to which the full width of a swath, sw, is closely related. However, if positioning errors can be assumed to be approximately normally distributed, the left half of a Gaussian bell curve, <img src="4-8500013\1df8c9e8-697b-4521-a88f-5cd641855245.jpg" />, could give a still more suitable downweighting function for <img src="4-8500013\1f87b51a-d086-4257-8e55-a419ebf92bc0.jpg" />. For <img src="4-8500013\41c527d0-b66d-4282-ad19-20e98385bf7e.jpg" />, it comes close to the function in <xref ref-type="fig" rid="fig6">Figure 6</xref>.</p><p>The global weight, <img src="4-8500013\9a350146-0502-4738-b9d9-bdbb1bcd65af.jpg" />, is the product of the single weight factors. For example, if the measurement is the <img src="4-8500013\41f4f20b-d7e5-4e3f-82dd-f87d08c9167d.jpg" /> value of a harvest track, it is downweighted by a factor of 0.5, as mentioned above, and if its minimal distance from neighboring tracks, <img src="4-8500013\16cb6d4a-a442-4d37-99e9-2edb5e554fd4.jpg" />, is <img src="4-8500013\93e0181d-0272-4254-8bae-218a30a193db.jpg" />, it results in a weight factor of 0.75 (<xref ref-type="fig" rid="fig6">Figure 6</xref>). 
The yield at this point is then given the global weight <img src="4-8500013\b46242dc-4dce-481f-bb97-37d9157dc90f.jpg" />.</p><p>Other sources of error that result in dubious measurements, e.g. unusual values for moisture or speed, could be dealt with similarly. The greater the likelihood that a measurement is erroneous, the lower its weight.</p></sec><sec id="s2_3_2"><title>2.3.2. Local Weights Based on an Elliptical Distance from <img src="4-8500013\511990fd-9fa0-4373-b0a6-59dcd1a913a1.jpg" /></title><p>To estimate the regression coefficients for the paraboloid cone or the plane at the point <img src="4-8500013\2710a019-605b-455c-b731-a6d51ac1a0a1.jpg" />, all points <img src="4-8500013\49fcecc0-b493-4517-9395-55f3692fcdf3.jpg" /> within an elliptical neighborhood (see <xref ref-type="fig" rid="fig7">Figure 7</xref>) are considered. The local weight is 1 only at <img src="4-8500013\0d53246a-47f7-4646-8467-e095b22cc70f.jpg" />, whereas it is zero at the boundary of the selected neighborhood. The transition in between is obtained by quadratic interpolation, because a local deviation corresponds to a position error, and errors are usually considered quadratically. In particular, the local weight function is defined as</p><disp-formula id="scirp.3161-formula102500"><label>(22)</label><graphic position="anchor" xlink:href="4-8500013\6a2eed53-be77-4eb9-8da7-76c0ca5558ff.jpg"  xlink:type="simple"/></disp-formula><p>where <img src="4-8500013\27fec3c4-5bab-42dd-b50b-53410b6f19f5.jpg" /> denotes the radius of the neighborhood across the tracks and <img src="4-8500013\2373b451-ce6e-4171-8046-cdce35130e87.jpg" /> is a distance measure between <img src="4-8500013\e92723bf-aa3e-4687-b975-0955980af4c5.jpg" /> and <img src="4-8500013\4f438944-b2c1-4e62-bf8a-235c4731226a.jpg" /> on the basis of an elliptical metric, which will be defined in (24). 
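</p><p>A minimal Python sketch of such a local weight is given here under stated assumptions. The quadratic transition is taken as 1 − (d/r)², which is 1 at the center, 0 at the boundary and decreases more slowly than linearly; the elliptical distance is assumed to stretch the along-track component by the radius ratio q (across-track radius divided by along-track radius) before the Euclidean length is taken. Both forms and all names are illustrative; the authoritative definitions are (22) and (24).</p><preformat><![CDATA[
```python
import math


def elliptical_distance(d_along, d_across, q):
    """Assumed elliptical metric.

    d_along and d_across are the components of the difference vector
    along and across the harvest tracks; q is the ratio of the
    across-track radius to the along-track radius of the ellipse.
    """
    return math.hypot(q * d_along, d_across)


def local_weight(d, r):
    """Assumed quadratic local weight: 1 at d = 0, 0 for d >= r.

    r is the radius of the neighborhood across the tracks.
    """
    if r <= 0:
        raise ValueError("r must be positive")
    if d >= r:
        return 0.0
    return 1.0 - (d / r) ** 2
```
]]></preformat><p>With q = 2 and r = 10, for instance, a point 5 units away along the track and a point 10 units away across the tracks both lie on the boundary of the neighborhood and receive the local weight zero.</p><p>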
The weights <img src="4-8500013\5916bd0f-9ad5-4bcf-8208-c2f52f9ec161.jpg" /> are local because they depend on the distance from the fixed point<img src="4-8500013\d32f37a9-2ff2-4ece-947d-a52046148e2d.jpg" />, and they change if this point changes.</p><p>The global weights,<img src="4-8500013\d0e68a00-7fb9-4088-a019-89c3c2e46728.jpg" /> , do not depend on<img src="4-8500013\0c5d1100-1073-4ddd-ad1f-fc624f019246.jpg" />. The final weights,<img src="4-8500013\a9830947-2c6e-4014-a28f-baa05321dee8.jpg" /> , of the yield measurements at the points <img src="4-8500013\bceee27f-d31d-4820-a780-c7393e58bda7.jpg" /> within the neighborhood for determining the regression coefficients in (4) are the products of the global and local weights, so they also depend on<img src="4-8500013\f321e247-cfe3-4d18-afe4-e46068aae993.jpg" />:</p><disp-formula id="scirp.3161-formula102501"><label>(23)</label><graphic position="anchor" xlink:href="4-8500013\4b49d1d9-d748-4b7b-95cb-91cef377974e.jpg"  xlink:type="simple"/></disp-formula><p>In Bachmaier [<xref ref-type="bibr" rid="scirp.3161-ref1">1</xref>] I used different local weight functions, <img src="4-8500013\b17d1996-54a1-47f5-b09c-2e467f7788ea.jpg" />and<img src="4-8500013\a4ec824c-1ca0-471a-9c6a-b20cefd3316a.jpg" />, and thus, different weights, <img src="4-8500013\3d65be42-e0aa-4ee7-8dd3-1fc6f498c9b7.jpg" />and<img src="4-8500013\3da09aa0-88f5-4690-b3f3-de81cd61e960.jpg" />, in the equation system in (4), where the former served to estimate the regression coefficients <img src="4-8500013\352152b1-6187-4355-b82c-43279fdef036.jpg" /> by means of the <img src="4-8500013\9956c8c4-07a0-4000-bf36-6bfbef08f579.jpg" />-function, and the latter were used to estimate <img src="4-8500013\760bde16-e921-43d3-a82a-4beaf253201b.jpg" /> by means of the <img src="4-8500013\b0005f9d-0642-4f24-8e1f-8400f549a8eb.jpg" />-function. 
The weight function <img src="4-8500013\8ef89f3a-3fdf-44a1-ac15-604713f83715.jpg" /> decreased linearly from 1 to 0, whereas <img src="4-8500013\86df1417-0ca1-42da-890d-1191b875f008.jpg" /> downweighted more slowly (with the fourth power) to increase the effective number of data points with respect to the scale estimate, because higher moments (the scale estimate is a second moment) require more data to be estimated efficiently. However, since the weights <img src="4-8500013\6ba5b307-9329-4ac2-bccc-552351948d8a.jpg" />, which were used to estimate the regression coefficients, decreased faster to 0, the paraboloid cone was less forced to adapt to yield values close to the neighborhood’s border, and hence larger residuals could arise that distort the scale estimate. Therefore, and for the sake of simplicity, I now use the same local weight function, the quadratically decreasing <img src="4-8500013\a67feb47-4869-4912-a05c-5addaea6e368.jpg" /> in (22) (<xref ref-type="fig" rid="fig7">Figure 7</xref>), and thus the same weights <img src="4-8500013\bfafa377-8657-4eff-adfd-4bac861bb8cf.jpg" /> for estimating <img src="4-8500013\4e45a239-bdb2-4c79-a4ba-f7afc8c89c3b.jpg" />, <img src="4-8500013\8eeaa897-5f0b-496b-aa57-6048fbab3dc6.jpg" />, and <img src="4-8500013\bf754894-d968-4cd6-bbf3-5534d8459207.jpg" />.</p></sec></sec><sec id="s2_4"><title>2.4. An Elliptical Neighborhood</title><p>Raw yield data often show surface textures along the harvest tracks that remain when the data are amended. In Bachmaier and Auernhammer [<xref ref-type="bibr" rid="scirp.3161-ref15">15</xref>] I showed that ordinary kriging of amended data did not smooth out the harvest tracks sufficiently. Therefore, the neighborhood on which the paraboloid yield surface is fitted should be wider across the harvest tracks than it is along them. 
This can be achieved by an elliptical neighborhood (<xref ref-type="fig" rid="fig8">Figure 8</xref>).</p><p>The butterfly neighborhood I used in Bachmaier [<xref ref-type="bibr" rid="scirp.3161-ref1">1</xref>] is more appropriate as a filtering technique (Bachmaier and Auernhammer [<xref ref-type="bibr" rid="scirp.3161-ref11">11</xref>]), because its length along the tracks is a little shorter at the current track than it is at the two adjacent tracks, and the assessment of a yield value (correct or not) should be less affected by values of the same track; they could be erroneous as well, for example if the swath does not have its full width.</p><p>A single swath of wrong values can also distort a yield estimate if its influence on the paraboloid regression is too large. To avoid this, the neighborhood must be chosen large enough, and the tracks in the middle should have nearly the same sum of weights <img src="4-8500013\e95be60e-bd7e-4627-8976-628a3864ecaf.jpg" />. This is a further reason why the chosen local weight function <img src="4-8500013\8debcbb0-bfef-4809-b009-269f27a45411.jpg" /> defined on an elliptical neighborhood downweights more slowly than linearly.</p><p>The ‘elliptical metric’, <img src="4-8500013\d9612a48-ac1b-499f-8ac1-704676754d08.jpg" />, is based on the ratio, <img src="4-8500013\83f9a5d5-7322-4c4b-b936-353ac88e5b43.jpg" />, of the ‘radius’ (half diameter) of the ellipse across the harvest track to the ‘radius’ along it:</p><disp-formula id="scirp.3161-formula102502"><label>(24)</label><graphic position="anchor" xlink:href="4-8500013\d9e6c2a0-cb6e-4f3c-a3a1-dceb4d9b613f.jpg"  xlink:type="simple"/></disp-formula><p>The components of the difference vector <img src="4-8500013\bb029f08-11d0-41d5-b884-0a6b5178ae2e.jpg" /> along and across the harvest tracks, <img src="4-8500013\26662616-7001-45df-9222-2e0886e0d417.jpg" /> and <img src="4-8500013\1d87ac86-3690-4faf-95b5-d9458b6aa9d7.jpg" />, can be calculated from neighboring GPS points on the same 
harvest tracks as follows:</p><p><img src="4-8500013\14eeacf8-1f94-489a-bc56-4c2eaf31c7a2.jpg" /></p><p><img src="4-8500013\75a8611c-87be-4d83-93a2-7a8862653853.jpg" /> (25)</p></sec><sec id="s2_5"><title>2.5. Rules for Deciding on the Model and the Elliptical Neighborhood’s Size and Shape</title><sec id="s2_5_1"><title>2.5.1. Advantages and Disadvantages of Modeling Planes and Paraboloid Cones</title><p>The main advantage of modeling paraboloid cones over planes relates to yield peaks or depressions (see <xref ref-type="fig" rid="fig1">Figure 1</xref>). If the yield is fitted by planes, a depression is overestimated and a peak is underestimated, whereas a paraboloid cone fits optimally provided that the neighborhood is not too large. Nevertheless, modeling paraboloid cones can lead to exaggerated results in the case of extrapolation. This can occur at the beginning of a swath and along the edges of a field, and extrapolation can be based on points far away from <img src="4-8500013\c5b5f0ce-1cc4-45c5-b828-239c1eaf61bc.jpg" />. Therefore, where the extrapolation extends over a large distance or where there are few measurements near <img src="4-8500013\2241e130-580b-452e-b9e6-b9ebbadba4a5.jpg" />, the skewed plane should be used.</p></sec><sec id="s2_5_2"><title>2.5.2. The Measure <img src="4-8500013\e627ff94-9db8-41bb-9631-11f4c1efe2b4.jpg" /></title><p>The choice of model can be based on the ratio of the sum of weights within a narrower neighborhood around <img src="4-8500013\2be6dde8-cbb4-4af8-b0cc-97845372ebd3.jpg" /> to the sum of weights within the entire neighborhood. A small ratio indicates that there are only a few valid measurements close to <img src="4-8500013\55c20fa7-1dcc-49e3-86b1-775f42213073.jpg" />, so the desired fit is determined mainly by extrapolation. 
The paraboloid cone should be used only if this ratio exceeds a certain value.</p><p>The considerations that lead to the choice of an elliptical neighborhood also depend on a sufficiently large share of valid measurements within the nearer environment. If this share is too small, a circular neighborhood should be used.</p><p>To ensure that the ratio changes smoothly, a function for a degree of membership in the narrower environment is used, which is 1 at the point <img src="4-8500013\dbbe2104-a10e-4cb1-87b3-d8ff9b0d2993.jpg" />. Like the local weight function <img src="4-8500013\6b73ad5a-54f9-4989-8f86-be51c91b2286.jpg" /> in (22), it decreases quadratically to zero at the boundary of the neighborhood because the effect of a paraboloid extrapolation can also increase quadratically with the distance. This leads to the definition of the following ratio:</p><disp-formula id="scirp.3161-formula102503"><label>(26)</label><graphic position="anchor" xlink:href="4-8500013\2fcc46b2-d002-42b1-97c0-171348cd01dc.jpg"  xlink:type="simple"/></disp-formula><p>The sums relate to the weights of all points <img src="4-8500013\2fe6df2c-1bfc-43d2-af3d-640009514f17.jpg" /> within an initial elliptical neighborhood of <img src="4-8500013\3edf5b78-8699-436e-a341-8e1bf3d09064.jpg" />. An increase of its size according to the rule in (30) does not involve a new computation of <img src="4-8500013\949d6a4b-f8be-4ab9-8daf-09002eef384c.jpg" />.</p><p>To form an idea of the limits at which to model paraboloid yield surfaces or at which to use an elliptical neighborhood, the ratio of the expected values of the numerator and the denominator of <img src="4-8500013\cbd0c267-730d-4ada-aae7-25792319ab56.jpg" /> is helpful. 
Under the assumption of continuously and uniformly dispersed data points with global weight <img src="4-8500013\f2daac03-edb3-4d7c-bee2-69ed0759391b.jpg" />, this ratio does not depend on the radii of the ellipse, so it can be calculated for a circle with radius 1:</p><disp-formula id="scirp.3161-formula102504"><label>(27)</label><graphic position="anchor" xlink:href="4-8500013\b86bfe57-b3f0-417f-bcef-3a95c92d3994.jpg"  xlink:type="simple"/></disp-formula></sec><sec id="s2_5_3"><title>2.5.3. Determining the Proportion of the Radii of the Elliptical Neighborhood</title><p>Since yield monitor data often show surface textures along the harvest tracks, an elliptical neighborhood that is wider across the tracks than it is along them is preferred. The determination of an adequate radius proportion, <img src="4-8500013\f92bb7ac-a83b-4c6c-9f43-8a0d96555bd3.jpg" />, with statistical methods is not attempted in this article. Yield data obtained with the Claas Agrocom Quantimeter suggest that <img src="4-8500013\b80e8ae5-4a2a-499c-acd6-f62745234483.jpg" /> could be a good choice. However, if there are few values in the nearer environment of <img src="4-8500013\6b8f874d-54d0-4698-83ef-a2d195608c40.jpg" />, so that <img src="4-8500013\8f484b60-5a1a-4513-af99-375fcda2c6fd.jpg" /> in (26) is small, the neighborhood should not be shortened along the tracks, and a circular neighborhood, i.e. a local radius ratio of <img src="4-8500013\8dcc9c6f-f767-4f5b-b428-1e960c8554b2.jpg" />, should be used. Yet to determine this <img src="4-8500013\ebfc5348-1c13-4b37-8038-9b669b380499.jpg" />, an initial neighborhood is already necessary. It is based on the ‘full’ radius ratio, <img src="4-8500013\b51803aa-59d1-44f0-90cc-760fa0d6bd61.jpg" />, and on an initial radius <img src="4-8500013\5ef244c2-3790-44ac-adc4-7099124b6ef4.jpg" />, a proper value of which can be found using the method in Section 3. 
The initial radius along the tracks is then given by <img src="4-8500013\27c7a1fd-a2d4-4583-bb37-5e30a1a8cd70.jpg" />. Based on this neighborhood, an initial measure, <img src="4-8500013\b54a7504-03dd-460d-85a2-601d766ea3c1.jpg" /> in (26), is computed to determine the local radius ratio <img src="4-8500013\bc53b8ef-8736-4b44-a712-895134a98d03.jpg" />. For <img src="4-8500013\7460fc4a-71e5-47f9-9858-f8bd126198fe.jpg" /> it is assigned its maximum value, <img src="4-8500013\ff45b127-9bce-465a-b973-10898773ca42.jpg" />, because 0.6 is already close to the ratio, <img src="4-8500013\210e21ad-416b-4417-9df3-66eacbef0c03.jpg" />, of expected values in (27). A circular neighborhood, i.e. <img src="4-8500013\68add088-b83e-4303-8f6a-08769b004997.jpg" />, is only used if <img src="4-8500013\0e4389db-0e86-4bf4-bb3c-5b233e08cab0.jpg" />, and for <img src="4-8500013\f89be035-3bc1-4875-ab6f-38976a501e2e.jpg" /> the local radius ratio, <img src="4-8500013\51dbdd4e-ce34-41b6-a8be-adf867df204e.jpg" />, is defined as a linear interpolation between 1 and <img src="4-8500013\255c8e11-4345-4956-a882-ce88dc99a331.jpg" />. This is summarized in the following definition:</p><disp-formula id="scirp.3161-formula102505"><label>(28)</label><graphic position="anchor" xlink:href="4-8500013\99f19daa-337a-4d80-914c-9069c430d001.jpg"  xlink:type="simple"/></disp-formula><p>Further computations of <img src="4-8500013\2ca9c339-067f-456e-a4e4-c480580866d2.jpg" /> are based on a neighborhood that already respects this local radius ratio, <img src="4-8500013\e4bbb205-5db6-441c-8d3b-4220439d8d9f.jpg" />. When deciding on the model, the radius <img src="4-8500013\57a3c5aa-b9b2-466a-a86d-71b8d427a8e3.jpg" /> of the ellipse remains unchanged, but the radius along the tracks is adapted to it: <img src="4-8500013\1852b30f-d6dd-4b13-920f-9a26332390d7.jpg" />. This can enlarge the neighborhood, but it rarely happens except where corners of a field are harvested. 
The elliptical neighborhood with these radii, <img src="4-8500013\5ee6ddf7-828c-4cde-b9bc-f33d9f96c277.jpg" /> and <img src="4-8500013\af008fdd-1364-4626-b479-57b6f0ff9784.jpg" />, also serves as the initial neighborhood in (30), where the final neighborhood size is determined. Its shape, however, i.e. its radius ratio, <img src="4-8500013\20306188-64e9-4cb4-847f-c51a02d1362f.jpg" />, remains unchanged until the next point <img src="4-8500013\bdd63f30-c769-4862-851d-13f6c203e023.jpg" /> is mapped.</p></sec><sec id="s2_5_4"><title>2.5.4. The Decision on the Model</title><p>Paraboloid cones are preferred; therefore, the limit above which they are modeled should be less than the ratio in (27). The transition from planes to paraboloid cones should be smooth to obtain a continuous yield map. In Bachmaier [<xref ref-type="bibr" rid="scirp.3161-ref1">1</xref>], where both the local weight function and the degree of membership in the narrower environment decrease linearly, the ratio of the expected values of the numerator and the denominator of <img src="4-8500013\6513175d-ff8e-4cab-8c14-903699f62067.jpg" /> was 0.5, and a transition interval of (0.35, 0.45) proved adequate. Considering that this ratio is <img src="4-8500013\0062d3ac-8fe4-47c1-9028-8acd96f9eb8a.jpg" /> when weight and membership decrease quadratically, a transition interval of (0.5, 0.6) appears appropriate. 
This leads to the following model decision in (29):</p><disp-formula id="scirp.3161-formula102506"><label>(29)</label><graphic position="anchor" xlink:href="4-8500013\bab23a6b-da7d-4259-8a31-9692bbb8b1f3.jpg"  xlink:type="simple"/></disp-formula><p>The measure<img src="4-8500013\714b6ebf-f695-4572-9200-63624b371ae2.jpg" />, which it refers to, is based on the elliptical neighborhood with radii<img src="4-8500013\a53f90fc-f8ef-40c2-9f1f-b1608c0e3731.jpg" />and<img src="4-8500013\8e1cc9af-649e-418e-b333-476499d72ea3.jpg" />.</p><p>Modeling only a horizontal plane (p = 1) for very small <img src="4-8500013\ae7691f7-bbad-4905-b7ee-2eaed11cd605.jpg" /> has proved inadequate because areas with only a few measurements occur mainly at corners of a field, where the yield usually tends to become lower. Instead, the neighborhood size is increased until <img src="4-8500013\d1894388-7424-4bb9-a473-2be4b4669b78.jpg" /> reaches a minimum value of 0.30, so that there are enough data to fit an appropriate skewed plane.</p></sec><sec id="s2_5_5"><title>2.5.5. Determining the Neighborhood’s Size</title><p>The neighborhood should also be enlarged if the effective number of data points,<img src="4-8500013\55f8b3b3-8a42-47f2-b846-e19bf8fa26a7.jpg" /> , is less than a minimum effective number,<img src="4-8500013\fd57d546-78dc-4f79-a9d1-d11769868e65.jpg" />. This often applies where the combine enters the harvest track because the corresponding measurements have a global weight of zero. 
The rule for increasing the neighborhood's radii is as follows:</p><disp-formula id="scirp.3161-formula102507"><label>(30)</label><graphic position="anchor" xlink:href="4-8500013\c9bf821f-112e-46ec-a12a-6632ff81550a.jpg"  xlink:type="simple"/></disp-formula><p>The proportion, <img src="4-8500013\853f26e1-68aa-49e9-8495-5864dc854d85.jpg" />, of <img src="4-8500013\6c404f04-073e-46a0-bc08-00081988af21.jpg" /> to <img src="4-8500013\2ca37e85-e67a-45c5-b519-578bb4d81e6c.jpg" />, as well as the choice of model, is no longer affected by this procedure, as these decisions have already been made on the basis of <img src="4-8500013\3d498493-9048-45ff-8fa1-df84982f394c.jpg" /> and <img src="4-8500013\fa9f5a38-6d09-45bd-8d91-d9fbefa466e5.jpg" />.</p></sec></sec></sec><sec id="s3"><title>3. The Method of Finding an Adequate Neighborhood Size by Variance Comparison</title><p>If the variance of the true unknown yields of the field were known, the yield map could be considered optimal if its variance equaled that of the true yields. If the map's variance is less than the variance of the true yields, the map is too smooth; if it is greater, the map is too detailed. However, the variance of the true yields is unknown and needs to be estimated in a way that does not depend on how the yield map has been generated.</p><p>The variance of the measured yields comprises the variance of the true yields and the error variance of the yield monitor. Hence, the required variance of the true unknown yields is given by</p><disp-formula id="scirp.3161-formula102508"><label>(31)</label><graphic position="anchor" xlink:href="4-8500013\cdf411f3-d3de-438e-b2ea-06fef1817ce9.jpg"  xlink:type="simple"/></disp-formula><p>The error variance, <img src="4-8500013\8cb683e5-b35b-4503-8450-e772e913837b.jpg" />, can be estimated by the nugget component of a variogram. 
The reason is that there is no microscale variation in the yields because, as defined in the introduction, any yield at a field point refers to the average yield on a rectangle to be harvested around it, so the entire nugget effect arising in an empirical variogram reduces to the error variance, <img src="4-8500013\33832ba5-9a58-4b1b-b444-7be1dd569985.jpg" />, of the yield monitor (Cressie [<xref ref-type="bibr" rid="scirp.3161-ref16">16</xref>], p. 59). However, outliers often distort the structure of a variogram, which makes the determination of this value by extrapolation impossible unless outliers are removed or the variances are estimated robustly. Therefore, I will later switch to robust versions of variances and variograms. Nevertheless, for a better understanding it is desirable that the reader be familiar with the usual variogram, so I will introduce it first.</p><sec id="s3_1"><title>3.1. The Variogram</title><p>The empirical variance of measured values <img src="4-8500013\fc3bc087-7526-44c9-9f58-a43625da523a.jpg" /> can be given by the following two equivalent formulae; the latter is less well known.</p><disp-formula id="scirp.3161-formula102509"><label>(32)</label><graphic position="anchor" xlink:href="4-8500013\4a112a9d-08a1-4bfa-bec1-5b76bc5a4529.jpg"  xlink:type="simple"/></disp-formula><p>Here, <img src="4-8500013\b08bccb9-78f1-4078-b269-37250f87a69a.jpg" /> is the number of summands in the latter formula, which expresses the variance as half the average of the squared differences between all pairs of values. Squared differences of yields usually increase if the spatial distance between the data points increases. This is expressed by a variogram, which gives the variance as a function of the separation distance, <img src="4-8500013\f3100082-d9ed-417a-9f2d-85e8c9d3d27e.jpg" />, which is also called the lag. 
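</p><p>The equivalence of the two variance formulae can be checked numerically. The short Python sketch below assumes that the first formula is the usual sample variance with divisor n − 1 and that the second sums over all unordered pairs of values, so that the number of summands is n(n − 1)/2; the function name is illustrative and not part of the original method.</p><preformat><![CDATA[
```python
from itertools import combinations
from statistics import variance


def pairwise_variance(values):
    """Variance as half the average squared difference over all pairs.

    Assumes unordered pairs (i, j) with i != j, i.e. n * (n - 1) / 2
    summands, which reproduces the sample variance with divisor n - 1.
    """
    pairs = list(combinations(values, 2))
    if not pairs:
        raise ValueError("at least two values are required")
    squared_differences = sum((a - b) ** 2 for a, b in pairs)
    return squared_differences / (2 * len(pairs))


# Both formulae should agree up to floating-point rounding.
yields = [2.0, 4.0, 4.0, 5.0, 7.0]
difference = abs(pairwise_variance(yields) - variance(yields))
```
]]></preformat><p>Restricting the pairs in such a sum to those separated by approximately a given lag is what turns the total variance into a variogram value.</p><p>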
Empirical variances, <img src="4-8500013\a750f26f-26c4-47e2-9b9b-d9c3e654baf8.jpg" />, depicted in a variogram are restricted to pairs of values that are separated by a given lag <img src="4-8500013\bad5e7f5-6cc8-470d-97e1-504209d4d564.jpg" />, so they can be called variances of the yield data points at a given separation distance <img src="4-8500013\77ebd879-4d43-49b6-a33d-86af7055340d.jpg" /> (Bachmaier and Backes [<xref ref-type="bibr" rid="scirp.3161-ref17">17</xref>]):</p><p><img src="4-8500013\6d681345-d52e-414a-ac1a-1eb529f1e6a3.jpg" /> (33)</p><p>This notation is as common as it is misleading. One must be aware that the index <img src="4-8500013\d8a685ca-1321-43b5-9970-496ea5e0974c.jpg" /> in this sum is not a counter of certain logged locations (mean-adjusted GPS points), but a counter of pairs of such locations. A single location can occur in many pairs, so that, for example, the third logged location point, which was denoted by <img src="4-8500013\841ff963-ceb7-4273-b2f2-7f78dc0d18bf.jpg" /> in Section 2, could now be indexed by <img src="4-8500013\9fc30a41-0917-4b38-a62e-ad8a13cd541a.jpg" />. The index <img src="4-8500013\4f2fb214-7812-4a3e-8703-e8ba2b0096cd.jpg" /> counts all pairs of those locations that are separated by a vector <img src="4-8500013\02a84e0f-25a1-46ed-bf38-aec67041025e.jpg" /> whose length, <img src="4-8500013\8e674137-f512-4f13-b323-9b3cf2456296.jpg" />, is approximately equal to the target length <img src="4-8500013\98b29866-8007-46d1-b051-ffb85d32362c.jpg" />, i.e. <img src="4-8500013\df7fe077-6f3c-42f4-9bab-59ada4760d8b.jpg" />, where <img src="4-8500013\dda8c86c-a994-41ae-9c5a-19277f8f49ac.jpg" /> is the class width of distances used for computing a single variogram value (<img src="4-8500013\29191581-bb22-463e-b81d-06f31a52d910.jpg" /> m in <xref ref-type="fig" rid="fig12">Figure 12</xref>). 
Here, <img src="4-8500013\df5ff109-6b30-4f86-b248-3cf106827176.jpg" /> is the number of all these pairs.</p><p>The total variance of the measured yields in (32) is split into variances at given separation distances in (33), which in turn can be used to obtain the total variance as their weighted mean:</p><disp-formula id="scirp.3161-formula102510"><label>(34)</label><graphic position="anchor" xlink:href="4-8500013\f4ef3e53-1523-40cf-9686-b4a6c3fbe31c.jpg"  xlink:type="simple"/></disp-formula><p>where <img src="4-8500013\6cadd236-7074-47f5-b8a7-5320bc77d316.jpg" /> correspond to the target distances <img src="4-8500013\3663eb6e-c6b6-4514-92fd-e92d6cf9e76f.jpg" /> in (33). They are the midpoints of the different classes, whereas <img src="4-8500013\55ff5bfb-05c7-411d-9a7d-fd55c3b1bf58.jpg" /> are the averages of all distances within class <img src="4-8500013\d6a7076a-904d-4ece-9e9d-10ba42756ce7.jpg" />, i.e. the averages of all <img src="4-8500013\cfa6085d-389c-4213-aeae-48606f95e975.jpg" /> with <img src="4-8500013\3d0c29cd-115d-47c5-8ab1-e6603f07b004.jpg" />. An analogue of the formula in (34) is needed to compute the total variance robustly.</p></sec><sec id="s3_2"><title>3.2. A Robust Variogram for Estimating the Variance of the True Yields</title><p>The variogram estimator <img src="4-8500013\a87b9b3e-fd9a-4014-b2fa-b5884d82ceb5.jpg" /> in (33) is based on squared differences among data, so it is sensitive to outlying yield data points. A single outlying datum can distort the variogram estimates since it occurs in several paired comparisons over many or all lag intervals (Lark [<xref ref-type="bibr" rid="scirp.3161-ref18">18</xref>]). Moreover, the outlier does not affect the variogram values of all lag intervals equally; it can distort the shape of the variogram, which affects the determination of the nugget component by extrapolation. 
Such distortion can be diminished by robust estimates of the variogram.</p><p>Robust variogram estimates have been introduced by several authors (Cressie and Hawkins [<xref ref-type="bibr" rid="scirp.3161-ref19">19</xref>], Dowd [<xref ref-type="bibr" rid="scirp.3161-ref20">20</xref>], Genton [<xref ref-type="bibr" rid="scirp.3161-ref21">21</xref>]). Lark [<xref ref-type="bibr" rid="scirp.3161-ref18">18</xref>] compared them with regard to robustness and efficiency. Since the formula of variance decomposition in (31) holds exactly only when referring to classical variances, I use a scale M-estimator whose <img src="4-8500013\bb8c3240-8ebd-4548-aa77-bd55f5e5910a.jpg" />-function equals the classical one,<img src="4-8500013\898e4e32-b7be-4e5b-b214-317b3e48bf8d.jpg" /> , if <img src="4-8500013\fb6eb5c6-ab2d-4ff0-8fbc-e4dc17b44bcf.jpg" /> where about 95% of standard normally distributed data are located. This guarantees a high efficiency for distributions close to normal (Bachmaier [<xref ref-type="bibr" rid="scirp.3161-ref22">22</xref>]). Outside the interval <img src="4-8500013\1984fade-52c6-4207-b029-352185ee659a.jpg" /> the chosen <img src="4-8500013\ae817a3c-c224-4e8b-8bd9-2810ba879f54.jpg" />-function begins to deviate, and from <img src="4-8500013\43de4c88-20c8-4bd8-8d20-7872100db7a9.jpg" /> it redescends until it reaches the value zero at <img src="4-8500013\848b08b2-c43a-4750-85ac-472b2c92c9b7.jpg" /> to exclude the effect of large outliers completely. 
It is illustrated in <xref ref-type="fig" rid="fig9">Figure 9</xref> and defined by parabolae and straight lines:</p><p><img src="4-8500013\a0927fd8-4107-45d3-9d9d-2bb0c2dda465.jpg" />(35)</p><p>The robust variances,<img src="4-8500013\22669a0b-998a-468c-81be-b5c64a3be076.jpg" /> , at a target separation distance,<img src="4-8500013\8361c1ac-908b-4612-a803-c74cabee0c4a.jpg" /> , are then defined by the solution of</p><disp-formula id="scirp.3161-formula102511"><label>(36)</label><graphic position="anchor" xlink:href="4-8500013\3fefd95d-4767-4e37-9e45-049f74eaff90.jpg"  xlink:type="simple"/></disp-formula><p>with<img src="4-8500013\e726031b-a78f-4b8a-befa-9351852873dc.jpg" />as in <xref ref-type="fig" rid="fig9">Figure 9</xref>, where the index <img src="4-8500013\dd8d282d-4fcc-4030-863e-0741eb995616.jpg" /> again counts pairs. Note that the solution of (36) would correspond exactly to the classical variogram estimate <img src="4-8500013\55a54c09-04f3-4906-9186-8fbce805d979.jpg" /> in Eq. (33) if one replaced the <img src="4-8500013\63b390fd-ea2a-4e3d-a2b9-3e611a2973d4.jpg" />-function in <xref ref-type="fig" rid="fig9">Figure 9</xref> by the classical <img src="4-8500013\00eb6620-13f4-48ff-8648-3cf35bfb0f4f.jpg" />-function,<img src="4-8500013\e38ec6f7-6ec3-474f-84e8-2b62b7209d49.jpg" />.</p><p>To compute the robust variogram for the yield monitor data according to (36), it was necessary to omit summands related to pairs <img src="4-8500013\bdcdfc43-a627-4e4a-8ced-530d2a98b74c.jpg" /> that are close to each other on the same harvest track. This ensures near independence of the remaining pairs<img src="4-8500013\abc282b9-b1a7-484a-ad36-d7b61fb598d8.jpg" />. Pairs where either <img src="4-8500013\6d14baff-2453-4216-bb0b-39014848030e.jpg" /> or <img src="4-8500013\ae32655e-a114-4b01-a4f9-ddffc2357647.jpg" /> has a global weight of zero were canceled also. 
Such a robust variogram is shown in <xref ref-type="fig" rid="fig12">Figure 12</xref>.</p></sec><sec id="s3_3"><title>3.3. Estimating the Variance of the True Yields</title><p>The desired classical variance of the true unknown yields is now, analogously to (31), computed as the following difference of robust variances:</p><disp-formula id="scirp.3161-formula102512"><label>(37)</label><graphic position="anchor" xlink:href="4-8500013\4cdfd5e5-31a3-4fe1-a357-d39c4cd846e1.jpg"  xlink:type="simple"/></disp-formula><p>The robust error variance at separation distance zero,</p><disp-formula id="scirp.3161-formula102513"><label>(38)</label><graphic position="anchor" xlink:href="4-8500013\9b6d1041-6e4f-4537-9e24-184203dd43c3.jpg"  xlink:type="simple"/></disp-formula><p>which is the nugget component of the robust variogram, is determined by a weighted cubic extrapolation (Bachmaier [<xref ref-type="bibr" rid="scirp.3161-ref23">23</xref>]). The weights decrease linearly with increasing separation distance; in particular: <img src="4-8500013\d07cc16c-401d-4f71-9888-b83421294467.jpg" />, <img src="4-8500013\6f92b7ca-ec7a-4ec4-a7c0-77baacd6c7ce.jpg" />, <img src="4-8500013\91c8537b-a4ec-48e4-9ca1-43da9631a6d0.jpg" />, and <img src="4-8500013\016bdd8c-e86a-4aeb-8b73-8c6e82252e2e.jpg" /> for <img src="4-8500013\8a049f33-0724-4310-9505-f886764fc443.jpg" />, where <img src="4-8500013\b7bfc254-0987-479d-b942-3a50dcb573ba.jpg" /> represents the class <img src="4-8500013\9cb900bf-f7d4-40d5-81a2-221df1a818d3.jpg" />, <img src="4-8500013\4d011053-578d-4b61-b7c9-d894a35b26a9.jpg" /> the class <img src="4-8500013\cd6cc6da-19ca-4715-ab8a-90ef2909b74b.jpg" />, and so on.</p><p>Analogously to (34), the robust variance of all measured values is estimated as a weighted mean of all values depicted in the robust variogram:</p><disp-formula id="scirp.3161-formula102514"><label>(39)</label><graphic position="anchor" xlink:href="4-8500013\63c73a1c-4b66-480b-82fd-d54304942cfb.jpg"  xlink:type="simple"/></disp-formula><p>Thus, the variance estimation in (37) depends only on robust variogram values. They are barely affected by outliers, so they underestimate the theoretical classical variogram values a little, as these also include the variance of the yield monitor, which might be large because of the outliers. But since the underestimation of the robust variances affects all variances in the variogram similarly, the difference in (37), which gives the desired variance of the true yields, is barely affected. Accordingly, this estimation method worked well, as shown by the Monte Carlo results in Bachmaier [<xref ref-type="bibr" rid="scirp.3161-ref23">23</xref>].</p></sec></sec><sec id="s4"><title>4. Results for the Field Bandstauden in Zeilitzheim</title><p>In 2001 the winter wheat harvest of Bandstauden field in Zeilitzheim (Germany) was investigated. Using the measurements for moisture, the raw yield data measured by the Claas Agrocom Quantimeter with a data logging frequency of 0.2 Hz were converted to dry matter yields. Zero yields were omitted, as were those with a missing value for moisture. Values with technical errors were assigned the global weight zero, which meant they were also discarded.</p><p>Figures 10(a) and (b) show the yield monitor data in Mg ha<sup>-1</sup> and their global weights, <img src="4-8500013\f0a642c5-cd3f-4ca4-b38f-dc2090aad6ca.jpg" />, respectively. Weights less than 1 arise from a small deviation from the preceding harvest path or from values at the beginning of a swath. Many neighboring swaths were harvested in the same direction; therefore, regions with invalid values are larger than usual.</p><sec id="s4_1"><title>4.1. 
Yield Maps Generated Using Different Neighborhood Sizes</title><p><xref ref-type="fig" rid="fig1">Figure 11</xref> shows three yield maps for different sizes of an elliptical neighborhood with a radius ratio of <img src="4-8500013\48864df9-3eac-4305-8ee9-aff1b4380977.jpg" />. It decreases according to <img src="4-8500013\7b93c2d2-f09f-4e0b-8af8-d2f0334a1ae4.jpg" /> in (28) if a point is mapped whose immediate surroundings do not contain enough valid data points.</p><p>The larger the neighborhood size, the smoother the yield map and the smaller its variance. With a fine-textured yield map as in <xref ref-type="fig" rid="fig1">Figure 11</xref>(a), there is a risk that it shows the errors of the yield monitor too clearly, whereas <xref ref-type="fig" rid="fig1">Figure 11</xref>(c) shows too much smoothing. To determine the appropriate degree of smoothness of the yield map, the variance of the true unknown yields must be obtained and compared with the variances given in <xref ref-type="fig" rid="fig1">Figure 11</xref>.</p></sec><sec id="s4_2"><title>4.2. The Adequate Smoothness of the Yield Map</title><p>An adequate neighborhood size is obtained if the variance of the yield map equals that of the true yields. The variance estimation of the true yields is described in (36) to (39). It is based on the robust variogram in <xref ref-type="fig" rid="fig1">Figure 12</xref>, which clearly contains a trend component of the unknown yield map. Note that such a variogram would not be appropriate for determining kriging weights. For this, the variogram should be trend-adjusted. But a variogram that serves to determine the variance of the yield map must also contain the trend, as the trend is an essential part of it. 
The robust variogram in <xref ref-type="fig" rid="fig1">Figure 12</xref> is drawn at the points <img src="4-8500013\e33cb8c9-5e5e-4a8b-bf81-f2a3751df33c.jpg" />, the mean of the values <img src="4-8500013\bfdbb4dc-20ca-43ee-ac2f-30505f7d0789.jpg" /> within class <img src="4-8500013\96bd9c61-4d6f-4c72-aa85-91f40da2f66b.jpg" />.</p><p>According to (37), the variance of the true yields is estimated by</p><disp-formula id="scirp.3161-formula102515"><label>(40)</label><graphic position="anchor" xlink:href="4-8500013\f789492c-3959-4a58-b882-fa560d9b5d19.jpg"  xlink:type="simple"/></disp-formula><p>where the robust error variance, <img src="4-8500013\4a0f7f3b-b817-4daa-8686-be6bf5a20659.jpg" />, equals the nugget component, <img src="4-8500013\e78f788c-5cdb-4be6-87e9-ef2ff350412c.jpg" />, which was obtained as the intercept on the ordinate of the extrapolated robust variogram in <xref ref-type="fig" rid="fig1">Figure 12</xref>. The robust variance of all measured values, <img src="4-8500013\8668c4bc-cc49-4a82-b89e-cb25e646a5b7.jpg" />, was computed as the weighted mean in (39), which refers to all values depicted in the robust variogram.</p><p>Since the yield map is required to be as smooth as the true yields, <xref ref-type="fig" rid="fig1">Figure 11</xref>(b) should be chosen because its variance of 0.242 equals the estimated variance of the true unknown yields in (40). The elliptical neighborhood used to generate this yield map started with a minimum <img src="4-8500013\8ad029cd-b462-499a-9301-c342c6931b9d.jpg" /> of ten times the swath width (and, since <img src="4-8500013\cbf4d380-2ea7-44ca-879a-a7fe49e6e33a.jpg" />, with a minimum <img src="4-8500013\05fd1055-621c-4608-b5eb-948a4fdf7e47.jpg" /> of five times the swath width) and was increased until the minimum effective number of data reached <img src="4-8500013\547e1d52-94ac-4de6-a996-f144ee69bfa7.jpg" />. This number needs to be adapted to the neighborhood size. 
It should approximately correspond to the <img src="4-8500013\033ce921-8f68-4005-85eb-41ac0f1f1a58.jpg" /> that is expected if most measurements of the neighborhood are valid. It depends on the data logging frequency, on the swath width, and, of course, on the neighborhood size.</p></sec></sec><sec id="s5"><title>5. Discussion and Conclusions</title><p>The yield mapping method proposed in this article is based mainly on modeling paraboloid cones on moving elliptical neighborhoods. The method of determining the parameters of the model is robust, so the detection of outliers was not necessary. In addition, the method uses weights that decrease, like a paraboloid cone opening downwards, to zero at the neighborhood’s border. This corresponds to a smooth transition from being fully considered to not being considered at all, so a larger neighborhood is necessary, as the values close to the border do not have full weight. A robust variogram, computed independently of the yield mapping method, served to estimate the variance of the true unknown yield values. This provided a measure of how smooth the yield map would be if all values had been measured correctly. For an elliptical neighborhood with at least <img src="4-8500013\08f03b54-5605-4e91-94b1-b116026c6eaa.jpg" />, <img src="4-8500013\e0fc3beb-5d25-4c7e-96cc-06ab50ed6f8a.jpg" /> (i.e., <img src="4-8500013\7a028194-36cc-4283-b114-96fc05db7816.jpg" />) and a minimum effective number of <img src="4-8500013\4eca606e-d588-44d0-a70f-8b13519d6940.jpg" />, the estimated yield map had the same variance as the true unknown yield map.</p><p>In 2004, an experiment was conducted on a part of the Lamprechtsfeld in Thalhausen (Germany) where measured reference values for yield data from two yield monitors were obtained: Data Vision Flowcontrol and AgLeader. 
The results based on butterfly neighborhoods suggest that, if one wants a yield map whose sum of squared deviations from the true unknown values is minimized, the neighborhood size should be greater than that of <xref ref-type="fig" rid="fig1">Figure 11</xref>(b). However, the yield map optimized under this criterion is much smoother than the map of the true reference values, i.e., its variance is too small. In Bachmaier et al. [<xref ref-type="bibr" rid="scirp.3161-ref24">24</xref>], where these results are published, the shape of the butterfly neighborhood was also optimized.</p><p>The method proposed in the present paper cannot be used to optimize the neighborhood’s shape, but it requires only yield monitor data. It could already be seen from <xref ref-type="fig" rid="fig9">Figure 9</xref>(a) in Bachmaier [<xref ref-type="bibr" rid="scirp.3161-ref1">1</xref>] that a circular neighborhood did not sufficiently smooth out the yield data across the tracks; these or other lines cannot be recognized in the present <xref ref-type="fig" rid="fig1">Figure 11</xref>, which is based on an elliptical neighborhood that is twice as long (<img src="4-8500013\2f013f5a-13b4-4aec-b546-2cdb1e06c289.jpg" />) across the tracks as it is along them. Therefore, a radius ratio of <img src="4-8500013\9ff2a32b-b6ff-47b5-8d48-11578c0588e2.jpg" /> might be a good choice.</p><p>So far, the proposed method has been applied to a single yield monitor only (Claas Agrocom Quantimeter), with a data logging frequency of 0.2 Hz, and to one crop. It could also be applied to harvests from other crops or from combines equipped with other cutting widths and other yield monitors. This might lead to different results; however, the yield mapping rule proposed here would produce yield maps that are not too smooth, not too fine-textured and not subject to large outliers. 
The reason is that it requires a radius of at least <img src="4-8500013\ad34e34b-6078-4216-8268-fd629bc4175a.jpg" /> and it does not depend on the swath width, sw, because the radii are expressed in multiples of it. This also guarantees robustness against exclusively erroneous measurements on one or two harvest tracks within the neighborhood. The method requires a minimum effective number of data, <img src="4-8500013\98d35046-f78f-4f05-9a2f-9989adf2c31a.jpg" />. A yield monitor with a higher frequency, such as the AgLeader monitor, provides more data, so <img src="4-8500013\0a8cc643-a069-49be-91c1-316bb3a7335d.jpg" /> would be reached within a smaller neighborhood. But these data are usually less accurate, so the neighborhood size should not decrease. What should be adapted to the yield monitor is the number, <img src="4-8500013\7b2b0977-d876-4349-a03d-3a3482ad7753.jpg" />, of zero weights at the beginning of harvest tracks. It should increase with the frequency of the system and be adapted to the intensity of its smoothing algorithm. In the Fortran program paraboloidmapping (Bachmaier [<xref ref-type="bibr" rid="scirp.3161-ref25">25</xref>]), where the yield mapping method is implemented, I currently use m = 5 for Claas Agrocom, <img src="4-8500013\2eb2ded3-9394-41bb-87b6-d118e65ec2ee.jpg" /> for Data Vision Flowcontrol, which has a stronger smoothing algorithm, and <img src="4-8500013\9d0d022c-752d-4622-a008-1592a5f227e2.jpg" /> for AgLeader because of its high frequency of 1 Hz.</p></sec><sec id="s6"><title>6. Acknowledgements</title><p>I thank Margaret A. Oliver, University of Reading, for her helpful and detailed remarks on this research and Hermann Auernhammer, Technische Universit&#228;t M&#252;nchen-Weihenstephan, for supporting this work.</p></sec><sec id="s7"><title>REFERENCES</title></sec></body><back><ref-list><title>References</title><ref id="scirp.3161-ref1"><label>1</label><mixed-citation publication-type="other" xlink:type="simple">M. 
Bachmaier, “Using a Robust Variogram to Find an Adequate Butterfly Neighborhood Size for One-Step Yield Mapping Using Robust Fitting Paraboloid Cones,” Precision Agriculture, Vol. 8, No. 1-2, 2007, pp. 75-93.</mixed-citation></ref><ref id="scirp.3161-ref2"><label>2</label><mixed-citation publication-type="other" xlink:type="simple"> 
G. C. Simbahan, A. Dobermann and J. L. Ping, “Site-Specific Management. Screening Yield Monitor Data Improves Grain Yield Maps,” Agronomy Journal, Vol. 96, No. 4, 2004, pp. 1091-1102.</mixed-citation></ref><ref id="scirp.3161-ref3"><label>3</label><mixed-citation publication-type="other" xlink:type="simple"> 
S. A. Shearer, S. G. Higgins, S. G. McNeil, G. A. Watkins, R. I. Barnhisel, J. C. Doyle, J. H. Leach and J. P. Fulton, “Data Filtering and Correction Techniques for Generating Yield Maps from Multiple Combine Harvesting Systems,” ASAE Paper No. 971034, Annual International Meeting, Minneapolis, Minnesota, 10-14 August 1997.</mixed-citation></ref><ref id="scirp.3161-ref4"><label>4</label><mixed-citation publication-type="other" xlink:type="simple"> 
B. S. Blackmore and M. Moore, “Remedial Correction of Yield Map Data,” Precision Agriculture, Vol. 1, No. 1, 1999, pp. 53-66.</mixed-citation></ref><ref id="scirp.3161-ref5"><label>5</label><mixed-citation publication-type="other" xlink:type="simple"> 
S. Arslan and T. S. Colvin, “Grain Yield Mapping: Yield Sensing, Yield Reconstruction, and Errors,” Precision Agriculture, Vol. 3, No. 2, 2002, pp. 135-154.</mixed-citation></ref><ref id="scirp.3161-ref6"><label>6</label><mixed-citation publication-type="other" xlink:type="simple"> 
T. Steinmayr, “Fehleranalyse und Fehlerkorrektur bei der lokalen Ertragsermittlung im Mähdrescher zur Ableitung eines standardisierten Algorithmus für die Ertragskartierung,” Dissertation at the Technische Universität München-Weihenstephan, 2003, pp. 24-27.</mixed-citation></ref><ref id="scirp.3161-ref7"><label>7</label><mixed-citation publication-type="other" xlink:type="simple"> 
L. Thylén, P. A. Algerbo and A. Giebel, “An Expert Filter Removing Erroneous Yield Data,” In: Robert et al., Ed., Precision Agriculture 2000: Proceedings of the 5th International Conference on Precision Agriculture, ASA, CSSA, and SSSA, Madison, WI, CD-ROM, 2001.</mixed-citation></ref><ref id="scirp.3161-ref8"><label>8</label><mixed-citation publication-type="other" xlink:type="simple"> 
L. Thylén, P. Jürschik and D. P. L. Murphy, “Improving the Quality of Yield Data,” In: J. V. Stafford, Ed., Precision Agriculture '97: Proceedings of the 1st European Conference on Precision Agriculture, BIOS Scientific Publishers Ltd, Oxford, UK, 1997, pp. 743-750.</mixed-citation></ref><ref id="scirp.3161-ref9"><label>9</label><mixed-citation publication-type="other" xlink:type="simple"> 
P. H. Noack, T. Muhr and M. Demmel, “An Algorithm for Automatic Detection and Elimination of Defective Yield Data,” In: J. V. Stafford and A. Werner, Ed., Precision Agriculture, Wageningen Academic Publishers, Wageningen, 2003, pp. 445-450.</mixed-citation></ref><ref id="scirp.3161-ref10"><label>10</label><mixed-citation publication-type="other" xlink:type="simple"> 
F. R. Hampel, P. J. Rousseeuw, E. M. Ronchetti and W. A. Stahel, “Robust Statistics,” John Wiley &amp; Sons, Inc., New York, USA, 1986.</mixed-citation></ref><ref id="scirp.3161-ref11"><label>11</label><mixed-citation publication-type="other" xlink:type="simple"> 
M. Bachmaier and H. Auernhammer, “Yield Mapping Based on Robust Fitting Paraboloid Cones in Butterfly and Elliptic Neighborhoods,” In: J. V. Stafford and A. Werner, Ed., Precision Agriculture '05: Proceedings of the 5th European Conference on Precision Agriculture, Wageningen Academic Publishers, Wageningen, the Netherlands, 2005, pp. 741-750.</mixed-citation></ref><ref id="scirp.3161-ref12"><label>12</label><mixed-citation publication-type="other" xlink:type="simple"> 
P. J. Huber, “Robust Statistics,” John Wiley &amp; Sons, Inc., New York, USA, 1981.</mixed-citation></ref><ref id="scirp.3161-ref13"><label>13</label><mixed-citation publication-type="other" xlink:type="simple"> 
F. W. Satterthwaite, “An Approximate Distribution of Estimates of Variance Components,” Biometrics Bulletin, Vol. 2, No. 6, 1946, pp. 110-114.</mixed-citation></ref><ref id="scirp.3161-ref14"><label>14</label><mixed-citation publication-type="other" xlink:type="simple"> 
P. J. Huber, “Robust Statistical Procedures,” Regional Conference Series in Applied Mathematics 27, Society for Industrial and Applied Mathematics, Philadelphia, Pennsylvania, 1977.</mixed-citation></ref><ref id="scirp.3161-ref15"><label>15</label><mixed-citation publication-type="other" xlink:type="simple"> 
M. Bachmaier and H. Auernhammer, “A Method for Correcting Raw Yield Data by Fitting Paraboloid Cones,” In: Proceedings of the AgEng 2004 Conference — Engineering the Future, Session 10: Precision Agriculture, 8 pages on CD-ROM, 2004.</mixed-citation></ref><ref id="scirp.3161-ref16"><label>16</label><mixed-citation publication-type="other" xlink:type="simple"> 
N. A. C. Cressie, “Statistics for Spatial Data,” John Wiley &amp; Sons, Inc., New York, USA, 1991.</mixed-citation></ref><ref id="scirp.3161-ref17"><label>17</label><mixed-citation publication-type="other" xlink:type="simple"> 
M. Bachmaier and M. Backes, “Variogram or Semivariogram? Understanding the Variances in a Variogram,” Precision Agriculture, Vol. 9, No. 3, 2008, pp. 173-175.</mixed-citation></ref><ref id="scirp.3161-ref18"><label>18</label><mixed-citation publication-type="other" xlink:type="simple"> 
R. M. Lark, “A Comparison of Some Robust Estimators of the Variogram for Use in Soil Survey,” European Journal of Soil Science, Vol. 51, No. 1, 2000, pp. 137-157.</mixed-citation></ref><ref id="scirp.3161-ref19"><label>19</label><mixed-citation publication-type="other" xlink:type="simple"> 
N. A. C. Cressie and D. Hawkins, “Robust Estimation of the Variogram,” Mathematical Geology, Vol. 12, No. 2, 1980, pp. 115-125.</mixed-citation></ref><ref id="scirp.3161-ref20"><label>20</label><mixed-citation publication-type="other" xlink:type="simple"> 
P. A. Dowd, “The Variogram and Kriging: Robust and Resistant Estimators,” In: G. Verly et al., Ed., Geostatistics for Natural Resources Characterization, D. Reidel, Dordrecht, Part 1, 1984, pp. 91-106.</mixed-citation></ref><ref id="scirp.3161-ref21"><label>21</label><mixed-citation publication-type="other" xlink:type="simple"> 
M. G. Genton, “Highly Robust Variogram Estimation,” Mathematical Geology, Vol. 30, No. 2, 1998, pp. 213-221.</mixed-citation></ref><ref id="scirp.3161-ref22"><label>22</label><mixed-citation publication-type="other" xlink:type="simple"> 
M. Bachmaier, “Efficiency Comparison of M-Estimates for Scale at t-Distributions,” Statistical Papers, Vol. 41, No. 1, 2000, pp. 53-64.</mixed-citation></ref><ref id="scirp.3161-ref23"><label>23</label><mixed-citation publication-type="other" xlink:type="simple"> 
M. Bachmaier, “Finding Adequate Neighborhoods for Robust Yield Mapping,” 2006. http://www.tec.wzw.tum.de/index.php?id=46</mixed-citation></ref><ref id="scirp.3161-ref24"><label>24</label><mixed-citation publication-type="other" xlink:type="simple"> 
M. Bachmaier, M. Rothmund and H. Auernhammer, “An Attempt to Optimize Yield Maps by Comparing Yield Data from a Plot Combine and from a Combine Harvester,” Agricultural Engineering International, Manuscript IT 07 009, Vol. 10, 12 pages, 2008. http://cigr-ejournal.tamu.edu</mixed-citation></ref><ref id="scirp.3161-ref25"><label>25</label><mixed-citation publication-type="other" xlink:type="simple"> 
M. Bachmaier, “Fortran-Programme zum Herunterladen: Korrektur von Rohertragsdaten und Ertragskartierung,” DOS/Windows programs paraboloidmapping.exe and paraboloidcorrection.exe (in English), 2010. http://www.tec.wzw.tum.de/index.php?id=46</mixed-citation></ref></ref-list></back></article>