<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article  PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="research article"><front><journal-meta><journal-id journal-id-type="publisher-id">AM</journal-id><journal-title-group><journal-title>Applied Mathematics</journal-title></journal-title-group><issn pub-type="epub">2152-7385</issn><publisher><publisher-name>Scientific Research Publishing</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.4236/am.2014.51013</article-id><article-id pub-id-type="publisher-id">AM-41818</article-id><article-categories><subj-group subj-group-type="heading"><subject>Articles</subject></subj-group><subj-group subj-group-type="Discipline-v2"><subject>Physics&amp;Mathematics</subject></subj-group></article-categories><title-group><article-title>
 
 
  A New Maximum Test via the Dependent Samples t-Test and the Wilcoxon Signed-Ranks Test
 
</article-title></title-group><contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Maggio</surname><given-names>Saverpierre</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref><xref ref-type="corresp" rid="cor1"><sup>*</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Sawilowsky</surname><given-names>Shlomo S.</given-names></name><xref ref-type="aff" rid="aff2"><sup>2</sup></xref></contrib></contrib-group><aff id="aff2"><addr-line>Department of Evaluation and Research, Wayne State University, Detroit, USA</addr-line></aff><aff id="aff1"><addr-line>Department of Psychology, University of Windsor, Windsor, Canada; Department of Evaluation and Research, Wayne State University, Detroit, USA</addr-line></aff><author-notes><corresp id="cor1">* E-mail: <email>spmaggio@uwindsor.ca</email></corresp></author-notes><pub-date pub-type="epub"><day>25</day><month>12</month><year>2013</year></pub-date><volume>05</volume><issue>01</issue><fpage>110</fpage><lpage>114</lpage><permissions><copyright-statement>&#169; Copyright  2014 by authors and Scientific Research Publishing Inc. </copyright-statement><copyright-year>2014</copyright-year><license><license-p>This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/</license-p></license></permissions><abstract><p>
 
 
   A maximum test is proposed in lieu of forcing a choice between the dependent samples t-test and the Wilcoxon signed-ranks test. The maximum test, which requires a new table of critical values, maintains nominal <em>α</em> while guaranteeing the maximum power of the two constituent tests. Critical values, obtained via Monte Carlo methods, are uniformly smaller than those of the Bonferroni-Dunn adjustment, giving the maximum test a power advantage over that adjustment when testing for treatment alternatives of shift in location parameter with data sampled from non-normal distributions. 
 
</p></abstract><kwd-group><kwd>Maximum Test; Dependent Samples t-Test; Wilcoxon Signed-Ranks Test; Bonferroni-Dunn Adjustment; Experiment-Wise Type I Error; Inferential Statistics; Monte Carlo Method</kwd></kwd-group></article-meta></front><body><sec id="s1"><title>1. Introduction</title><p>Selecting a statistical test can be complicated, confusing, and in some cases disappointing. In choosing a particular test, consideration must be given to its robustness with respect to Type I errors under departures from population normality and to its comparative statistical power [<xref ref-type="bibr" rid="scirp.41818-ref1">1</xref>]. When parametric conditions are violated, one test or another may be found to be more powerful than its competitor under a given set of circumstances [<xref ref-type="bibr" rid="scirp.41818-ref1">1</xref>]. Therefore, the choice is often left to intuitive selection or guesswork [<xref ref-type="bibr" rid="scirp.41818-ref2">2</xref>].</p><p>In the context of two dependent samples, when testing for a treatment modeled as a shift in location parameter, both the parametric t-test and the nonparametric Wilcoxon signed-ranks (WSR) test are possibilities. Under normality, the t-test is the uniformly most powerful unbiased test. However, that distinction is lost when data are sampled from non-normal distributions [<xref ref-type="bibr" rid="scirp.41818-ref3">3</xref>]. Blair and Higgins [<xref ref-type="bibr" rid="scirp.41818-ref3">3</xref>] found no instance where the t-test held more than a modest power advantage over the Wilcoxon test, whereas the latter held a clear power advantage, as much as 0.895, over the t-test for data obtained from skewed distributions. 
See also references [4-8].</p><p>Although they disagreed with the practice, Sawilowsky and Fahoome [<xref ref-type="bibr" rid="scirp.41818-ref9">9</xref>] noted that transforming data to fit parametric assumptions is commonly recommended when confronted with non-normally distributed data. This practice (a) suffers from not knowing a priori which transformation to use, and (b) conducts the hypothesis test on a metric that is often irrelevant to the research context [<xref ref-type="bibr" rid="scirp.41818-ref9">9</xref>]. Other solutions, such as conducting both the t-test and the Wilcoxon signed-ranks test and accepting the latter if it is significant, or conducting a preliminary test for normality prior to a test of effects, merely serve to increase the Experiment-Wise Type I error rate [<xref ref-type="bibr" rid="scirp.41818-ref9">9</xref>].</p></sec><sec id="s2"><title>2. Maximum Test</title><p>Consider the two dependent samples layout. A resolution may be obtained in the form of a maximum test, based on taking the more significant result of the dependent samples t (DT) and Wilcoxon signed-ranks (WSR) tests. In order to avoid inflation of Type I errors, the larger of the two obtained test statistics must then be compared with the appropriate critical value obtained from the joint sampling distribution of the two tests.</p><p>A maximum test has useful diagnostic properties, because its extreme component statistic is sensitive to any departure from the null hypothesis [<xref ref-type="bibr" rid="scirp.41818-ref2">2</xref>]. An advantage of the maximum test is “the fact that the test is automatically adaptive to the weight in the tail of the population from which the data were sampled” [2, p. 17]. Type I error rates are not inflated, because a maximum test is a single test whose critical value is based on the joint sampling distribution of the component tests. 
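</p><p>The position of the maximum-test critical value relative to its competitors can be sketched with two inequalities. As a simplifying assumption that is not established in the text, suppose that the DT statistic, written here as t<sub>DT</sub>, and the WSR result re-expressed in the same metric, written here as t<sub>WSR</sub>, each attain their nominal upper-tail level under the null hypothesis. Let c<sub>α</sub> denote the ordinary t critical value, c<sub>α/2</sub> the Bonferroni-Dunn value for two tests, and c* (a symbol introduced only for this sketch) the maximum-test critical value defined by the first equality:</p><disp-formula><tex-math>
\begin{gathered}
\alpha = \Pr\{\max(t_{DT}, t_{WSR}) > c^{*}\} \ge \Pr\{t_{DT} > c^{*}\} \;\Rightarrow\; c^{*} \ge c_{\alpha}, \\
\Pr\{\max(t_{DT}, t_{WSR}) > c_{\alpha/2}\} \le \Pr\{t_{DT} > c_{\alpha/2}\} + \Pr\{t_{WSR} > c_{\alpha/2}\} = \alpha \;\Rightarrow\; c^{*} \le c_{\alpha/2}.
\end{gathered}
</tex-math></disp-formula><p>Under that assumption, c* would lie between the unadjusted t critical value and the Bonferroni-Dunn critical value; when a component test is conservative or discrete, the bounds hold only approximately.</p><p>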
Other important advantages of the maximum test are (1) avoiding the need to choose one test over another, with the attendant risk of power losses, and (2) avoiding the need to conduct post hoc alpha adjustments such as the Bonferroni-Dunn.</p></sec><sec id="s3"><title>3. Statement of the Problem</title><p>The three-fold purpose of this study is to 1) create a maximum test using the parametric dependent samples t-test and the non-parametric Wilcoxon signed-ranks test; 2) obtain critical values via Monte Carlo methods using sample deviates obtained randomly and with replacement from a contaminated (mixed) normal distribution, for nominal alpha levels of 0.05, 0.025, 0.01 and 0.005 and for sample sizes 8 - 30, 45, 60, 90, and 120; and 3) demonstrate that the critical values compare favorably to the Bonferroni-Dunn adjustment.</p></sec><sec id="s4"><title>4. Model</title><p>Algina, Blair and Coombs [10, p. 28] described the maximum test as calculating “for a particular data set, two or more statistics [that] test the same hypothesis and selecting as the test statistic the one with the smallest p value”, and noted that in the event that “each statistic has the same critical value the maximum statistic is simply the most extreme of the calculated statistics”.</p><p>Cox [2, p. 
50] described a significance test as a procedure for measuring the consistency of data with a null hypothesis H<sub>o</sub> having the form where “an observed vector, y, of response variables, or sometimes written as y<sub>obs</sub>, and null hypothesis H<sub>o</sub> according to which y is the observed value of a random variable Y, with sampling space S<sub>y</sub>, and having a probability density <inline-formula><inline-graphic xlink:href="tmlimages\13-7401912x\159f4419-690c-44b9-803a-9457d8b2143d.png" xlink:type="simple"/></inline-formula> in some family H<sub>o</sub>.” The function <inline-formula><inline-graphic xlink:href="tmlimages\13-7401912x\7ca50ac7-859e-43bb-a140-ab1b01c8a52b.png" xlink:type="simple"/></inline-formula> of the observations is the test statistic, and the corresponding random variable is represented by T. Thus the observed level of significance and the allowance for selection is,</p><disp-formula id="scirp.41818-formula33001"><label>(1)</label><graphic position="anchor" xlink:href="htmlimages\13-7401912x\85b4d126-4e50-4b02-bba8-a4bb5f656347.png"  xlink:type="simple"/></disp-formula><p>where p<sub>obs</sub> is the observed value of a random variable. Cox [<xref ref-type="bibr" rid="scirp.41818-ref2">2</xref>] provided the following form of a maximum test, which he credited to Tippett [<xref ref-type="bibr" rid="scirp.41818-ref11">11</xref>],</p><disp-formula id="scirp.41818-formula33002"><label>(2)</label><graphic position="anchor" xlink:href="htmlimages\13-7401912x\d6d00b3b-7b3c-43ed-b525-6972be81549a.png"  xlink:type="simple"/></disp-formula><p>where p<sub>j</sub> is the significance level in the j<sup>th</sup> test and small values of q are evidence against H<sub>o</sub>. The required level of significance and the allowance for selection was noted as</p><disp-formula id="scirp.41818-formula33003"><label>(3)</label><graphic position="anchor" xlink:href="htmlimages\13-7401912x\4266086e-1644-4d0a-a1cd-d71400b6c677.png"  xlink:type="simple"/></disp-formula>
<p>If the component tests are independent and continuous, (3) becomes <inline-formula><inline-graphic xlink:href="tmlimages\13-7401912x\7a9fb0e7-87fa-4ad1-9f51-40cc3edae32b.png" xlink:type="simple"/></inline-formula>, and an upper bound for (3) is in any case kq<sub>obs</sub>, where k is the number of component tests.</p><p>Hence the maximum test via the dependent samples t-test and the Wilcoxon signed-ranks test was defined as</p><disp-formula id="scirp.41818-formula33004"><label>(4)</label><graphic position="anchor" xlink:href="htmlimages\13-7401912x\79dbce78-886f-4909-be46-7ea8cbdc3783.png"  xlink:type="simple"/></disp-formula><p>The WSR result was expressed in the same metric as the DT. The variable t<sub>WSR</sub> is the t value, associated with <inline-formula><inline-graphic xlink:href="tmlimages\13-7401912x\3b1db9a5-2e90-4e26-92b5-17fc15ea7971.png" xlink:type="simple"/></inline-formula> degrees of freedom, whose probability equals that of the Z obtained from the WSR test. In the event of a tie, <inline-formula><inline-graphic xlink:href="tmlimages\13-7401912x\b17940bf-6441-491d-b0f4-b2b52c54169a.png" xlink:type="simple"/></inline-formula>, either the DT or the WSR was used.</p><sec id="s4_1"><title>Assumptions</title><p>The maximum test does not introduce any new assumptions. Although the critical values are derived from a specific mixed normal distribution, they are useful for general mixed normal distributions and other non-normal shapes.</p></sec></sec><sec id="s5"><title>5. Method</title><p>Monte Carlo simulation methods were used to obtain critical values. 
A FORTRAN program employing various subroutines of the International Mathematical and Statistical Libraries [<xref ref-type="bibr" rid="scirp.41818-ref12">12</xref>] was developed in order to create the test and to obtain critical values.</p><p>Deviates were randomly sampled from a contaminated (mixed normal) distribution based on two populations which differ in their respective means and/or variances. It was formed by sampling with a probability of 0.95 from a normal distribution with a mean of 0 and a standard deviation of 1, and with a probability of 0.05 from a normal distribution with a mean of 22 and a standard deviation of 10. The contaminated (mixed normal) distribution was chosen for this study because it is familiar to many readers, it is commonly used in robustness studies, and such mixtures are important population models across a variety of disciplines [<xref ref-type="bibr" rid="scirp.41818-ref13">13</xref>]. This maximum test could generally be used for any model of mixed normal populations, and to a lesser extent any non-normal model.</p><p>Critical values were obtained as follows. Random deviates were assigned to two groups. Both the dependent samples t-test and the Wilcoxon signed-ranks test were computed. Subsequently, the probability of the obtained Z from the Wilcoxon test was converted to the obtained t that would be associated with the corresponding degrees of freedom, using the IMSL tin(p, df) subroutine [<xref ref-type="bibr" rid="scirp.41818-ref12">12</xref>]. Then, the two obtained t values were compared, and the larger was recorded. This process was repeated 200,000 times and the results were stored in an array, which was subsequently sorted from low to high. Then, the value corresponding to the percentile associated with the desired alpha level was selected. For example, the value at the 95<sup>th</sup> percentile represents the critical value for the nominal alpha = 0.05 level.</p></sec><sec id="s6"><title>6. 
Monte Carlo Simulation Results</title><sec id="s6_1"><title>6.1. Critical Values</title><p>Critical values were selected from the ordered array to represent values at the 0.05, 0.025, 0.01 and 0.005 significance levels. This process was repeated for sample sizes n = 8 through 30, 45, 60, 90 and 120. The critical values are presented in <xref ref-type="table" rid="table1">Table 1</xref>.</p><p>As expected, the tabled critical values moved inversely with sample size, meaning that as the sample size increased the tabled values decreased. In some instances, critical values reversed direction, and then returned to the descending pattern as n increased. In a few instances, the critical values repeated at different sample sizes and alpha levels. Both anomalies are attributed to the computational nature of the maximum test, and were previously noted by other workers on the maximum test [9,14]. Hence, although these two phenomena appear counterintuitive, they are expected and are not disconcerting.</p></sec><sec id="s6_2"><title>6.2. Inspection of Maximum Test Critical Values and Comparison with Bonferroni-Dunn Critical Values</title><p>As noted in Tables 2 and 3, the critical values for the maximum test were systematically larger than critical values for the t-test. This behavior controls the inflation of Type I errors.</p><p>The fallback plan in avoiding Type I error inflation is to use a Bonferroni-Dunn adjustment when conducting multiple statistical tests. Bonferroni-Dunn attempts to control the probability of rejecting at least one true hypothesis at some specified level α by testing each of the hypotheses of interest at a reduced level of significance, namely α divided by the number of tests. It is used “when conducting multiple tests of significance to set an upper bound on the overall significance level α” [<xref ref-type="bibr" rid="scirp.41818-ref15">15</xref>]. Simes [15, p. 
751] explained that “If <inline-formula><inline-graphic xlink:href="tmlimages\13-7401912x\45f649e0-3ddd-4b2e-a7bd-04460fa2e677.png" xlink:type="simple"/></inline-formula> is a set of n statistics with corresponding p-values <inline-formula><inline-graphic xlink:href="tmlimages\13-7401912x\d5dc474b-d050-4383-a742-17edec03a215.png" xlink:type="simple"/></inline-formula>, for testing hypotheses <inline-formula><inline-graphic xlink:href="tmlimages\13-7401912x\f016179e-3ad8-4297-bd8a-b6e2ac83b52f.png" xlink:type="simple"/></inline-formula>, the classical Bonferroni multiple test procedure is usually performed by rejecting <inline-formula><inline-graphic xlink:href="tmlimages\13-7401912x\69dc7584-5298-4049-bb4c-7b0807c236b6.png" xlink:type="simple"/></inline-formula> if any p-value is less than α/n. Furthermore the specific hypothesis H<sub>1</sub> is rejected for each <inline-formula><inline-graphic xlink:href="tmlimages\13-7401912x\afc16ecd-b9e7-49db-aa9f-b2e6ebff1b0d.png" xlink:type="simple"/></inline-formula>.” Usually, the adjustment is too conservative and lacks the requisite power to reject an individual hypothesis as the number of tests increases, thereby having the effect of missing real differences [<xref ref-type="bibr" rid="scirp.41818-ref16">16</xref>].</p><p>Critical values for the maximum test are compared with values obtained via the Bonferroni-Dunn adjustment, <inline-formula><inline-graphic xlink:href="tmlimages\13-7401912x\7ea72cf2-b5e4-4ed1-ab5b-61d737750df8.png" xlink:type="simple"/></inline-formula>, where NT refers to the number of tests being conducted. Note that in Tables 2 and 3, the maximum test’s critical values are systematically lower than those of the Bonferroni-Dunn adjustment, and hence the maximum test will be the more powerful of the two [<xref ref-type="bibr" rid="scirp.41818-ref17">17</xref>]. See [17, p. 
259] on comparing statistical tests at specific α levels when one test cannot be conducted at that precise level due to the test’s discrete sampling distribution.</p></sec><sec id="s6_3"><title>6.3. Example of Use of the Maximum Test</title><p>Suppose a test of difference in average performance of a treatment versus a control group was conducted with n<sub>1</sub> = n<sub>2</sub> = 20, for a two-sided test with α = 0.05. The first step is to conduct both the dependent samples t-test and the Wilcoxon signed-ranks test. The second step is to select whichever obtained statistic is higher in magnitude (i.e., select the statistic whose absolute value is greater). The third step is to enter <xref ref-type="table" rid="table1">Table 1</xref> with n = 20 and α = 0.025 (that is, α/2 in each tail), and retrieve the critical value of &#177;2.189228. If the obtained maximum statistic is either greater than 2.189228 or less than −2.189228, reject the null hypothesis in favor of the alternative.</p></sec></sec><sec id="s7"><title>7. Discussion and Conclusions</title><p>The maximum test’s critical values were systematically lower than those of the Bonferroni-Dunn adjustment, making it the more powerful procedure, and were systematically larger than the critical values for the t-test, which controls the inflation of Experiment-Wise Type I errors when conducting both the dependent samples t-test and the Wilcoxon signed-ranks test. The maximum test eliminates the need to make a forced choice between the dependent samples t-test and the WSR test when the distribution from which samples are drawn remains unknown or is known to be non-normal. The test permits the safe application of both the classical and non-parametric tests, with the maximum of the two referred to the new table of critical values, which are designed to maintain the Type I error rate at nominal α while guaranteeing the maximum power of the two tests. 
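</p><p>The procedure of Sections 5 and 6.3 can be illustrated with a minimal Python sketch. It is not the authors’ FORTRAN/IMSL program: NumPy and SciPy are substituted, all function names are hypothetical, the degrees of freedom are assumed to be n − 1, the large-sample Z for the WSR test is computed without continuity or tie corrections, and <monospace>scipy.stats.t.isf</monospace> stands in for the IMSL inverse-t routine.</p><preformat preformat-type="code">
```python
import numpy as np
from scipy import stats


def sample_mixed_normal(rng, size, p_contam=0.05, mu=22.0, sigma=10.0):
    """Draw from the 0.95 N(0, 1) + 0.05 N(22, 10) mixture of Section 5."""
    z = rng.standard_normal(size)
    contaminated = rng.random(size) > 1.0 - p_contam
    return np.where(contaminated, mu + sigma * z, z)


def component_t_values(x, y):
    """Row-wise paired t and the Wilcoxon signed-ranks Z re-expressed as a t."""
    d = x - y
    n = d.shape[1]
    df = n - 1  # assumption: the source shows the degrees of freedom only as an image
    t_dt = d.mean(axis=1) / (d.std(axis=1, ddof=1) / np.sqrt(n))
    ranks = stats.rankdata(np.abs(d), axis=1)
    w_plus = np.where(d > 0, ranks, 0.0).sum(axis=1)
    # Large-sample Z for the positive rank sum (no continuity or tie correction).
    z = (w_plus - n * (n + 1) / 4.0) / np.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    # Upper-tail probability of Z mapped to the t value with the same tail area.
    t_wsr = stats.t.isf(stats.norm.sf(z), df)
    return t_dt, t_wsr


def critical_value(n, alpha, reps=200_000, seed=0):
    """Monte Carlo upper-tail critical value of max(t_DT, t_WSR) under the null."""
    rng = np.random.default_rng(seed)
    x = sample_mixed_normal(rng, (reps, n))
    y = sample_mixed_normal(rng, (reps, n))
    t_dt, t_wsr = component_t_values(x, y)
    return float(np.quantile(np.maximum(t_dt, t_wsr), 1.0 - alpha))


def maximum_test(x, y, crit):
    """Two-sided use as in Section 6.3: keep the statistic larger in magnitude."""
    t_dt, t_wsr = component_t_values(np.atleast_2d(x), np.atleast_2d(y))
    stat = float(t_dt[0] if abs(t_dt[0]) >= abs(t_wsr[0]) else t_wsr[0])
    return stat, abs(stat) > crit
```
</preformat><p>In this sketch, <monospace>critical_value(20, 0.025)</monospace> plays the role of the Table 1 entry for n = 20 and α = 0.025, and <monospace>maximum_test(x, y, crit)</monospace> returns the selected statistic together with the two-sided decision. Because the random number generator, seed, and computational details are assumptions, the simulated value should not be expected to reproduce the tabled 2.189228 exactly.</p><p>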
The maximum test also renders the Bonferroni-Dunn adjustment method unnecessary.</p></sec><sec id="s8"><title>REFERENCES</title></sec></body><back><ref-list><title>References</title><ref id="scirp.41818-ref1"><label>1</label><mixed-citation publication-type="other" xlink:type="simple">R. C. Blair, “Combining Two Nonparametric Tests of Location,” Journal of Modern Applied Statistical Methods, Vol. 1, No. 1, 2002, pp. 13-18.</mixed-citation></ref><ref id="scirp.41818-ref2"><label>2</label><mixed-citation publication-type="other" xlink:type="simple">D. R. Cox, “The Role of Significance Tests,” Scandinavian Journal of Statistics, Vol. 4, No. 2, 1977, pp. 49-70.</mixed-citation></ref><ref id="scirp.41818-ref3"><label>3</label><mixed-citation publication-type="other" xlink:type="simple">R. C. Blair and J. J. Higgins, “Comparison of the Power of the Paired Samples t test to that of Wilcoxon’s Signed-Ranks Test Under Various Population Shapes,” Psychological Bulletin, Vol. 97, No. 1, 1985, pp. 119-128. http://dx.doi.org/10.1037/0033-2909.97.1.119</mixed-citation></ref><ref id="scirp.41818-ref4"><label>4</label><mixed-citation publication-type="other" xlink:type="simple">H. J. Arnold, “Small Sample Power of the One Sample Wilcoxon Test for Non-Normal Shift Alternatives,” The Annals of Mathematical Statistics, Vol. 36, No. 6, 1965, pp. 1767-1778. http://dx.doi.org/10.1214/aoms/1177699805</mixed-citation></ref><ref id="scirp.41818-ref5"><label>5</label><mixed-citation publication-type="other" xlink:type="simple">R. Randles and D. Wolfe, “Introduction to the Theory of Nonparametric Statistics,” John Wiley &amp; Sons, New York, 1979.</mixed-citation></ref><ref id="scirp.41818-ref6"><label>6</label><mixed-citation publication-type="other" xlink:type="simple">R. C. Blair and J. J. Higgins, “The Power of t and Wilcoxon Statistics: A Comparison,” Evaluation Review, Vol. 4, No. 5, 1980, pp. 645-656. 
http://dx.doi.org/10.1177/0193841X8000400506</mixed-citation></ref><ref id="scirp.41818-ref7"><label>7</label><mixed-citation publication-type="other" xlink:type="simple">T. A. Gerke and H. A. Randles, “A Method for Resolving Ties in Asymptotic Relative Efficiency,” Statistics and Probability Letters, Vol. 80, No. 13-14, 2010, pp. 1065-1069. http://dx.doi.org/10.1016/j.spl.2010.02.021</mixed-citation></ref><ref id="scirp.41818-ref8"><label>8</label><mixed-citation publication-type="other" xlink:type="simple">W. T. Wiederman and R. W. Alexandrowicz, “A Modified Normal Scores Test for Paired Data,” European Journal of Research Methods for the Behavioral and Social Sciences, Vol. 7, No. 1, 2011, pp. 25-38. http://dx.doi.org/10.1027/1614-2241/a000020</mixed-citation></ref><ref id="scirp.41818-ref9"><label>9</label><mixed-citation publication-type="other" xlink:type="simple">S. S. Sawilowsky and G. F. Fahoome, “Statistics through Monte Carlo Simulation with FORTRAN,” Journal of Modern Applied Statistical Methods Inc., Michigan, 2003.</mixed-citation></ref><ref id="scirp.41818-ref10"><label>10</label><mixed-citation publication-type="other" xlink:type="simple">J. Algina, R. C. Blair and W. T. Coombs, “A Maximum Test for Scale: Type I Error Rates and Power,” Journal of Educational and Behavioral Statistics, Vol. 20, No. 1, 1995, pp. 27-39.</mixed-citation></ref><ref id="scirp.41818-ref11"><label>11</label><mixed-citation publication-type="other" xlink:type="simple">L. H. C. Tippett, “The Methods of Statistics,” Williams and Norgate, England, 1934.</mixed-citation></ref><ref id="scirp.41818-ref12"><label>12</label><mixed-citation publication-type="other" xlink:type="simple">“International Mathematical and Statistical Libraries,” IMSL Library, Houston, 1980.</mixed-citation></ref><ref id="scirp.41818-ref13"><label>13</label><mixed-citation publication-type="other" xlink:type="simple">S. S. Sawilowsky, R. C. Blair and J. J. 
Higgins, “An Investigation of the Type I Error and Power Properties of the Rank Transform in Factorial ANOVA,” Communications in Statistics, Vol. 14, No. 3, 1989, pp. 255-267.</mixed-citation></ref><ref id="scirp.41818-ref14"><label>14</label><mixed-citation publication-type="other" xlink:type="simple">R. C. Blair and J. J. Higgins (unpublished, 1992) as referred to in S. S. Sawilowsky and G. F. Fahoome, “Statistics through Monte Carlo Simulation with FORTRAN,” Journal of Modern Applied Statistical Methods Inc., Michigan, 2003.</mixed-citation></ref><ref id="scirp.41818-ref15"><label>15</label><mixed-citation publication-type="other" xlink:type="simple">R. J. Simes, “An Improved Bonferroni Procedure for Multiple Tests of Significance,” Biometrika, Vol. 73, No. 3, 1986, pp. 751-754. http://dx.doi.org/10.1093/biomet/73.3.751</mixed-citation></ref><ref id="scirp.41818-ref16"><label>16</label><mixed-citation publication-type="other" xlink:type="simple">Y. Hochberg, “A Sharper Bonferroni Procedure for Multiple Tests of Significance,” Biometrika, Vol. 75, No. 4, 1988, pp. 800-802. http://dx.doi.org/10.1093/biomet/75.4.800</mixed-citation></ref><ref id="scirp.41818-ref17"><label>17</label><mixed-citation publication-type="other" xlink:type="simple">J. D. Gibbons and S. Chakraborti, “Comparisons of the Mann-Whitney, Student’s t, and Alternate t-Tests for Means of Normal Distributions,” Journal of Experimental Education, Vol. 59, 1991, pp. 258-267.</mixed-citation></ref></ref-list></back></article>