CoHort Software |
Nonparametric Tests in CoStat
(Percentiles, Rank Correlation,
Runs Tests, and ANOVAs)
Most statistical procedures in CoStat (including
Statistics : Correlation,
Statistics : Descriptive,
and parts of Statistics : Frequency Analysis
and Statistics : Miscellaneous) assume
that the data is normally distributed. Sometimes there are other assumptions;
for example, standard ANOVAs assume that the
variances of the subgroups are homogeneous. These assumptions
allow the tests to make powerful inferences about the data.
For some datafiles, the assumptions are not valid. Several other tests have been
devised ("nonparametric" tests) which do not make assumptions about the
distribution of the data. Most of these tests rank the data and then
do statistical tests with the ranked values. These tests are generally not
as powerful (that is, not as good at rejecting a false null hypothesis) as
the traditional tests, but they are very useful when you can't use the
traditional tests.
Unfortunately, there aren't nonparametric replacements for all
of the traditional tests. CoStat has these options (on the
Statistics : Nonparametric menu):
- Percentiles -
calculates nonparametric descriptive statistics: mode and percentiles.
- Rank Correlation -
Kendall's and Spearman's tests are analogous to the
Pearson product moment correlation coefficient.
- Runs Tests -
2 Runs Tests: Up and Down, and Above and Below the Median
- Tied Ranks -
This ranks the values in a column, replaces ties with the average rank,
then inserts a new column with the tied rank values.
- 1 Way, Completely Randomized ANOVA -
the Kruskal-Wallis Test.
- 1 Way, 2 Treatment, Completely Randomized ANOVA -
Mann-Whitney U-test and Wilcoxon Two Sample Test.
- 1 Way, Randomized Blocks ANOVA -
Friedman's Method for Randomized Blocks.
- 1 Way, 2 Treatment, Randomized Blocks ANOVA -
Wilcoxon's Signed-Ranks Test for Two Groups.
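The Tied Ranks option described in the list above can be illustrated with a short sketch. This is a minimal Python illustration of average ranking, not CoStat's own code; the function name `tied_ranks` is hypothetical.

```python
def tied_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    # Indices of the values in ascending order of value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Mean of the 1-based positions start+1 .. end+1.
        average = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = average
        start = end + 1
    return ranks


print(tied_ranks([10, 20, 20, 30]))
```

The two tied values of 20 would otherwise occupy ranks 2 and 3, so each receives the average rank.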
Nonparametric Tests in the CoStat Manual
CoStat's manual has:
- An introduction to nonparametric testing.
- A description of the calculation methods that are used by the program.
- 8 complete sample runs.
The sample runs show how to do 8 different types of
nonparametric tests. Here is sample run #2:
Statistics : Nonparametric : Rank Correlation
Correlation is a measure of the linear association of
two independent variables (X1 and X2).
This procedure is analogous to the
Pearson product moment correlation coefficient, but it works
with the ranks of the values in each column, so it makes
no assumptions about the distribution of the values.
Read the general description of
Statistics : Nonparametric (page 333).
Statistics : Correlation (page 275)
calculates the Pearson product moment correlation coefficient.
See Sokal and Rohlf (1981 and 1995)
"Box 15.6 (1981) (or Box 15.7, 1995) Kendall's Coefficient of Rank
Correlation, tau" and
"Section 15.8 (1981 or 1995) Nonparametric tests for association"
(for Spearman's Coefficient of Rank Correlation).
The data file must have two or more columns.
The correlation of all pairs of
columns will be tested for the whole data file. Missing values
(NaN's, page 70)
are allowed; a row is rejected only if it has a missing value in either of the
two columns currently being tested.
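The missing-value rule above amounts to pairwise deletion: a row is dropped only for the pair of columns in which it has a missing value. A minimal Python sketch of the idea follows; the function name `complete_pairs` and the sample columns are hypothetical, and this is not CoStat's own code.

```python
import math


def complete_pairs(x, y):
    """Keep only the rows where neither of the two tested columns is NaN."""
    return [(a, b) for a, b in zip(x, y) if not (math.isnan(a) or math.isnan(b))]


nan = float("nan")
col1 = [1.0, 2.0, nan, 4.0]
col2 = [5.0, nan, 7.0, 8.0]
col3 = [9.0, 10.0, 11.0, 12.0]

# Rows 2 and 3 are rejected for the (col1, col2) pair ...
print(complete_pairs(col1, col2))
# ... but only row 3 is rejected for the (col1, col3) pair.
print(complete_pairs(col1, col3))
```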
- Choose the first data column.
- Choose the second data column.
- Keep If:
- lets you enter a boolean expression (for example,
(col(1)>50) and (col(2)<col(3))).
Each row of the data file is tested. If the equation evaluates to
true, that row of data will be used in the calculations.
If false, that row of data will be ignored. See
"Using Equations" (page 66).
- This leads to a list of characters (#32 to #255,
as defined by the ISO 8859-1 Character Encoding).
If you click on a character,
it will be inserted into the equation at the current insertion point.
- The f() button leads to a list of built-in functions
and other parts of equations.
If you click on an item, it will be inserted
into the equation at the current insertion point.
The list includes:
- Data file column numbers and names (for example,
"col(3) Height") -
so you can refer to values in
various columns in the data file.
Note that equations shouldn't refer to column names;
only the column reference is inserted (for example, "col(3)" is inserted,
not "col(3) Height").
- Built-in Functions (for example, "sin(x) d") -
the signatures for the functions are described tersely, but basically:
b=any boolean expression,
d=any numeric (double) expression,
i=any integer expression,
s=any string expression,
and v=void (no return value).
The letter at the end of the function's signature
indicates the type of the return value.
- Constants (for example, "pi").
- Operators (for example, "*").
See "Using Equations" (page 66).
- Run the procedure.
- Close the dialog box.
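The row-by-row behavior of Keep If can be sketched in Python. This is only an analogue of the example expression (col(1)>50) and (col(2)<col(3)); the data rows and the helper name `keep_if` are hypothetical, and CoStat's equation syntax is not Python.

```python
# Three hypothetical data rows, each with three columns.
rows = [
    (60.0, 1.0, 2.0),
    (40.0, 1.0, 2.0),
    (70.0, 5.0, 3.0),
]


def keep_if(row):
    """Python analogue of the expression (col(1)>50) and (col(2)<col(3))."""
    def col(k):
        # 1-based column access, mimicking col(k).
        return row[k - 1]
    return (col(1) > 50) and (col(2) < col(3))


# Rows for which the expression is true are used; the others are ignored.
kept = [row for row in rows if keep_if(row)]
print(kept)
```

Only the first row passes: the second fails the first condition and the third fails the second.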
For both the Kendall and Spearman correlation tests, the test
statistics are similar to the product moment correlation coefficient,
r, and range from -1 to 1.
If n>40, the significance
of Kendall's tau can be tested by calculating a test statistic,
ts, which the procedure compares to tabulated values of
Student's t distribution:
ts = tau / sqrt(2*(2*n+5)/(9*n*(n-1)))
where n is the number of data pairs.
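As a hedged illustration of the formula above, the following Python sketch computes a simple Kendall's tau and the test statistic ts. The tau function here is the plain version without a correction for ties (an assumption made for brevity; the method in Sokal and Rohlf handles ties, and CoStat's exact calculation is described in its manual, not here). The function names are hypothetical.

```python
import math


def kendall_tau_simple(x, y):
    """Kendall's tau without tie correction: (concordant - discordant) / (n(n-1)/2)."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def kendall_ts(tau, n):
    """ts = tau / sqrt(2*(2*n+5)/(9*n*(n-1))), intended for n > 40."""
    return tau / math.sqrt(2 * (2 * n + 5) / (9 * n * (n - 1)))


print(kendall_tau_simple([1, 2, 3], [1, 3, 2]))
print(kendall_ts(0.3, 50))
```

The denominator depends only on n, so for a fixed tau the test statistic grows as the number of data pairs grows.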
If n>10, the significance of Spearman's r can be tested by
calculating a test statistic, ts, which the procedure
compares to tabulated values of Student's t distribution:
ts = r / sqrt( (1-r^2) / (n-2) )
If n<=10, Spearman's r must be compared to tabular values which
are not included with CoStat, but can be found in
Sokal and Rohlf (1995).
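A minimal Python sketch of Spearman's r and its test statistic follows. It assumes that r is computed as the product moment correlation of the average (tied) ranks, which is a common formulation but is an assumption here rather than a statement of CoStat's method. The function names are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def spearman_r(x, y):
    """Product moment correlation of the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


def spearman_ts(r, n):
    """ts = r / sqrt((1 - r^2) / (n - 2)); undefined when r is exactly 1 or -1."""
    return r / math.sqrt((1 - r * r) / (n - 2))


r = spearman_r([1, 2, 3, 4, 5], [5, 6, 7, 8, 7])
print(r, spearman_ts(r, 5))
```

The call with n=5 only demonstrates the arithmetic; as noted above, the ts approach is intended for n>10.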
The Sample Run
The data for the sample run is from
Sokal and Rohlf
(Box 15.6, 1981; or Box 15.7, 1995): "Computation of
rank correlation coefficient between the total length (Y1)
of 15 aphid stem mothers and the mean thorax length (Y2) of
their parthenogenetic offspring."
First Column: 1) Y1
Last Column: 2) Y2
First Row: 1
Last Row: 15
For the sample run, use File : Open to open the file called
in the cohort directory and specify:
- From the menu bar, choose: Statistics : Nonparametric : Rank Correlation
- X1: 1) Y1
- X2: 2) Y2
- Keep If:
RANK CORRELATION (Kendall and Spearman Tests)
Y1 Column: 1) Y1
Y2 Column: 2) Y2
The test statistics, Kendall's tau and Spearman's r, are similar to
the product moment correlation coefficient, r, ranging from -1 to 1.
If the sample size is large enough (n>40 for tau and n>10 for r),
additional test statistics can be calculated and compared to
Student's t distribution (two-tailed, df=infinity). Otherwise, see
specially tabulated critical values of tau in Table S in 'Statistical
Tables' (F.J. Rohlf and R.R. Sokal, 1995).
If P<=0.05, tau or r is significantly different from 0 and the values
in the two columns probably are correlated.
Y1 column: 1) Y1
Y2 column           n       Kendall tau   P         Spearman r    P
------------------- ------- ------------- --------- ------------- ---------
2) Y2                    15 0.49761335153 (n<=40)   0.64910714286 .0088 **
P is the probability that the variates are not correlated.
The low P value (<=0.05) for this data set indicates that the two variates
probably are correlated.
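The Spearman figures in the sample output can be checked by hand with a short Python sketch. It applies ts = r / sqrt( (1-r^2) / (n-2) ) to the printed r and n, then integrates Student's t density numerically to get a two-tailed P. The choice of n-2 degrees of freedom is an assumption made for this sketch (it is the conventional choice for this statistic); the function names are hypothetical, and the result is meant to be compared with the P column above, not taken as CoStat's own calculation.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed P = 1 - 2 * (integral of the density from 0 to |t|).

    The integral uses Simpson's rule; steps must be even.
    """
    t = abs(t)
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return 1 - 2 * (total * h / 3)


r, n = 0.64910714286, 15  # Spearman r and n from the sample output
ts = r / math.sqrt((1 - r * r) / (n - 2))
p = two_tailed_p(ts, n - 2)  # assumption: df = n - 2
print(round(ts, 4), round(p, 4))
```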