Source: ecs.wgtn.ac.nz
Why Statistical Significance Tests?
• Suppose we have developed an EC algorithm A
• We want to compare with another EC algorithm B
• Both algorithms are stochastic
• How can we be sure that A is better than B?
• Assume we run A and B once, and get the results x
and y, respectively.
• If x < y (minimisation), is it because A is better than
B, or just because of randomness?
Why Statistical Significance Tests?
• Treat a stochastic algorithm as a random number
generator, and its output follows some distribution
• The random output depends on the algorithm and
random seed
• Collect samples: run each algorithm many times
independently (using a different random seed for each run)
• Carry out statistical significance tests based on the
collected samples
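The sample-collection step can be sketched as follows. This is a minimal illustration, not a real EC algorithm: `run_algorithm` is a hypothetical toy random search minimising x², and the step sizes, seed ranges and number of runs (30) are assumptions chosen only for the example.

```python
import random
import statistics


def run_algorithm(seed: int, step: float) -> float:
    """Hypothetical stand-in for one run of a stochastic minimiser.

    Performs a simple random search on f(x) = x^2 and returns the best
    objective value found. The result depends only on the seed and step.
    """
    rng = random.Random(seed)  # private generator, so runs are independent
    x = 5.0
    best = x * x
    for _ in range(200):
        candidate = x + rng.gauss(0, step)
        if candidate * candidate < best:
            x, best = candidate, candidate * candidate
    return best


def collect_samples(step: float, seeds) -> list:
    """Run the algorithm once per seed and return the list of results."""
    return [run_algorithm(seed, step) for seed in seeds]


# "Algorithm A" and "algorithm B" differ only in step size here (assumption).
samples_a = collect_samples(0.5, range(0, 30))
samples_b = collect_samples(0.05, range(100, 130))

print(statistics.mean(samples_a), statistics.stdev(samples_a))
print(statistics.mean(samples_b), statistics.stdev(samples_b))
```

Giving each run its own `random.Random(seed)` keeps the runs independent and makes every sample reproducible from its seed.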
Statistical Significance Test
• Parametric/Non-parametric: assume/do not assume that
the random variables follow a normal distribution
• Paired/Unpaired: the samples of the two algorithms
are/are not matched one-to-one
                 Unpaired            Paired
Parametric       t-test/z-test       Paired t-test
Non-parametric   Wilcoxon rank sum   Wilcoxon signed rank
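As an illustration of the unpaired, non-parametric entry, the sketch below computes a Wilcoxon rank sum statistic by hand. It uses the large-sample normal approximation without a tie correction for the variance, which is a simplification chosen for brevity; the function names `average_ranks` and `rank_sum_test` are hypothetical, and a statistics library would normally be used in practice.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(a, b):
    """Two-tailed Wilcoxon rank sum test via the normal approximation.

    Returns (z, p). A negative z means the values in `a` tend to be smaller.
    """
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    w = sum(ranks[:n1])                      # rank sum of the first sample
    mean_w = n1 * (n1 + n2 + 1) / 2          # expected rank sum under H0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


z, p = rank_sum_test([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
print(z, p)
```

Only the ranks of the pooled results enter the statistic, so no normality assumption on the algorithm outputs is needed.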
One-sample z-test
• The z-test is used when
• Test the population mean using
– The sample mean
– The sample standard deviation (σ)
– The number of samples
[Figure: rejection regions z < -2 and z > 2]
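The three ingredients listed above combine into the usual z statistic, z = (x̄ − μ0) / (σ / √n), where μ0 is the hypothesised population mean. A minimal sketch follows; the function name `one_sample_z`, the symbol μ0 and the sample values are assumptions made for illustration, and the small sample is used only to keep the arithmetic easy to follow.

```python
import math
import statistics


def one_sample_z(samples, mu0):
    """Return the z statistic for testing whether the population mean is mu0.

    Uses the sample mean, the sample standard deviation and the number of
    samples, as listed on the slide.
    """
    n = len(samples)
    mean = statistics.mean(samples)
    sd = statistics.stdev(samples)  # sample standard deviation (n - 1 divisor)
    return (mean - mu0) / (sd / math.sqrt(n))


# Made-up results from 8 runs; a real z-test would use many more.
results = [2, 4, 4, 4, 5, 5, 7, 9]
z = one_sample_z(results, mu0=4)
print(z)
print("reject" if abs(z) > 2 else "do not reject")
```

A |z| larger than about 2 means the sample mean lies more than two standard errors away from the hypothesised mean.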
One-sample z-test
• (Null) hypothesis: the population mean equals a
hypothesised value, $H_0: \mu = \mu_0$
• Reject the hypothesis if the samples do not support
it statistically: for a two-tailed test at the 0.05
significance level, reject if z < -2 or z > 2. (The exact
critical value at the 0.05 level is 1.96; we use 2 as a
rough value.)
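• P-value ($\Phi$ is the standard normal cumulative
distribution function)
– $p = 2\,(1 - \Phi(|z|))$ for two-tailed
– $p = \Phi(z)$ for lower-tailed
– $p = 1 - \Phi(z)$ for upper-tailed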
• Reject the hypothesis if p-value < significance level
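The p-value rule can be sketched with the standard library's `statistics.NormalDist`. The function names `p_value` and `reject` and the `tail` argument are hypothetical names introduced for this example; the three tail formulas are the standard ones for a standard normal test statistic.

```python
from statistics import NormalDist


def p_value(z: float, tail: str = "two") -> float:
    """Return the p-value of a z statistic under the standard normal."""
    phi = NormalDist().cdf  # standard normal cumulative distribution function
    if tail == "two":
        return 2 * (1 - phi(abs(z)))
    if tail == "lower":
        return phi(z)
    if tail == "upper":
        return 1 - phi(z)
    raise ValueError("tail must be 'two', 'lower' or 'upper'")


def reject(z: float, alpha: float = 0.05, tail: str = "two") -> bool:
    """Reject the null hypothesis if the p-value is below alpha."""
    return p_value(z, tail) < alpha


print(p_value(1.96))   # close to 0.05, the exact two-tailed critical value
print(reject(2.0))     # the rough threshold of 2 also leads to rejection
```

Comparing the p-value with the significance level is equivalent to comparing z with the critical value, but it works unchanged for any significance level and any tail.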