Permutation test
While working on my thesis I needed to calculate the statistical significance of differences in performance between a couple of classifiers and information retrieval methods that I was comparing. One of the limitations I was working under was that my gold standard was very small (~100 items), so running multiple resampled tests and using a t-test or something similar didn’t seem like a viable option. I came across the following papers describing the use of the permutation test:
Mark D. Smucker, James Allan, and Ben Carterette. 2007. A comparison of statistical significance tests for information retrieval evaluation. In Proceedings of the sixteenth ACM Conference on Information and Knowledge Management (CIKM ’07). ACM, New York, NY, USA, 623-632. DOI=10.1145/1321440.1321528 http://doi.acm.org/10.1145/1321440.1321528
Alexander Yeh. 2000. More accurate tests for the statistical significance of result differences. In Proceedings of the 18th conference on Computational linguistics – Volume 2. Association for Computational Linguistics, Morristown, NJ, USA, 947-953. DOI=10.3115/992730.992783 http://dx.doi.org/10.3115/992730.992783
The method seemed like exactly what I was looking for. Unfortunately, I could not find an implementation that was easily accessible and usable, so I rolled my own. If you find this useful, please let me know. More importantly, let me know if you find any errors or have any suggestions.
You can download the code at: RandomPermutation.php
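For a rough idea of what a permutation test does, here is a minimal Python sketch of a paired (sign-flip) randomization test on per-item scores from two systems. It is an illustration only and not the contents of RandomPermutation.php: the function name, its parameters, the choice of mean difference as the test statistic, and the add-one correction on the p-value are all assumptions made for this example.

```python
import random


def paired_permutation_test(scores_a, scores_b, num_rounds=10000, seed=None):
    """Approximate two-sided p-value for the difference in mean score.

    scores_a and scores_b are per-item scores of two systems evaluated
    on the same items, in the same order (hypothetical input format).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems must be scored on the same items")
    if not scores_a:
        raise ValueError("need at least one item")

    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    observed = abs(sum(diffs)) / n

    at_least_as_extreme = 0
    for _ in range(num_rounds):
        # Under the null hypothesis the two system labels are exchangeable
        # for each item, so swapping them just flips the sign of that
        # item's difference.
        permuted = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(permuted) / n >= observed:
            at_least_as_extreme += 1

    # Add-one correction: the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (num_rounds + 1)
```

Calling `paired_permutation_test(scores_a, scores_b)` with two equal-length lists returns an estimated p-value; a small value suggests the observed difference would be unlikely if the two systems were interchangeable. Because the shuffles are sampled at random rather than enumerated, the result varies slightly from run to run unless `seed` is fixed.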