About the GPEMjournal blog

This is the editor's blog for the journal Genetic Programming and Evolvable Machines. The official web site for the journal, maintained by the publisher (Springer), is here. The GPEMjournal blog is authored and maintained by Lee Spector.

Wednesday, August 31, 2011

95% confident = 950/1000 for any distribution

A technique I have seen used for statistical confidence testing of non-Gaussian distributions is to generate 1000 random examples from the distribution. If you want to be 95% confident that the answer to be checked comes from the same distribution, then it should be "like" 950 of the 1000 examples.

E.g., if the distribution is reasonably well behaved and the 1000 examples are sorted by value, then if the answer to be checked lies outside the range of the 25th to 975th example we can reject the null hypothesis at the 5% level and say our answer is not from the distribution used to generate the 1000 examples. We do not need Z-scores, t-tests etc.

This non-parametric test should be ok with any distribution. We are effectively burning CPU cycles rather than spending brain cycles on devising and validating a statistical technique specifically for our new distribution.
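
The test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `monte_carlo_reject`, its parameters, and the use of an exponential distribution as the "new" distribution are all hypothetical choices, not part of any particular library or of the original technique as I saw it used.

```python
import random


def monte_carlo_reject(observed, sampler, n=1000, confidence=0.95, rng=None):
    """Two-sided empirical test (hypothetical helper, for illustration only).

    Draws n examples with sampler(rng), sorts them, and reports whether
    `observed` lies outside the central `confidence` fraction of them.
    Returns (reject, low, high).
    """
    rng = rng or random.Random()
    samples = sorted(sampler(rng) for _ in range(n))

    # Number of examples cut off in each tail: 25 for n=1000 at 95%.
    tail = max(1, round(n * (1 - confidence) / 2))

    low = samples[tail - 1]        # 25th example (1-based) when n=1000
    high = samples[n - tail - 1]   # 975th example (1-based) when n=1000

    reject = observed < low or observed > high
    return reject, low, high


# Usage: an exponential distribution stands in for a non-Gaussian one.
def draw_exponential(rng):
    return rng.expovariate(1.0)


reject, low, high = monte_carlo_reject(1.0, draw_exponential,
                                       rng=random.Random(0))
print("reject:", reject, "range:", low, high)
```

Passing a seeded `random.Random` makes a run repeatable. With only 1000 examples the two cut-off values are themselves noisy, so an answer that falls close to either end of the range deserves a larger sample before drawing conclusions.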

3 comments:

  1. Bill, that is the Bootstrap you describe, invented by Bradley Efron in 1977. Bringing in the Bootstrap was my contribution to a 1985 paper with Larry Mueller in Ecology. By supposing that the sample distribution is the real distribution, by i.i.d. resampling of it, you can estimate not only confidence intervals, but bias in a statistic - if the statistic on your resampled data averages a value E different from that statistic on your original data, then you can guess that it is a distance of 2E from the statistic on the REAL distribution. -- Lee Altenberg http://dynamics.org/Altenberg/

  2. This comment has been removed by a blog administrator.

  3. This comment has been removed by a blog administrator.
