ASPN ActiveState Programmer Network
scipy-dev
Re: [SciPy-dev] scipy.stats.sem is wrong
by David Huard
Nov 15 2006 6:23AM
Roman,

A couple of months ago we had Statistical Review Month, when users and devs
were asked to look at functions from stats, weed out the duplicates, add
docstrings, etc. If I remember correctly, at the end of the month,
unreviewed functions were to be moved to the sandbox (a good incentive if
you ask me). The work has started (thanks to Robert), but it's not finished. If
you want to have a go at it, look at the scipy trac site; there are dozens
of open tickets for statistical functions. That's also the place to submit
patches.

http://projects.scipy.org/scipy/scipy/report/8

Regards,
David




2006/11/15, Roman Bertle <bertle@[...].org> :
> 
>  Hello,
> 
>  I think scipy.stats.sem is wrong. It gives the same result as
>  scipy.stats.stderr (dividing by N-1 rather than N), whereas
>  scipy.stats.tsem uses N and gives the correct result. I have attached a
>  patch correcting this.
> 
>  Related to this, I wonder why there are so many related functions in
>  scipy.stats doing the same thing in slightly different ways. E.g. there
>  are nanstd, std, tstd; some use numpy.std, some do not; some take an axis
>  argument, some do not. And there are samplestd and samplevar, but sampleerr
>  is called sem instead. Shouldn't these functions be unified somehow?
> 
>  Regards,
> 
>  Roman
>  -------------------------
>  diff -rud python-scipy-0.5.1/Lib/stats/stats.py python-scipy-0.5.1-new/Lib/stats/stats.py
>  --- python-scipy-0.5.1/Lib/stats/stats.py       2006-08-29 11:58:37.000000000 +0200
>  +++ python-scipy-0.5.1-new/Lib/stats/stats.py   2006-11-15 12:18:23.000000000 +0100
>  @@ -1166,9 +1166,7 @@
>  integer (the axis over which to operate)
>  """
>       a, axis = _chk_asarray(a, axis)
>  -    n = a.shape[axis]
>  -    s = samplestd(a,axis) / sqrt(n-1)
>  -    return s
>  +    return samplestd(a,axis) / float(sqrt(a.shape[axis]))
> 
> 
>  def z(a, score):
>  diff -rud python-scipy-0.5.1/Lib/stats/tests/test_stats.py python-scipy-0.5.1-new/Lib/stats/tests/test_stats.py
>  --- python-scipy-0.5.1/Lib/stats/tests/test_stats.py    2006-08-29 11:58:37.000000000 +0200
>  +++ python-scipy-0.5.1-new/Lib/stats/tests/test_stats.py        2006-11-15 12:11:29.000000000 +0100
>  @@ -740,15 +740,16 @@
>  ##        assert_approx_equal(y,0.775177399)
>           y = scipy.stats.stderr(self.testcase)
>           assert_approx_equal(y,0.6454972244)
>  +
>       def check_sem(self):
>           """
>           this is not in R, so used
>  -        sqrt(var(testcase)*3/4)/sqrt(3)
>  +        sqrt(samplevar(testcase))/sqrt(4)
>           """
>           #y = scipy.stats.sem(self.shoes[0])
>           #assert_approx_equal(y,0.775177399)
>           y = scipy.stats.sem(self.testcase)
>  -        assert_approx_equal(y,0.6454972244)
>  +        assert_approx_equal(y,0.5590169944)
> 
>       def check_z(self):
>           """
>  -------------------------
> 
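For reference, the two conventions at issue in the quoted patch can be compared numerically. This is only a minimal sketch using plain NumPy, not the 2006 scipy.stats functions. The four-element array is an assumption, chosen because it reproduces both expected values that appear in the patched test. The variable names are illustrative.

```python
import numpy as np

# Hypothetical data (an assumption): it reproduces both values in the test diff.
data = np.array([1.0, 2.0, 3.0, 4.0])
n = data.shape[0]

# Standard deviation with an N denominator (what the patch calls samplestd).
std_n = np.std(data, ddof=0)

# Pre-patch formula: samplestd / sqrt(n - 1).
# Algebraically this equals the N-1 standard deviation divided by sqrt(n).
sem_before_patch = std_n / np.sqrt(n - 1)

# Post-patch formula: samplestd / sqrt(n).
sem_after_patch = std_n / np.sqrt(n)

print(f"{sem_before_patch:.10f}")  # 0.6454972244
print(f"{sem_after_patch:.10f}")   # 0.5590169944
```

The two results differ only in whether the N-1 correction is applied to the variance, so the choice comes down to which definition the function is meant to implement.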