Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Stirling's approximation to expand range of Poisson pmf #1

Merged
merged 1 commit into from Oct 31, 2012
Merged

Use Stirling's approximation to expand range of Poisson pmf #1

merged 1 commit into from Oct 31, 2012

Conversation

run4flat
Copy link
Member

The Poisson distribution is well behaved for large input values,
but the factorial form is not numerically stable and has a hard-coded
limit for its input values. Using Stirling's Approximation for the
logarithm of the factorial, we can rewrite the probability as a
sum of the logarithm of its terms, which is numerically stable.

This commit includes two new Poisson distribution functions as well
as modifications to the current distribution function. The factorial
form is called pmf_poisson_factorial and is a slight variant on the
original code that sets all probabilities for large input x to zero.
Perhaps a value of -1 would be better. The Stirling form is called
pmf_poisson_stirling and uses Stirling's approximation for all
values of x. This leads to numerically inaccurate results for very
small values of the input argument (though the inaccuracy is minor).
The Combined form is called pmf_poisson and uses the factorial form
for small values of x (roughly < 85) and the Stirling form for large
values of x.

Results from the Stirling and factorial forms differ, but that
difference falls within numerical bit noise for x roughly at
50.

This commit does not include tests. It is intended to demonstrate
the different forms. If accepted, tests should be fairly easy and
should follow shortly.

The Poisson distribution is well behaved for large input values,
but the factorial form is not numerically stable and has a hard-coded
limit for its input values. Using Stirling's Approximation for the
logarithm of the factorial, we can rewrite the probability as a
sum of the logarithm of its terms, which is numerically stable.

This commit includes two new Poisson distribution functions as well
as modifications to the current distribution function. The factorial
form is called pmf_poisson_factorial and is a slight variant on the
original code that sets all probabilities for large input x to zero.
Perhaps a value of -1 would be better. The Stirling form is called
pmf_poisson_stirling and uses Stirling's approximation for all
values of x. This leads to numerically inaccurate results for very
small values of the input argument (though the inaccuracy is minor).
The Combined form is called pmf_poisson and uses the factorial form
for small values of x (roughly < 85) and the Stirling form for large
values of x.

Results from the Stirling and factorial forms differ, but that
difference falls within numerical bit noise for x roughly at
50.

This commit does not include tests. It is intended to demonstrate
the different forms. If accepted, tests should be fairly easy and
should follow shortly.
maggiexyz added a commit that referenced this pull request Oct 31, 2012
Use Stirling's approximation to expand range of Poisson pmf
@maggiexyz maggiexyz merged commit 33878e0 into PDLPorters:master Oct 31, 2012
@run4flat
Copy link
Member Author

Maggie, be sure to remind me to write those tests. They will be numerical, ensuring that the two forms agree where the overlap is expected to work. I'm busy for the next couple of days, but don't let me forget!

@mohawk2
Copy link
Member

mohawk2 commented Aug 18, 2015

@run4flat Please write those tests :-)

@zmughal
Copy link
Member

zmughal commented Aug 18, 2015

Hi @run4flat, I know you're really busy these days.

If you want, I can write the tests if you give me a description of what you planned and then I'll ping you for a code review.

@run4flat
Copy link
Member Author

@zmughal, the idea was to ensure that this comment is correct:

Results from the Stirling and factorial forms differ, but that
difference falls within numerical bit noise for x roughly at
50.

Basic test:

  1. Run pmf_poisson_factorial(sequence(1000) + 1) and look for the index of the first zero. The index value (plus 1) will give you the value of GSL_SF_FACT_NMAX.
  2. Create a sequence from 50 to GSL_SF_FACT_NMAX
  3. Run pmf_poisson_factorial and pmf_poisson_stirling on said sequence.
  4. Ensure that the returned results are indeed within 'bit noise' of each other by checking that their relative difference is small compared to some pre-determined tolerance.

@run4flat
Copy link
Member Author

BTW, the logic for determining GSL_FS_FACT_NMAX is based on this line: 33878e0#diff-50756f500629aacd2997df957161b91cR1153

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants