Factorial:
When you need to calculate n!, you have several solutions.
- The “rush” solution: using a loop or a recursive function:
def factorial_for(n): r = 1 for i in range(2, n + 1): r *= i return(r)
def factorial_rec(n): if n > 1: return(n * factorial_rec(n - 1)) else: return(1)
Here, the multiplication of the numbers sequentially will create a huge number very quickly. This is good, but computers are faster when 2 small numbers (120×30240) are involved in a multiplication versus the same multiplication with one huge number (362880×10).
- The “smart” solution: Here, the idea is to split the multiplication sequence in two recursively, taking into account the characteristics of computers.
10! = 1x2x3x4x5x6x7x8x9 x 10 = 362880 x 10 (rush solution) = 1x2x3x4x5 x 6x7x8x9x10 = 120 x 30240 (1st split) = 1x2x3 x 4×5 x 6x7x8 x 9×10 = 6 x 20 x 336 x 90 (2nd split) def range_prod(lo, hi): if lo + 1 < hi: mid = (hi + lo) // 2 return(range_prod(lo, mid) * range_prod(mid + 1, hi)) if lo == hi: return(lo) return(lo * hi)
def factorial_tree(n): if n < 2: return(1) return(range_prod(1, n))
- The “lazy” or “very smart one” (it depends on the point of view): calculating an estimate with Stirling’s approximation .
import math def stirling(n): return(math.sqrt(2 * math.pi * n) * (n / math.e) ** n)
This implementation is limited to n<170, but as you can see below at this number the result is huge. For this reason we don’t work with the factorial directly. We prefer the log factorial.
In [01]: stirling(170) Out[01]: 7.253858934542908e+306 In [02]: stirling(171) Out[02]: inf
Log Factorial:
For the two first solutions, you just need to add math.log() (with an import math):
|
|
def log_fact_for(n):
return(math.log(factorial_for(n)))
And the log factorial with Stirling’s approximation is implemented like this:
def log_stirling(n):
fn = float(n)
fn = n * math.log(fn) + math.log(2 * math.pi * fn) / 2 - fn + \
(fn ** -1) / 12 - (fn ** -3) / 360 + (fn ** -5) / 1260 - \
(fn ** -7) / 1680 + (fn ** -9) / 1188
return(fn)
Note:
In the case where you have multiple log factorials to calculate, it may be useful to save or use previous results (10! = 6!x7x8x9x10) like this:
def log_fact_exact_saved(n, value_saved):
select = max((i for i in value_saved.keys() if i <= n))
if select != n:
value_saved[n] = range_prod(select + 1, n) * value_saved[select]
return(math.log(value_saved[n]))
def log_fact_approx_saved(n, value_saved):
if n not in value_saved:
value_saved[n] = log_stirling(n)
return(value_saved[n])
Test:
Now we can test each solution and compare it with the library math
and scipy
:
import sys
sys.setrecursionlimit(1000000) // increase the depth
import math
import scipy.special
def test_saved(saved, approx):
value = [10, 50, 100, 1000, 1000, 2000, 5000, 5000, 5000,
5000, 10000, 11000, 11000, 11000, 20000, 20000]
if approx:
if saved:
value_saved = {1: 1}
for num in value:
log_fact_approx_saved(num, value_saved)
else:
for num in value:
log_stirling(num)
else:
if saved:
value_saved = {1: 1}
for num in value:
log_fact_exact_saved(num, value_saved)
else:
for num in value:
log_fact_tree(num)
def log_fact_math(n):
return(math.log(math.factorial(n)))
def log_fact_scipy(n, exact=False):
return(math.log(scipy.special.factorial(n, exact)))
import timeit
num = 10000
rep = 200
print("rec:" + str(timeit.timeit("log_fact_rec(" + str(num) + ")",
setup="from __main__ import log_fact_rec", number=rep)))
print("for:" + str(timeit.timeit("log_fact_for(" + str(num) + ")",
setup="from __main__ import log_fact_for", number=rep)))
print("scipy exact:" + str(timeit.timeit("log_fact_scipy(" + str(num) + ", True)",
setup="from __main__ import log_fact_scipy", number=rep)))
print("math:" + str(timeit.timeit("log_fact_math(" + str(num) + ")",
setup="from __main__ import log_fact_math", number=rep)))
print("tree:" + str(timeit.timeit("log_fact_tree(" + str(num) + ")",
setup="from __main__ import log_fact_tree", number=rep)))
print("scipy approx:" + str(timeit.timeit("log_fact_scipy(" + str(num) + ")",
setup="from __main__ import log_fact_scipy", number=rep)))
print("log_stirling:" + str(timeit.timeit("log_stirling(" + str(num) + ")",
setup="from __main__ import log_stirling", number=rep)))
print("value log(100!) tree:" + str(log_fact_tree(100)))
print("value log(100!) scipy approx:" + str(log_fact_scipy(100)))
print("value log(100!) log_stirling:" + str(log_stirling(100)))
print("severals factorial:" + str(timeit.timeit("test_saved(False, False)",
setup="from __main__ import test_saved", number=rep)))
print("severals factorial saved:" + str(timeit.timeit("test_saved(True, False)",
setup="from __main__ import test_saved", number=rep)))
print("severals factorial approx:" + str(timeit.timeit("test_saved(False, True)",
setup="from __main__ import test_saved", number=rep)))
print("severals factorial approx saved:" + str(timeit.timeit("test_saved(True, True)",
setup="from __main__ import test_saved", number=rep)))
With python 2.7.6
and scipy 0.17.0
, we have:
rec:4.52898597717 for:4.09089803696 scipy exact:4.08975815773 math:4.02955198288 tree:1.2138531208 |
You can see that the “smart” solution is faster, even versus scipy |
scipy approx:0.00166201591492 log_stirling:0.000301837921143 |
Again scipy is slowest |
value log(100!) tree:363.739375556 value log(100!) scipy approx:363.739375556 value log(100!) log_stirling:363.739375556 |
The log transformation estimate simulations are equal with n>=100 |
several factorials:14.3231370449 several factorials saved:3.51164102554 several factorials approx:0.00481510162354 several factorials approx saved:0.00360107421875 |
Saving previous results can save a lot of time, but it will take more memory. |
I hope this blog post will help you to choose the best solution to compute factorial or log factorial, according to your needs.
Leave A Comment