# Factorial and Log Factorial

## Factorial:

When you need to calculate n!, you have several solutions.

1. The “rush” solution: using a loop or a recursive function:
 def factorial_for(n): r = 1 for i in range(2, n + 1): r *= i return(r)  def factorial_rec(n): if n > 1: return(n * factorial_rec(n - 1)) else: return(1)

Here, the multiplication of the numbers sequentially will create a huge number very quickly. This is good, but computers are faster when 2 small numbers (120×30240) are involved in a multiplication versus the same multiplication with one huge number (362880×10).

2. The “smart” solution: Here, the idea is to split the multiplication sequence in two recursively, taking into account the characteristics of computers.
 10! = 1x2x3x4x5x6x7x8x9 x 10 = 362880 x 10 (rush solution) = 1x2x3x4x5 x 6x7x8x9x10 = 120 x 30240 (1st split) = 1x2x3 x 4×5 x 6x7x8 x 9×10 = 6 x 20 x 336 x 90 (2nd split)
def range_prod(lo, hi):
if lo + 1 < hi:
mid = (hi + lo) // 2
return(range_prod(lo, mid) * range_prod(mid + 1, hi))
if lo == hi:
return(lo)
return(lo * hi)

def factorial_tree(n):
if n < 2:
return(1)
return(range_prod(1, n))
3. The “lazy” or “very smart one” (it depends on the point of view): calculating an estimate with Stirling’s approximation .
import math
def stirling(n):
return(math.sqrt(2 * math.pi * n) * (n / math.e) ** n)

This implementation is limited to n<170, but as you can see below at this number the result is huge. For this reason we don’t work with the factorial directly. We prefer the log factorial.

In [01]: stirling(170)
Out[01]: 7.253858934542908e+306
In [02]: stirling(171)
Out[02]: inf


## Log Factorial:

For the two first solutions, you just need to add math.log() (with an import math):

 def log_fact_rec(n): return(math.log(factorial_rec(n))) def log_fact_tree(n): return(math.log(factorial_tree(n)))
def log_fact_for(n):
return(math.log(factorial_for(n)))

And the log factorial with Stirling’s approximation is implemented like this:

def log_stirling(n):
fn = float(n)
fn = n * math.log(fn) + math.log(2 * math.pi * fn) / 2 - fn + \
(fn ** -1) / 12 - (fn ** -3) / 360 + (fn ** -5) / 1260 - \
(fn ** -7) / 1680 + (fn ** -9) / 1188
return(fn)

Note:
In the case where you have multiple log factorials to calculate, it may be useful to save or use previous results (10! = 6!x7x8x9x10) like this:

def log_fact_exact_saved(n, value_saved):
select = max((i for i in value_saved.keys() if i <= n))
if select != n:
value_saved[n] = range_prod(select + 1, n) * value_saved[select]
return(math.log(value_saved[n]))

def log_fact_approx_saved(n, value_saved):
if n not in value_saved:
value_saved[n] = log_stirling(n)
return(value_saved[n])

## Test:

Now we can test each solution and compare it with the library math and scipy:


import sys
sys.setrecursionlimit(1000000) // increase the depth
import math
import scipy.special
def test_saved(saved, approx):
value = [10, 50, 100, 1000, 1000, 2000, 5000, 5000, 5000,
5000, 10000, 11000, 11000, 11000, 20000, 20000]
if approx:
if saved:
value_saved = {1: 1}
for num in value:
log_fact_approx_saved(num, value_saved)
else:
for num in value:
log_stirling(num)
else:
if saved:
value_saved = {1: 1}
for num in value:
log_fact_exact_saved(num, value_saved)
else:
for num in value:
log_fact_tree(num)

def log_fact_math(n):
return(math.log(math.factorial(n)))

def log_fact_scipy(n, exact=False):
return(math.log(scipy.special.factorial(n, exact)))

import timeit
num = 10000
rep = 200
print("rec:" + str(timeit.timeit("log_fact_rec(" + str(num) + ")",
setup="from __main__ import log_fact_rec", number=rep)))
print("for:" + str(timeit.timeit("log_fact_for(" + str(num) + ")",
setup="from __main__ import log_fact_for", number=rep)))
print("scipy exact:" + str(timeit.timeit("log_fact_scipy(" + str(num) + ", True)",
setup="from __main__ import log_fact_scipy", number=rep)))
print("math:" + str(timeit.timeit("log_fact_math(" + str(num) + ")",
setup="from __main__ import log_fact_math", number=rep)))
print("tree:" + str(timeit.timeit("log_fact_tree(" + str(num) + ")",
setup="from __main__ import log_fact_tree", number=rep)))

print("scipy approx:" + str(timeit.timeit("log_fact_scipy(" + str(num) + ")",
setup="from __main__ import log_fact_scipy", number=rep)))
print("log_stirling:" + str(timeit.timeit("log_stirling(" + str(num) + ")",
setup="from __main__ import log_stirling", number=rep)))

print("value log(100!) tree:" + str(log_fact_tree(100)))
print("value log(100!) scipy approx:" + str(log_fact_scipy(100)))
print("value log(100!) log_stirling:" + str(log_stirling(100)))

print("severals factorial:" + str(timeit.timeit("test_saved(False, False)",
setup="from __main__ import test_saved", number=rep)))
print("severals factorial saved:" + str(timeit.timeit("test_saved(True, False)",
setup="from __main__ import test_saved", number=rep)))
print("severals factorial approx:" + str(timeit.timeit("test_saved(False, True)",
setup="from __main__ import test_saved", number=rep)))
print("severals factorial approx saved:" + str(timeit.timeit("test_saved(True, True)",
setup="from __main__ import test_saved", number=rep)))


With python 2.7.6 and scipy 0.17.0, we have:

 rec:4.52898597717 for:4.09089803696 scipy exact:4.08975815773 math:4.02955198288 tree:1.2138531208 You can see that the “smart” solution is faster, even versus scipy scipy approx:0.00166201591492 log_stirling:0.000301837921143 Again scipy is slowest value log(100!) tree:363.739375556 value log(100!) scipy approx:363.739375556 value log(100!) log_stirling:363.739375556 The log transformation estimate simulations are equal with n>=100 several factorials:14.3231370449 several factorials saved:3.51164102554 several factorials approx:0.00481510162354 several factorials approx saved:0.00360107421875 Saving previous results can save a lot of time, but it will take more memory.

I hope this blog post will help you to choose the best solution to compute factorial or log factorial, according to your needs.

By | 2017-04-29T15:33:07+00:00 February 22, 2016|Categories: Performance, Python|Tags: |0 Comments