Performance

Simple multiprocessing in R

Continuing my effort to help you get the most out of your CPUs, I figured we could look into some of the multiprocessing functionality available for your R scripts. While there are a few different options for running multi-core treatments on your data, we'll focus on something really simple to put in place. A while back, I was putting together a script to run a large series of logistic regressions (using the glm function) in an attempt to model some data. [...]

March 14, 2016 | Categories: Performance, R

Factorial and Log Factorial

Factorial: When you need to calculate n!, you have several solutions. The "rush" solution is to use a loop or a recursive function:

    def factorial_for(n):
        r = 1
        for i in range(2, n + 1):
            r *= i
        return r

    def factorial_rec(n):
        if n > 1:
            return n * factorial_rec(n - 1)
        else:
            return 1

Here, multiplying the numbers sequentially creates a huge number very quickly. This works, but computers are faster when 2 small numbers (120x30240) are involved in a multiplication versus the [...]
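One way to act on the remark about multiplying two smaller numbers (120 x 30240 is 5! times 6·7·8·9·10, which gives 10!) is to split the range in half and multiply the two partial products. The sketch below assumes that reading; it is not necessarily the approach the full post takes. The names `range_product`, `factorial_split` and `log_factorial` are made up for this illustration. For the log factorial, it relies on the standard library's `math.lgamma`, since lgamma(n + 1) equals ln(n!).

```python
import math

def range_product(lo, hi):
    """Product of the integers lo..hi (inclusive), splitting the range in half."""
    if lo > hi:
        return 1
    if lo == hi:
        return lo
    mid = (lo + hi) // 2
    # Each half yields a number of similar size, instead of one huge
    # running product multiplied by one small factor at every step.
    return range_product(lo, mid) * range_product(mid + 1, hi)

def factorial_split(n):
    """n! computed with the range-splitting product (hypothetical helper)."""
    return range_product(2, n)

def log_factorial(n):
    """ln(n!) without ever building n! itself."""
    return math.lgamma(n + 1)
```

Whether the split version is actually faster depends on the size of n and on the Python build, so it is worth timing against `factorial_for` on your own data before adopting it.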

February 22, 2016 | Categories: Bioinformatics, Data Analysis, Performance, Python

[Python] Iterators vs Generators

In Python, there are iterators and generators. You probably already use iterators without even knowing that you do so. But understanding the difference between those two concepts is really important since choosing one over the other has a huge impact on memory usage. If you are working with small datasets, memory usage might not be your first concern. However, with big datasets, it is another story. So what are they exactly, iterators and generators?

Iterators

The process of going through [...]
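To make the memory point concrete, here is a minimal sketch that compares a fully built list with a generator producing the same values. The function names and the size `n` are assumptions chosen for the illustration, and `sys.getsizeof` only measures the container itself, not the integers it refers to.

```python
import sys

def squares_list(n):
    """Build every value up front and keep them all in memory."""
    return [i * i for i in range(n)]

def squares_gen(n):
    """Yield one value at a time; nothing is stored between steps."""
    for i in range(n):
        yield i * i

n = 100_000
as_list = squares_list(n)
as_gen = squares_gen(n)

list_bytes = sys.getsizeof(as_list)  # grows with n
gen_bytes = sys.getsizeof(as_gen)    # small and independent of n

total_from_list = sum(as_list)
total_from_gen = sum(as_gen)         # consumes the generator
```

Both sums are identical, but the generator can only be traversed once: after `sum(as_gen)` it is exhausted, whereas the list can be iterated again.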

September 18, 2015 | Categories: Bioinformatics, Performance, Python, Uncategorized

Put Those CPUs to Good Use !

If you're like me, you've probably noticed that, by default, the Python scripts we write only use a portion of the processing power at our disposal. As such, you've probably said to yourself: Hey, I paid good money for a quad-core CPU! What's happening? While it's true that nowadays most CPUs are multi-core, the code we write must also be tailored appropriately in order to make use of more than one core at a time. So let's dive into [...]
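As a rough illustration of tailoring code to use several cores, the sketch below spreads a CPU-heavy function over a pool of worker processes with the standard library's `multiprocessing.Pool`. The workload (`count_primes`) and the input sizes are invented for the example; the full post may well use a different approach.

```python
from multiprocessing import Pool, cpu_count

def count_primes(limit):
    """Count primes below limit by trial division (deliberately CPU-heavy)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

if __name__ == "__main__":
    # Hypothetical workload: one independent task per limit.
    limits = [20_000, 30_000, 40_000, 50_000]
    # One worker process per available core; map() splits the tasks among them.
    with Pool(processes=cpu_count()) as pool:
        results = pool.map(count_primes, limits)
    print(dict(zip(limits, results)))
```

The `if __name__ == "__main__":` guard matters on platforms that start workers by re-importing the script, and the worker function has to live at module level so it can be sent to the other processes.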

July 12, 2015 | Categories: Performance, Python

Be better at programming with static program analysis

- What is static program analysis? Static program analysis gathers information about the execution behaviour of your code without actually executing it. It is the opposite of dynamic program analysis (like debugging), which requires the code to be executed.
- Ok! But why should I use this in practice? To save time by eliminating the save/execute cycles induced by syntax errors (missing ";", function or variable not initialized, typos, ...). Correcting these errors at the [...]
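A very small taste of the idea, as a sketch: Python's standard `ast` module can parse source code and report syntax errors without running it. The helper name `check_syntax` is made up for this example, and real static analysers go much further (undefined names, unused variables, type errors).

```python
import ast

def check_syntax(source):
    """Return None if source parses, else a short description of the error.

    The code is only parsed, never executed.
    """
    try:
        ast.parse(source)
    except SyntaxError as err:
        return f"line {err.lineno}: {err.msg}"
    return None

good = "def add(a, b):\n    return a + b\n"
bad = "def add(a, b:\n    return a + b\n"
# This snippet would fail at runtime, yet it is syntactically valid,
# which shows that nothing gets executed during the check.
never_run = "raise RuntimeError('this is never executed')\n"

good_result = check_syntax(good)
bad_result = check_syntax(bad)
never_run_result = check_syntax(never_run)
```

Here `good_result` and `never_run_result` are `None`, while `bad_result` holds a message pointing at the offending line.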

May 8, 2015 | Categories: Performance, Python, R, Web development