Performance

Parallelize your Python!

This article explains what multithreading and multicore processing are, and in what circumstances each can be used. Does your nerd friend keep telling you about his occupational habit of wanting to parallelize and optimize his time? Do you wish to understand it as well, and save time by parallelizing your Python programs? Then this article is what you need! You will be able to save a lot of time thanks to a small dose of parallelism [...]
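As a rough illustration of the thread side of that distinction, here is a minimal sketch using the standard library's concurrent.futures. The function fake_io_task and its delay are hypothetical stand-ins for real I/O-bound work (a download, a database query); they are not taken from the article itself.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_task(task_id, delay=0.05):
    """Hypothetical stand-in for an I/O-bound job: it only sleeps, then returns."""
    time.sleep(delay)  # sleeping releases the GIL, so threads can overlap here
    return task_id * 2


def run_sequential(task_ids):
    """Run the tasks one after another."""
    return [fake_io_task(t) for t in task_ids]


def run_threaded(task_ids, workers=4):
    """Run the same tasks on a small thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fake_io_task, task_ids))
```

Both functions return the same list, but the threaded version lets the waiting periods overlap. For CPU-bound work, threads generally do not help in standard CPython because of the global interpreter lock; ProcessPoolExecutor from the same module exposes the same map interface and is the usual way to spread such work across cores.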

March 29, 2016 | Categories: Performance, Python

Simple multiprocessing in R

Continuing my effort to help you get the most out of your CPUs, I figured we could look into the multiprocessing functionality available for your R scripts. While there are a few different options for running multi-core processing on your data, we'll focus on something really simple to put in place. A while back, I was putting together a script to run a large series of logistic regressions (using the glm function) in an attempt to model some data. [...]

March 14, 2016 | Categories: Performance, R

Factorial and Log Factorial

Factorial: When you need to calculate n!, you have several solutions.

The "rush" solution: using a loop or a recursive function:

    def factorial_for(n):
        # Multiply 2, 3, ..., n into a running product.
        r = 1
        for i in range(2, n + 1):
            r *= i
        return r

    def factorial_rec(n):
        # n! = n * (n - 1)!, with 0! = 1! = 1 as the base case.
        if n > 1:
            return n * factorial_rec(n - 1)
        else:
            return 1

Here, multiplying the numbers sequentially creates a huge number very quickly. This works, but computers are faster when 2 small numbers (120 x 30240) are involved in a multiplication versus the [...]
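The 120 x 30240 example corresponds to splitting 10! into (1 x ... x 5) and (6 x ... x 10). One way to sketch that idea, assuming the point is to keep both operands of each multiplication at a similar size, is a recursive split of the range. The names range_product, factorial_split and log_factorial are hypothetical, and this is not necessarily the approach the full article takes. For the log factorial, the sketch relies on the identity ln(n!) = lgamma(n + 1), using math.lgamma from the standard library.

```python
import math


def range_product(lo, hi):
    """Product of the integers lo..hi (inclusive), split into balanced halves."""
    if lo > hi:
        return 1  # empty range
    if lo == hi:
        return lo
    mid = (lo + hi) // 2
    # Each half yields a number of comparable size before the final multiply.
    return range_product(lo, mid) * range_product(mid + 1, hi)


def factorial_split(n):
    """n! computed as a balanced product of 1..n."""
    return range_product(1, n)


def log_factorial(n):
    """Natural log of n!, without ever building the huge integer."""
    return math.lgamma(n + 1)
```

For n = 10, the top-level split is exactly range_product(1, 5) * range_product(6, 10), that is 120 * 30240. Whether this beats the simple loop depends on n and on the interpreter, so it is worth timing on your own inputs.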

February 22, 2016 | Categories: Performance, Python

What’s the fastest? – R edition

When I started using R, about ten years ago, the community was much smaller. There was no R-bloggers to get inspired by, and no ggplot2 to make nice graphs. It was the beginning of another implementation of R (other than CRAN's), known as Revolution R, from Revolution Analytics. Their R targeted enterprises and was designed to be faster and more scalable. They also offer an open source version of their product called RRO. In April 2015, the company was acquired by Microsoft! May [...]

February 12, 2016 | Categories: Performance, R

[Python] Iterators vs Generators

In Python, there are iterators and generators. You probably already use iterators without even knowing it. But understanding the difference between these two concepts is really important, since choosing one over the other has a huge impact on memory usage. If you are working with small datasets, memory usage might not be your first concern. However, with big datasets, it is another story. So what exactly are iterators and generators?

Iterators

The process of going through [...]
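To give a feel for the memory difference, here is a minimal sketch that compares a fully built list with a generator producing the same values. It assumes the contrast of interest is "everything in memory at once" versus "one value at a time"; the function names and the size of 100,000 are hypothetical choices for illustration.

```python
import sys


def squares_list(n):
    """Build every square up front and keep them all in memory."""
    return [i * i for i in range(n)]


def squares_gen(n):
    """Yield one square at a time; nothing is stored between steps."""
    for i in range(n):
        yield i * i


big_list = squares_list(100_000)
lazy = squares_gen(100_000)

# getsizeof reports the container only (not the integers it refers to),
# which is already enough to show the gap between the two objects.
list_bytes = sys.getsizeof(big_list)
gen_bytes = sys.getsizeof(lazy)

# A generator can be consumed once; summing it never builds a list.
total = sum(lazy)
```

The generator object stays small no matter how large n is, while the list grows with n. The trade-off is that a generator can only be walked through once, so it suits single-pass work such as summing or filtering.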

September 18, 2015 | Categories: Performance, Python