Bioinformatics

SciPy and Logistic Regressions

Given a set of data points, we often want to know whether a satisfactory relationship exists between them. Linear regressions can easily be visualized with Seaborn, a Python library that is meant for exploration and visualization rather than statistical analysis. As for logistic regressions, SciPy is a good tool when you do not have your own analysis script. Let's look at the optimize package: from scipy.optimize import [...]
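The excerpt is cut off before it shows what is imported from scipy.optimize, so the following is only a minimal sketch of one way to fit a logistic curve with that package. The use of curve_fit, the three-parameter logistic() function, and the synthetic data are all assumptions made for illustration, not necessarily what the full article does.

```python
import numpy as np
from scipy.optimize import curve_fit


def logistic(x, L, k, x0):
    """Three-parameter logistic curve (assumed form): L is the upper
    asymptote, k the steepness, x0 the midpoint."""
    return L / (1.0 + np.exp(-k * (x - x0)))


# Synthetic, noise-free data generated from known parameters (illustration only).
x = np.linspace(0.0, 10.0, 50)
true_params = (1.0, 1.5, 5.0)
y = logistic(x, *true_params)

# p0 is a rough starting guess; curve_fit refines it by non-linear least squares.
fitted_params, covariance = curve_fit(logistic, x, y, p0=(1.0, 1.0, 4.0))
print(fitted_params)
```

With real measurements, the starting guess p0 usually matters more than it does here, and the diagonal of the returned covariance matrix gives a rough idea of how well each parameter is constrained.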

June 9, 2016 | Categories: Bioinformatics, Data Analysis, Data Visualization, Python

The language(s) of bioinformatics

The question I get most often about bioinformatics is, unfortunately, the one that leads to the least productive discussions I've participated in: which programming language should I use for bioinformatics? Don't get me wrong, in a pub, over a beer, this can lead to some lively entertainment among the nerd intelligentsia... but rarely does it lead to enlightenment that persists in the morning! Here, I'd like to share the current answer I have honed over the past few years. It is based [...]

April 18, 2016 | Categories: Bioinformatics

Parallelize your Python!

This article explains what multithreading and multicore processing are, and in what circumstances each can be used. Does your nerd friend keep talking about his occupational habit of wanting to parallelize and optimize his time? Do you wish to understand it as well, and save time by parallelizing your Python programs? Then this article is what you need! You will be able to save a lot of time, thanks to a small dose of parallelism [...]
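As a rough illustration of the thread-based side of that distinction, here is a minimal sketch using the standard library's concurrent.futures module. The fake_download() task and the run_with_threads() helper are hypothetical names invented for this example; the full article may well use a different approach.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(item):
    """Hypothetical stand-in for an I/O-bound task: it waits, then returns."""
    time.sleep(0.1)  # simulates waiting on a network or disk
    return item * 2


def run_with_threads(items, workers=4):
    """Run fake_download on every item using a pool of threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the inputs
        return list(pool.map(fake_download, items))


start = time.perf_counter()
results = run_with_threads([1, 2, 3, 4])
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s")
```

Threads suit tasks like this one that mostly wait. For CPU-bound work in CPython, threads are held back by the global interpreter lock, and ProcessPoolExecutor from the same module offers the same map() interface while spreading the work over several cores.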

March 29, 2016 | Categories: Bioinformatics, Performance, Python

Factorial and Log Factorial

Factorial: When you need to calculate n!, you have several solutions. The "rush" solution: using a loop or a recursive function:

    def factorial_for(n):
        r = 1
        for i in range(2, n + 1):
            r *= i
        return r

    def factorial_rec(n):
        if n > 1:
            return n * factorial_rec(n - 1)
        else:
            return 1

Here, the multiplication of the numbers sequentially will create a huge number very quickly. This is good, but computers are faster when 2 small numbers (120 x 30240) are involved in a multiplication versus the [...]
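The title also mentions the log factorial, but the excerpt stops before reaching it. One common way to get ln(n!) without ever building the huge integer is the standard library's math.lgamma, using the identity n! = Gamma(n + 1). The sketch below assumes that approach; log_factorial() is a name chosen here for illustration and may not match the article.

```python
import math


def log_factorial(n):
    """Return ln(n!) via the identity n! = Gamma(n + 1)."""
    return math.lgamma(n + 1)


# For small n, this agrees with taking the log of the exact factorial.
exact = math.log(math.factorial(20))
approx = log_factorial(20)
print(exact, approx)

# For large n, the result stays a modest float even though n! itself
# has tens of thousands of digits.
print(log_factorial(10000))
```

Working in log space is handy when factorials only appear inside ratios, such as binomial coefficients, because the logs can be added and subtracted before exponentiating once at the end.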

February 22, 2016 | Categories: Bioinformatics, Data Analysis, Performance, Python

Beginner R: functions that make your life easier

Let’s get to know my top 10 neat little R functions and tricks that make life easier when manipulating data in R. Sequences: Want to make long sequences of numbers or letters but don’t feel like writing them all out into a vector? R lets you make a sequence with “:” for numbers. You can also use seq() if you are looking for a regular sequence that is not incremented by one. letters[] lets you make continuous letter sequences, [...]

January 28, 2016 | Categories: Bioinformatics, R