Python and pandas

R is undeniably a must-use language, especially for data visualization. But R can be a little slow when dealing with big datasets. If you don't need to create awesome graphs or don't have time to wait, Python offers an alternative that can be quite fast for data manipulation. The Python Data Analysis Library, pandas, provides an easy way to manipulate data in Python. Recently, I had to deal with a big gene expression file (21024 genes x [...]

April 17, 2014 | Categories: Python | 1 Comment

What’s the fastest?

Often, we rely on our old habits. We get comfortable and tend to do things the same old way. The same thing happens when you're programming. But a day will come when you'll ask yourself: is this the fastest way to perform this task? And when this happens to you (and if the given task is in Python), you'll be glad that a module like timeit exists. Sure, there are other ways to organize a timing contest in Python. [...]
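As a minimal sketch of the kind of timing contest timeit makes possible, the snippet below compares two ways of building a string. The two contestants (`join_with_loop` and `join_with_str_join`) and the repeat counts are assumptions chosen for illustration, not the benchmark from the original post.

```python
import timeit


def join_with_loop(n=1000):
    """Build a string by repeated concatenation (the 'old habit')."""
    result = ""
    for i in range(n):
        result += str(i)
    return result


def join_with_str_join(n=1000):
    """Build the same string with a single str.join call."""
    return "".join(str(i) for i in range(n))


# Both contestants must produce the same output before timing them.
assert join_with_loop() == join_with_str_join()

# timeit.repeat accepts a callable; each of the 5 runs calls it 200 times.
# Taking the minimum reduces noise from other processes on the machine.
timings = {
    name: min(timeit.repeat(func, number=200, repeat=5))
    for name, func in (
        ("loop", join_with_loop),
        ("str.join", join_with_str_join),
    )
}

# Print the contestants from fastest to slowest.
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name:>10}: {seconds:.4f} s for 200 calls")
```

The actual numbers depend on the machine and the Python version, so treat the ranking as something to measure rather than assume.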

April 2, 2014 | Categories: Performance, Python | 0 Comments

lifelines (or doing survival analysis in Python)

Lately, I've been doing survival analysis. I'm not an expert, but we had a self-learning group based on David G. Kleinbaum and Mitchel Klein's book, "Survival Analysis: A Self-Learning Text". At the end of this book, there's code provided to help you get started in SAS, Stata, SPSS and... R! I've played with the R package survival, which is quite good! My problem was that I wanted to do survival analysis in Python. I started by doing it with [...]

March 24, 2014 | Categories: Python, Statistics | 0 Comments