Think like a computer

Let's say all your results for a given project are stored in Excel files named exp1.xlsx, exp2_20170708.xlsx, exp_prolif_072017.xlsx... Inside the file exp1.xlsx, you have this: [figure] This might be a user-friendly result file, but it is not a "computer-friendly" file. Let's suppose that you (or your boss) decide that you now need a database instead of the twenty-six different Excel files you have been using to store results. If all your files are similar to exp1.xlsx, you will have to put a [...]

February 8, 2018 | Categories: Bioinformatics, Biology

A multiprocessing example and more

Recently, I had to search for a given chemical structure in a list of structures. Using the Python chemoinformatics packages pybel and rdkit, I was easily able to do so, but the operation took a little too much time for my liking. Wondering how I could search faster, I immediately thought of Jean-Philippe's previous blog post titled Put Those CPUs to Good Use. I decided to follow his instructions and give it a try. Goal: look for a molecule (a given [...]
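The idea in the excerpt above, splitting a list of structures across several CPU cores and searching each part in parallel, can be sketched as follows. This is a minimal illustration and not the code from the post: the helper names (`chunked`, `search_chunk`, `parallel_search`) are hypothetical, and plain string equality stands in for a real structure comparison with pybel or rdkit. The same molecule can be written as several different SMILES strings, so an exact string match is only a placeholder.

```python
from multiprocessing import Pool


def chunked(items, n_chunks):
    """Split items into at most n_chunks contiguous, roughly equal slices."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def search_chunk(args):
    """Return the (index, structure) pairs of one chunk that equal the query."""
    query, offset, chunk = args
    # Assumption: string equality is a stand-in for a real molecule comparison.
    return [(offset + i, s) for i, s in enumerate(chunk) if s == query]


def parallel_search(query, structures, n_workers=4):
    """Search every chunk in its own worker process and merge the hits."""
    tasks = []
    offset = 0
    for chunk in chunked(structures, n_workers):
        tasks.append((query, offset, chunk))
        offset += len(chunk)
    with Pool(processes=n_workers) as pool:
        results = pool.map(search_chunk, tasks)  # keeps the order of the tasks
    return [hit for part in results for hit in part]


if __name__ == "__main__":
    # Hypothetical toy library of SMILES strings.
    library = ["CCO", "c1ccccc1", "CC(=O)O", "CCO", "CCN"]
    print(parallel_search("CCO", library, n_workers=2))
```

Each worker receives its chunk together with the chunk's starting offset, so the indices it returns refer to positions in the original list. For a list this small, the cost of starting processes outweighs the gain; the approach only pays off when each comparison is expensive or the list is long.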

December 11, 2017 | Categories: Bioinformatics, Computer science, Performance

Overfitting and Regularization

This series of articles on machine learning wouldn't be complete without dipping our toes into overfitting and regularization. Overfitting: the Achilles' heel of machine learning is overfitting. As machine learning techniques become more and more powerful (with a large number of parameters), exposure to overfitting increases. An overfit model violates Occam's razor: it becomes so complex that it begins to memorise small, unimportant details of the training set (details with no true link to our target). [...]
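The memorisation described in the excerpt above can be illustrated with a small sketch. Everything here is an assumption made for illustration, not material from the post: the synthetic data follow a hypothetical relation y = 2x plus Gaussian noise, the simple model is a one-parameter line through the origin, and the "memorizer" is a nearest-neighbour lookup that returns the stored answer of the closest training point.

```python
import random


def make_data(n, rng):
    """Sample noisy points around the hypothetical true relation y = 2x."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    return [(x, 2.0 * x + rng.gauss(0.0, 1.0)) for x in xs]


def fit_line(train):
    """Least-squares slope of a line through the origin (one parameter)."""
    sxy = sum(x * y for x, y in train)
    sxx = sum(x * x for x, _ in train)
    slope = sxy / sxx
    return lambda x: slope * x


def fit_memorizer(train):
    """1-nearest-neighbour lookup: return the y of the closest training x."""
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]


def mse(model, data):
    """Mean squared error of a model over (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable
train, test = make_data(30, rng), make_data(30, rng)
line, memo = fit_line(train), fit_memorizer(train)

print("line      train/test MSE:", mse(line, train), mse(line, test))
print("memorizer train/test MSE:", mse(memo, train), mse(memo, test))
```

The memorizer reproduces every training point exactly, noise included, so its training error is zero while its error on fresh data is not. That gap between training and test error is the usual symptom of overfitting; the one-parameter line cannot memorise the noise and its two errors tend to stay close to each other.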

October 30, 2017 | Categories: Data Analysis, Machine learning, Uncategorized

Let it roam free! Releasing your code into the wild…

Today, I thought I'd do something a little different and talk about what one might expect from publicly releasing some code. I figured it might be nice to interview someone from our group who has lots of experience doing so, Tariq Daouda, to gain some of his insights. So without further ado, here we go! JP: Hi Tariq, glad to have you with us. I thought I might ask you a few questions regarding what happens when one decides [...]

October 16, 2017 | Categories: Computer science

A Week of Deep Learning

From August 21 to 25, IVADO and the MILA held the first edition of their École d'été francophone en apprentissage profond. The aim of this summer school was to "give [the participants] the theoretical and practical basis for understanding [deep learning]". A few members of the platform and I took part in these five days of training. I must be honest, I was a little afraid of deep learning the first time it was presented to me. I found the concept [...]

September 22, 2017 | Categories: Computer science