boucherg

About Geneviève

I started in biochemistry, but it is as a bioinformatician that I’ve been having fun for several years now, whether doing data analysis and visualization in R, building interactive web interfaces in JavaScript, or exploring machine learning in Python.

Kaplan-Meier plot

When working with cancer datasets, one of the goals is sometimes to find features (mutations, clinical information, gene expression, ...) associated with prognosis, i.e. features related to the probable outcome of the disease. If that's one of your goals, you'll have to do a survival analysis. Survival analysis involves a set of methods to model the time at which an event of interest occurs, that event often being death. But really, any event for which the time of occurrence is [...]
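As a rough illustration of what "modeling the time at which an event occurs" can look like, here is a minimal sketch of the Kaplan-Meier estimator in plain Python. It is not taken from the post itself: the function name `kaplan_meier` and the inputs `durations` and `events` are hypothetical, and the sketch assumes each subject has one follow-up time and a flag that is 1 when the event was observed and 0 when the subject was censored.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time of each subject (hypothetical input).
    events: 1 if the event was observed at that time, 0 if censored.
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")

    # Count observed events at each distinct time; censored subjects are
    # skipped here but still count as "at risk" below.
    deaths = Counter(t for t, e in zip(durations, events) if e)

    survival = 1.0
    curve = []
    for t in sorted(deaths):
        # Subjects still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Product-limit step: multiply by the fraction surviving past t.
        survival *= 1 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve


if __name__ == "__main__":
    # Toy data, made up for illustration only.
    for time, prob in kaplan_meier([5, 8, 8, 12, 15], [1, 1, 0, 1, 0]):
        print(time, round(prob, 3))
```

Plotting the returned points as a step function gives the familiar Kaplan-Meier curve. For real analyses, a dedicated survival package is a better choice, since it also handles confidence intervals and group comparisons.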

February 19, 2015 | Categories: Data Analysis, Statistics

One task, three ways

Usually, there is more than one way to accomplish a task. Some are better, some are worse and others are just as good. Choosing which one to use often comes down to computing time, ease of use, and personal preferences and abilities. Say I have a matrix of thousands of chromosomal features with the following column names: Feature, Start, End. All the positions are found on the same chromosome and the widths of my features are variable. [...]

January 15, 2015 | Categories: Bioinformatics, R

Best practices in data visualization

Sébastien's last post presented a hard-to-understand graph. The Venn diagram with four sets is a good example of visualization gone wrong. Good practices in data visualization are a hot topic right now, not just in science but in multiple areas such as journalism and business intelligence. Indeed, the crowd was quite heterogeneous at the first Visualisation Montréal meeting in August, where more than 100 people showed up! And the free ebook that was launched at the meeting targets beginners from all fields. [...]

October 31, 2014 | Categories: Data Visualization

Gene symbols: the challenge

Almost certainly, one day, you'll have a list of outdated gene symbols on your hands. And you'll probably think that updating them is a straightforward task, but it's not that simple! Because there's the word 'bio' in bioinformatician, updating gene symbols reminds me of the futile cycle. According to Wikipedia's definition, a futile cycle occurs when two metabolic pathways run simultaneously in opposite directions and have no overall effect other than to dissipate energy in the form of heat. Updating the [...]

September 29, 2014 | Categories: Bioinformatics, Biology

RStudio and version control

Version control is just a way to keep track of changes made to files over time. It allows you to return to previous versions later. I bet you are already using it without even knowing it! When you copy a file or a script before modifying it, you're using version control. However, your manual version control may become hard to deal with at some point. That's why it's worth investing time early on in a project and using a [...]

June 10, 2014 | Categories: R