Kaplan-Meier plot

When working with cancer datasets, one of the goals is sometimes to find features (mutations, clinical information, gene expression, ...) associated with prognosis, i.e. features related to the probable outcome of the disease. If that's one of your goals, you'll have to do a survival analysis. Survival analysis involves a set of methods to model the time at which an event of interest occurs, that event often being death. But really, any event for which the time of occurrence is [...]

February 19, 2015 | Categories: Data Analysis, Statistics | 0 Comments

Table-reading: loading data into R without a hassle

The first thing I learned in R was how to load a table. Usually, when you start your R journey, someone more knowledgeable will tell you how to do this very first action. It will typically be: data <- read.table("~/SomeFolder/datafile.txt"). You will probably add various parameters inside the brackets, such as "row.names=0", "header=TRUE" or "sep="\t"", to make sure you are reading your file correctly. And this is perfectly fine as a method for loading small datasets. However, to maximize [...]

February 5, 2015 | Categories: Bioinformatics, R | 1 Comment

Client-side storage on the web

Web applications can provide users with cross-platform tools that are easy to maintain and update. It is therefore little wonder that bioinformatics tools are often published as web applications. However, legal and computer security considerations can arise when operating on certain types of data (e.g. medical or proprietary). In such cases, it may be preferable to store some of this data locally in the client's browser. Local data storage options are plentiful but can quickly become a little disorienting. Here's a small rundown [...]

January 28, 2015 | Categories: Computer science | 0 Comments

One task, three ways

Usually, there is more than one way to accomplish a task. Some are better, some are worse and others are just as good. Choosing which one to use often depends on computing time, ease of use and/or personal preferences and abilities. Say I have a matrix of thousands of chromosomal features with the following column names: Feature, Start, End. All the positions are found on the same chromosome and the widths of my features are variable. [...]

January 15, 2015 | Categories: Bioinformatics, R | 0 Comments

Tweaking Fisher’s exact test for biology

Fisher's exact test is widely applied in bioinformatics (it is the core computation in gene-set or pathway enrichment analysis). I won't introduce the test itself, as others have done so several times (here), but will rather point to a disconnect between what it does and what is often needed. In Fisher's exact test, the null hypothesis is that the two variables studied are independent, i.e. that there is no enrichment. When using this test with large numbers (such as the number of genes [...]

December 8, 2014 | Categories: Bioinformatics, Biology, Statistics | 0 Comments