Logistic regression and GTEx

When working with all sorts of data, we sometimes want to predict the value of a variable that is not numerical. In those cases, logistic regression is appropriate. It is similar to linear regression, except that it handles a categorical dependent variable. Here is the formula for linear regression, where we want to estimate the parameters beta (the coefficients) that best fit our data: \begin{equation} Y_i = \beta_0 + \beta_1 X_i [...]

2017-04-29T17:44:14+00:00 | January 27, 2017 | Categories: Biology, Data Analysis, Python
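
As a rough illustration of the idea in the logistic regression excerpt above: the linear predictor \beta_0 + \beta_1 X_i can be passed through the logistic (sigmoid) function to obtain a probability between 0 and 1 for a binary outcome. The excerpt is truncated before it reaches this step, so the sketch below assumes the standard one-predictor logistic model; the function names and coefficient values are hypothetical, not taken from the post.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, beta0, beta1):
    """Probability of the positive class for a single predictor x.

    beta0 and beta1 play the same role as in the linear regression formula;
    here they are hypothetical values, not fitted coefficients.
    """
    return sigmoid(beta0 + beta1 * x)


# With beta0 = 0 and beta1 = 1, x = 0 sits exactly on the decision boundary.
print(predict_proba(0.0, beta0=0.0, beta1=1.0))  # 0.5
```

In practice the coefficients would be estimated from data, for example by maximum likelihood, rather than chosen by hand.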

A JavaScript implementation of the non-central version of Fisher’s exact test

In a previous post, I presented a case for choosing a non-central version of Fisher's exact test for most uses of this test in bioinformatics. I will now present an implementation of this test in JavaScript that could easily be embedded in web interfaces. Although JavaScript is probably the least likely language in which to implement statistical methods, I hope this article fills in enough detail to make it trivial to port to other languages if the need arises. At [...]

2017-04-29T17:47:57+00:00 | January 13, 2017 | Categories: Data Analysis

SNP Filtering with pyGeno

Looking over the contents of our growing blog (good job, guys!), it occurred to me that we had not yet posted an article about the fantastic (and homegrown!) bioinformatics resource that is pyGeno. It turns out I need to use pyGeno to generate data, and it's also my turn to write a blog post. How convenient! I'll focus the article on writing a SNP filter, which can be a bit surprising the first time you try [...]

2017-04-29T17:57:51+00:00 | December 9, 2016 | Categories: Bioinformatics, Python

Introduction to cowplot: combining several plots into one with R

Hi everyone, today I will introduce cowplot, an extension of the ggplot2 library. Its package description reads: "Some helpful extensions and modifications to the 'ggplot2' package. In particular, this package makes it easy to combine multiple 'ggplot2' plots into one and label them with letters, e.g. A, B, C, etc., as is often required for scientific publications." As you can see, this library makes it easy to create a figure containing multiple plots. But we will see how we can use it to create [...]

2017-04-29T16:22:55+00:00 | November 28, 2016 | Categories: Data Visualization, R

Pivoting tables: from long to wide

As bioinformaticians, we often have to work with data that are not formatted the way we need them to be. One case we might encounter is receiving data in a "long" format instead of the more familiar "wide" format. Those of you familiar with the ggplot R package know the long format very well: it's the format ggplot requires to produce its nice graphs.

Long

  genes samples expression
1 BAD   S01     7.525395
2 [...]

2017-04-29T18:11:56+00:00 | November 14, 2016 | Categories: Data Analysis, Python, R
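
As a minimal sketch of the long-to-wide pivot described in the excerpt above, the snippet below regroups (gene, sample, expression) records into one row per gene and one column per sample, using only the Python standard library. The first record echoes the row shown in the excerpt; the other values, the second gene and the helper name `long_to_wide` are made up for illustration and are not the post's own code.

```python
from collections import defaultdict

# Long format: one (gene, sample, expression) record per row.
# Only the first record comes from the excerpt; the rest are invented values.
long_rows = [
    ("BAD", "S01", 7.525395),
    ("BAD", "S02", 6.1),
    ("TP53", "S01", 9.3),
    ("TP53", "S02", 8.7),
]


def long_to_wide(rows):
    """Pivot (gene, sample, value) rows into a {gene: {sample: value}} mapping."""
    wide = defaultdict(dict)
    for gene, sample, value in rows:
        wide[gene][sample] = value
    return dict(wide)


wide = long_to_wide(long_rows)
samples = sorted({sample for _, sample, _ in long_rows})

# Print the wide table: one row per gene, one column per sample.
print("gene", *samples, sep="\t")
for gene in sorted(wide):
    print(gene, *(wide[gene].get(s, "NA") for s in samples), sep="\t")
```

With pandas installed, `DataFrame.pivot(index=..., columns=..., values=...)` performs the same reshaping on a data frame.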