# Biology

## Think like a computer

Let's say all your results for a given project are stored in Excel files named exp1.xlsx, exp2_20170708.xlsx, exp_prolif_072017.xlsx... The layout inside exp1.xlsx might make for a user-friendly results file, but it is not a "computer-friendly" file. Let's suppose that you (or your boss) decide that you now need a database instead of the twenty-six different Excel files you have been using to store results. If all your files are similar to exp1.xlsx, you will have to put a [...]

By | 2018-02-08T13:32:14+00:00 February 8, 2018|Categories: Bioinformatics, Biology|1 Comment

## Logistic regression and GTEx

When working with all sorts of data, we sometimes want to predict the value of a variable that is not numerical. In those cases, logistic regression is appropriate. It is similar to linear regression, except that it handles a dependent variable that is categorical. Here is the formula for linear regression, where we want to estimate the parameters beta (coefficients) that best fit our data: Y_i = \beta_0 + \beta_1 X_i [...]
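To make the contrast with linear regression concrete, here is a minimal sketch of how a logistic model turns the same linear predictor, \beta_0 + \beta_1 X_i, into a probability for a two-class outcome. The function names and the coefficient values used below are illustrative assumptions, not taken from the post or fitted to GTEx data.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function: maps any real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(x: float, beta_0: float, beta_1: float) -> float:
    """Probability that the outcome is class 1, given one predictor x."""
    # Same linear predictor as in linear regression, passed through the sigmoid.
    return sigmoid(beta_0 + beta_1 * x)


def predict_class(x: float, beta_0: float, beta_1: float,
                  threshold: float = 0.5) -> int:
    """Turn the probability into a class label using a cutoff."""
    return 1 if predict_probability(x, beta_0, beta_1) >= threshold else 0


# Hypothetical coefficients, chosen only for illustration.
beta_0, beta_1 = -1.0, 1.0
for x in (0.0, 1.0, 2.0):
    p = predict_probability(x, beta_0, beta_1)
    print(f"x={x}: P(class 1)={p:.3f}, predicted class={predict_class(x, beta_0, beta_1)}")
```

In practice the coefficients would be estimated from data, for example by maximum likelihood, rather than set by hand.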

By | 2017-04-29T17:44:14+00:00 January 27, 2017|Categories: Biology, Data Analysis, Python|0 Comments

## Tweaking Fisher’s exact test for biology

Fisher's exact test is widely applied in bioinformatics (it is the core computation in gene-set or pathway enrichment analysis). I won't introduce the test itself, as others have done so several times (here), but will instead point to a disconnect between what it does and what is often needed. In Fisher's exact test, the null hypothesis is that there is no enrichment between the two variables studied. When using this test with large numbers (such as the number of genes [...]
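For readers who want to see the core computation, the one-sided (enrichment) form of Fisher's exact test is the upper tail of a hypergeometric distribution. The sketch below uses only the Python standard library; the function names and the example counts are assumptions made for illustration and are not the adjustment discussed in the post.

```python
from math import comb


def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    """P(exactly k annotated genes in a list of n drawn from N genes, K of them annotated)."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def fisher_enrichment_pvalue(k: int, N: int, K: int, n: int) -> float:
    """One-sided p-value: probability of observing k or more annotated genes by chance."""
    upper = min(K, n)  # the overlap cannot exceed either set size
    return sum(hypergeom_pmf(i, N, K, n) for i in range(k, upper + 1))


# Hypothetical example: 20,000 genes in total, 200 in a pathway,
# 500 in our gene list, 15 of which belong to the pathway.
p = fisher_enrichment_pvalue(k=15, N=20000, K=200, n=500)
print(f"enrichment p-value: {p:.3g}")
```

With gene-scale counts like these, even a modest excess over the expected overlap (here 500 × 200 / 20,000 = 5 genes) can give a very small p-value.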

By | 2017-05-01T10:33:14+00:00 December 8, 2014|Categories: Bioinformatics, Biology, Statistics|0 Comments

## Gene symbols: the challenge

Almost certainly, one day, you'll have in your hands a list of outdated gene symbols. And you'll probably think that updating them is a straightforward task, but it's not that simple! Because there's the word 'bio' in bioinformatician, updating gene symbols reminds me of a futile cycle. According to Wikipedia's definition, a futile cycle occurs when two metabolic pathways run simultaneously in opposite directions and have no overall effect other than to dissipate energy in the form of heat. Updating the [...]

By | 2016-11-08T09:30:17+00:00 September 29, 2014|Categories: Bioinformatics, Biology|0 Comments