Data Analysis

Create a nice looking table using R

Hi everyone, today I will introduce formattable. This package is designed for applying formatting to vectors and data frames to make data presentation easier, richer, and more flexible, and hopefully to convey more information. We will see how to use this package to interpret your data at a glance with just a few lines of code (you can follow along below as well as check all the code in my Git repository). Before going further, I will specify that this package is generally used [...]

Introduction to Linear Regression

A data scientist's first goal is to find underlying relations among the variables of a dataset. Several statistical and machine learning methods can be used to discover such relations. Once uncovered, this information can be applied to everyday problems. For example, in clinical medicine, a predictive model based on clinical data can help clinicians guide a patient's treatment by offering insights that might not otherwise have been taken into account. Simple linear regression: One of the most basic methods available to [...]
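To make the idea of a simple linear regression concrete, here is a minimal sketch in Python. It assumes an ordinary least-squares fit of a line y = b0 + b1 * x, which the excerpt above does not state explicitly. The function name and the data are hypothetical and only serve as illustration.

```python
from statistics import mean


def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2 points)")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Made-up data, for illustration only.
predictor = [1.0, 2.0, 3.0, 4.0, 5.0]
outcome = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = fit_simple_linear_regression(predictor, outcome)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
```

The slope is the ratio of the covariance of x and y to the variance of x, and the intercept is chosen so that the line passes through the point of means.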

Chemical Screen: Evaluating drug sensitivity

The study of the cellular response to a chemical compound is crucial to the development of new therapeutic drugs. Such an analysis is usually done with a screening experiment in which disease-specific cells (such as primary leukemia cells) are exposed to a chemical compound at different concentrations. The response of these cells, in the form of sensitivity, is conventionally quantified by the IC50 or the EC50. Here are some notions to keep in mind when analyzing these values. IC50/EC50: estimate of [...]
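As a rough illustration of how an IC50 relates to a dose-response curve, here is a small Python sketch. It assumes a four-parameter logistic (Hill) model, a common choice for such curves but not one confirmed by the excerpt above. All names and parameter values are hypothetical.

```python
def hill_response(conc, bottom, top, ic50, hill_slope):
    """Response at concentration `conc` under a four-parameter logistic model.

    Assumption: the response decreases from `top` (no compound) toward
    `bottom` (saturating concentration) and `ic50` is the concentration at
    which the response is halfway between the two plateaus.
    """
    if conc < 0:
        raise ValueError("concentration must be non-negative")
    if ic50 <= 0 or hill_slope <= 0:
        raise ValueError("ic50 and hill_slope must be positive")
    return bottom + (top - bottom) / (1.0 + (conc / ic50) ** hill_slope)


# Hypothetical parameters: viability from 100% down to 10%, IC50 of 2.0 (arbitrary units).
for conc in [0.0, 0.5, 2.0, 8.0, 32.0]:
    viability = hill_response(conc, bottom=10.0, top=100.0, ic50=2.0, hill_slope=1.5)
    print(f"conc = {conc:5.1f} -> response = {viability:6.2f}")
```

In this model the response at a concentration equal to the IC50 is exactly the midpoint of the two plateaus, which is why the estimated IC50 depends on how well the top and bottom of the curve are captured by the tested concentrations.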

February 13, 2017 | Categories: Data Analysis, Uncategorized

Logistic regression and GTEx

When working with all sorts of data, we sometimes want to predict the value of a variable that is not numerical. In those cases, a logistic regression is appropriate. It is similar to a linear regression, except that it handles a categorical dependent variable. Here is the formula for the linear regression, where we want to estimate the parameters beta (coefficients) that best fit our data: \begin{equation} Y_i = \beta_0 + \beta_1 X_i [...]
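To show how the linear predictor beta_0 + beta_1 * X_i becomes a probability in a logistic regression, here is a minimal Python sketch. It assumes a binary dependent variable coded 0/1, a single predictor, and a plain gradient-descent fit of the log-loss. None of these choices come from the post itself, and the function names and data are hypothetical.

```python
import math


def predict_probability(x, beta0, beta1):
    """P(Y = 1 | x): the logistic (sigmoid) function of beta0 + beta1 * x."""
    z = beta0 + beta1 * x
    # Two branches avoid overflow in exp() for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic_regression(xs, ys, learning_rate=0.1, n_iter=5000):
    """Estimate (beta0, beta1) by gradient descent on the mean log-loss."""
    beta0 = beta1 = 0.0
    n = len(xs)
    for _ in range(n_iter):
        grad0 = grad1 = 0.0
        for x, y in zip(xs, ys):
            error = predict_probability(x, beta0, beta1) - y
            grad0 += error
            grad1 += error * x
        beta0 -= learning_rate * grad0 / n
        beta1 -= learning_rate * grad1 / n
    return beta0, beta1


# Made-up data: one numerical predictor and a binary outcome.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 1, 0, 1, 1]
beta0, beta1 = fit_logistic_regression(xs, ys)
print(f"beta0 = {beta0:.3f}, beta1 = {beta1:.3f}")
print(f"P(Y = 1 | x = 5) = {predict_probability(5.0, beta0, beta1):.3f}")
```

In practice a statistics library would be used for the fit; the loop is written out here only to make the link between the coefficients and the predicted probability visible.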

January 27, 2017 | Categories: Bioinformatics, Data Analysis, Python

Introduction to cowplot to combine several plots in one with R

Hi everyone, today I will introduce cowplot, an extension of the ggplot2 library. It provides some helpful extensions and modifications to the 'ggplot2' package. In particular, this package makes it easy to combine multiple 'ggplot2' plots into one and label them with letters, e.g. A, B, C, etc., as is often required for scientific publications. As you can see, this library can be useful to easily create a figure containing multiple plots. But we will see how we can use it to create [...]

November 28, 2016 | Categories: Bioinformatics, Biology, Data Analysis, Data Visualisation, R