ggplot2 101: Easy Visualization for Easier Analysis

Biological data are often easier to interpret and analyse when we can visualize them as plots. A good way of doing so is to exploit the different options of ggplot2, an R plotting system. In the following post, I will present some of my go-to tricks to visualize data: nothing too fancy or too hard, perfect for both R masters and R beginners! The sample code is in R and the ggplot2 library must be installed [...]

By | 2017-05-19T15:08:52+00:00 May 18, 2017|Categories: Data Analysis, Data Visualization, R, Uncategorized|0 Comments

Let Your Data Flow: Streams and Reactive Programming

What's all this about? ReactiveX is a combination of the best ideas from the Observer pattern, the Iterator pattern, and functional programming. Using Rx, you can easily:

- Create event- or data-emitting streams from sources such as a file or a web service
- Compose and transform streams with query-like operators
- Subscribe to any observable stream and "react" to its emissions to perform side effects

Reactive programming has been gaining traction these past few years. Maybe you've [...]

By | 2017-05-03T09:19:14+00:00 May 2, 2017|Categories: Bioinformatics, Computer science, Data Analysis|2 Comments

Big data, big challenge

You've probably heard the expression "Big Data" before, particularly if you've read Simon Mathien's blog post on IRIC's website. (If you haven't read it yet, you should do it now!) There are several definitions (or interpretations) of this expression, which are best summarized by the following two: "Data of a very large size, typically to the extent that its manipulation and management present significant logistical challenges; (also) the branch of computing involving such data" (Oxford English Dictionary). Technological field dedicated [...]

By | 2017-05-02T21:05:43+00:00 April 24, 2017|Categories: Data Analysis|3 Comments

Create a nice looking table using R

Hi everyone, today I will introduce formattable. This package is designed for applying formatting to vectors and data frames to make data presentation easier, richer, and more flexible, and hopefully to convey more information. We will see how to use this package to interpret your data at a glance, with just a few lines of code (you can follow along below and check all the code in my git). Before going further, I will specify that this package is generally used [...]

By | 2017-04-29T16:21:40+00:00 March 30, 2017|Categories: Data Visualization, R|0 Comments

Introduction to Linear Regression

A data scientist's first goal is to find underlying relations among the variables of a dataset. Several statistical and machine learning methods can be used to discover such relations. Once uncovered, this information can be applied to everyday problems. For example, in clinical medicine, a predictive model based on clinical data can help clinicians guide a patient's treatment by offering insights that might not have otherwise been taken into account.

Simple linear regression

One of the most basic methods available to [...]