Overfitting and Regularization

This series of articles on machine learning wouldn't be complete without dipping our toes into overfitting and regularization. Overfitting is the Achilles' heel of machine learning. As machine learning techniques become more powerful (that is, as the number of parameters grows), exposure to overfitting increases. An overfit model violates Occam's razor: it becomes so complex that it begins to memorise small, unimportant details of the training set that have no true link to the target. [...]
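To make the idea concrete, here is a minimal sketch (not taken from the article itself) that fits two polynomial models to the same small, noisy sample. The synthetic data, the polynomial degrees and the helper names `make_data` and `mse` are all assumptions chosen for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Fixed seed so the sketch is reproducible.
rng = np.random.default_rng(0)


def make_data(n):
    """Draw n noisy samples from a hypothetical linear relationship y = 2x + 1."""
    x = rng.uniform(-1.0, 1.0, n)
    y = 2.0 * x + 1.0 + rng.normal(0.0, 0.3, n)
    return x, y


def mse(model, x, y):
    """Mean squared error of a fitted polynomial on the points (x, y)."""
    return float(np.mean((model(x) - y) ** 2))


# A small training set and a larger held-out set from the same process.
x_train, y_train = make_data(12)
x_test, y_test = make_data(200)

# A simple model (2 parameters) and a much more flexible one (10 parameters).
simple = Polynomial.fit(x_train, y_train, deg=1)
flexible = Polynomial.fit(x_train, y_train, deg=9)

for name, model in [("degree 1", simple), ("degree 9", flexible)]:
    print(name,
          "train MSE:", round(mse(model, x_train, y_train), 4),
          "test MSE:", round(mse(model, x_test, y_test), 4))
```

The degree 9 fit cannot do worse than the straight line on the training points, because it has strictly more freedom to chase the noise. On held-out points it will typically do worse, which is the signature of overfitting.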

October 30, 2017 | Categories: Data Analysis, Machine learning, Uncategorized

Gradient Descent

Gradient descent is an iterative algorithm that searches for the parameter values of a function of interest that minimize a cost function over a given dataset. It is often used in machine learning to quickly find an approximate solution to complex, multi-variable problems. In my last article, Introduction to Linear Regression, I mentioned gradient descent as a possible solution to simple linear regression. While there exists an optimal analytical solution to simple [...]
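As a rough illustration of the idea, the sketch below applies gradient descent to simple linear regression with a mean squared error cost. It is not the article's own implementation: the function name `gradient_descent`, the learning rate, the iteration count and the toy data are all assumptions.

```python
import numpy as np


def gradient_descent(x, y, learning_rate=0.05, n_iterations=5000):
    """Fit y ~ slope * x + intercept by minimizing the mean squared error."""
    slope, intercept = 0.0, 0.0  # arbitrary starting point
    n = len(x)
    for _ in range(n_iterations):
        # Prediction error for the current parameters.
        residuals = slope * x + intercept - y
        # Partial derivatives of the MSE with respect to each parameter.
        grad_slope = (2.0 / n) * np.dot(residuals, x)
        grad_intercept = (2.0 / n) * residuals.sum()
        # Step against the gradient to reduce the cost.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Noise-free toy data generated from y = 3x + 2 (hypothetical values).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 3.0 * x + 2.0

slope, intercept = gradient_descent(x, y)
print("slope:", round(slope, 4), "intercept:", round(intercept, 4))
```

On this toy data the estimates should settle close to a slope of 3 and an intercept of 2. The learning rate matters: too large a value makes the updates diverge, while too small a value makes convergence slow.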

August 3, 2017 | Categories: Data Analysis, Machine learning, Python, Uncategorized

ggplot2 101: Easy Visualization for Easier Analysis

Biological data are often easier to interpret and analyse when visualized as a plot. A good way of doing so is to exploit the different options of ggplot2, an R plotting system. In the following post, I will present some of my go-to tricks to visualize data: nothing too fancy or too hard, perfect for both R masters and R beginners! The sample code is in R and the ggplot2 library must be installed [...]

May 18, 2017 | Categories: Data Analysis, Data Visualization, R, Uncategorized