Bioinformatics

Beginner R: functions that make your life easier

Let’s get to know my top 10 neat little R functions and tricks that make life easier when manipulating data in R. Sequences Want to make long sequences of numbers or letters but don’t feel like writing them all out into a vector? R lets you make a sequence of numbers with “:”. You can also use seq() if you are looking for a regular sequence that is not incremented by one. letters[] lets you make continuous letter sequences, [...]

January 28, 2016 | Categories: Bioinformatics, R

Generating Synthetic Genomic Data

Applying statistical methods is a large part of the work of a bioinformatician. Beyond the more classical techniques, machine learning algorithms (notably clustering techniques such as k-means) are also regularly applied to clinical and biological data. Some techniques, such as artificial neural networks, have recently found great success in areas like image recognition and natural language processing. However, these techniques do not perform as well on small datasets with high dimensionality, a problem known as "the curse of dimensionality". [...]

January 7, 2016 | Categories: Bioinformatics, Data Analysis, Python

Grep parameters every bioinformatician should know

Your shell, along with the myriad command line programs it exposes, is clearly a great friend when it comes to file manipulation. And let's face it, file manipulation is a big part of a bioinformatician's daily workload. Now, since we rarely have the time to review all the options offered by the different programs, I thought I'd list some really useful ones from grep. I expect everyone to know what grep is and what it does so let's just get [...]

November 27, 2015 | Categories: Bioinformatics, Data Analysis, Shell scripting

[Python] Iterators vs Generators

In Python, there are iterators and generators. You probably already use iterators without even knowing it. But understanding the difference between these two concepts is really important, since choosing one over the other can have a huge impact on memory usage. If you are working with small datasets, memory usage might not be your first concern. With big datasets, however, it is another story. So what exactly are iterators and generators? Iterators The process of going through [...]

September 18, 2015 | Categories: Bioinformatics, Performance, Python, Uncategorized
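
The memory point made in the iterators vs generators excerpt above can be illustrated with a minimal sketch. This is not code from the original post: the function names `squares_list` and `squares_gen` and the value of `n` are hypothetical, chosen only for illustration. It compares a fully built list with a generator that produces the same values lazily.

```python
import sys


def squares_list(n):
    """Build every value up front and return them all in a list."""
    return [i * i for i in range(n)]


def squares_gen(n):
    """Yield values one at a time; earlier values are not kept around."""
    for i in range(n):
        yield i * i


n = 100_000  # hypothetical dataset size, for illustration only
as_list = squares_list(n)
as_gen = squares_gen(n)

# sys.getsizeof measures the container object itself, not the integers
# it references, so the real gap in memory use is even larger than this.
list_bytes = sys.getsizeof(as_list)
gen_bytes = sys.getsizeof(as_gen)
print(f"list: {list_bytes} bytes, generator: {gen_bytes} bytes")

# Both can be consumed the same way and give the same result...
total_from_list = sum(as_list)
total_from_gen = sum(as_gen)

# ...but a generator can only be consumed once: it is now exhausted.
leftover = next(as_gen, None)
```

The size of the list grows with `n`, while the generator object stays small regardless of how many values it will produce. The trade-off is that the generator is single-use, whereas the list can be traversed repeatedly.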

Draw me a Circos

How pretty would that look in my article? Very pretty! As well as being informative! You might want to use a Circos for your own personal analysis or as an article figure. In both cases, this kind of representation is useful when it comes to visualizing data in a more global or complete manner: you can have multiple types of data ranging across various chromosomal sequences. However, as wonderful and exciting as the idea of having your own personal Circos might [...]

August 20, 2015 | Categories: Bioinformatics, Biology, Data Visualisation