Shell scripting

Realize your Bash potential

A bioinformatician's best tool is his shell. While some have already mastered the dark arts of the Bash shell, I often see beginners (and even catch myself at times!) unknowingly repeating key sequences when they could get the same result with a few simple built-in keybindings or programmatic shortcuts. Let's have a look at some of the most useful Bash shortcuts that no self-respecting bioinformatician should be without. This is by no means an exhaustive list of what Bash has to offer but will hopefully serve to save [...]

2017-04-29T22:57:32+00:00 | May 26, 2016 | Categories: Computer science, Shell scripting

Grep parameters every bioinformatician should know

Your shell, along with the myriad command-line programs it exposes, is clearly a great friend when it comes to file manipulation. And let's face it, file manipulation is a big part of a bioinformatician's daily workload. Now, since we rarely have the time to review all the options offered by the different programs, I thought I'd list some really useful ones from grep. I expect everyone to know what grep is and what it does, so let's just get [...]

2017-04-29T15:35:48+00:00 | November 27, 2015 | Categories: Data Analysis, Shell scripting
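
The excerpt above is cut off before it names any options, so the following is only a sketch of a few standard grep flags that tend to be handy on sequence files, not necessarily the ones the post goes on to cover. The FASTA content and variable names are made up for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical three-record FASTA file, created only for this example.
fasta="$(mktemp)"
printf '>seq1 human\nACGTACGT\n>seq2 mouse\nGGGTTTAA\n>seq3 human\nTTTTACGT\n' > "$fasta"

# -c: print the number of matching lines (here, FASTA headers).
n_seqs="$(grep -c '^>' "$fasta")"

# -v inverts the match; combined with -c it counts the non-header lines.
n_seq_lines="$(grep -vc '^>' "$fasta")"

# -m 1: stop reading after the first match, useful on very large files.
first_header="$(grep -m 1 '^>' "$fasta")"

# -A 1: also print one line after each match (each header plus its sequence).
grep -A 1 'human' "$fasta"

echo "sequences: $n_seqs, sequence lines: $n_seq_lines, first: $first_header"
```

Note that `-A 1` only pairs a header with its full sequence when each sequence sits on a single line, as in this toy file.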

Working with large files

When dealing with Next Generation Sequencing data, I am routinely asked by clients how to open sequence files. The answer is that, given their huge size (often many millions of lines) and the consequent memory requirement, they should probably not be opened at all; they should only be processed. Most software designed to work with NGS data will then process these files in a sequential fashion or stream, loading just the required amount of data from disk, processing it [...]

2017-04-30T10:19:35+00:00 | October 1, 2015 | Categories: Data Analysis, Shell scripting
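
The streaming idea in the excerpt above can be sketched in a few lines of Bash. This is a minimal illustration under stated assumptions, not the post's own code: the tiny FASTQ file stands in for a real multi-gigabyte one, and `count_reads` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical two-read FASTQ file standing in for a very large one.
fastq="$(mktemp)"
printf '@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\nIIII\n' > "$fastq"

# Peek at the first record only: head stops after four lines
# instead of loading the whole file.
head -n 4 "$fastq"

# Count records as a stream: awk reads one line at a time, so memory use
# stays flat regardless of file size. Assumes four lines per FASTQ record.
count_reads() {
    awk 'END { print NR / 4 }' "$1"
}

n_reads="$(count_reads "$fastq")"
echo "reads: $n_reads"
```

For gzip-compressed files, the same pipeline can be fed from `gzip -dc file.fastq.gz`, which decompresses to standard output without writing an uncompressed copy to disk.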