sandbox.bio provides interactive tutorials for exploring bioinformatics command-line tools within a web browser.
When people ask “what is bioinformatics” I have to pause. When I named a computer “bioinformatics” at the dawn of the Internet Age (1994), it made sense to me based on the French language: “informatique” is the French word for “the science of information”, which roughly corresponds to “Computer Science” in English. The prefix “bio” simply meant that the methods were applied to Biology. A good review of the genesis of this word can be found in the “informatics” article on Wikipedia, which cites the German Informatik in 1957 and the French informatique in 1962. The word apparently existed in the English-speaking world as well, and some credit a Dutch paper from 1970 with the addition of “bio”. The article also details the different meanings of the word informatics depending on the country.
Three decades later, many definitions can be found through search engines. This definition from yourgenome.org provides a satisfying “big picture”:
Bioinformatics is the science of both storing lots of complex biological data, and of analysing it to find new insights, which we use in many different ways.
The computers, computing methods, and programming languages have changed over time. We now have GPUs in addition to CPUs, and Python, R, C, C++, and many others instead of FORTRAN as scientific programming languages. But the endeavor is still to understand Biology with the help of computers and algorithms.
Bioinformatics for the terrified
The European Bioinformatics Institute (EBI) in the UK offers a short, self-paced online mini-course with the “cool” title Bioinformatics for the terrified: An introduction to the science of bioinformatics. It is a short series of “descriptive” use cases without actual computing.
Bioinformatics tutorials at sandbox.bio
One source of actual hands-on tutorials is the web site sandbox.bio, which provides interactive tutorials for exploring bioinformatics command-line tools within a web browser. The command line is essential for conducting large-scale analyses, because commands can be gathered into scripts that can then be applied to large sets of data.
This web site helps users become familiar with many such tools through short, interactive mini-tutorials that can be followed at one's own pace. Each is short enough to be satisfying to complete, and opens the door to understanding the purpose of each of the tools presented.
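The idea of gathering commands into a script and applying it to many files can be sketched as follows. This is only an illustration, not something taken from the sandbox.bio tutorials: the helper name count_fasta_records, the file names, and the choice of FASTA files (where each record starts with a ">" header line) are all assumptions made for the example.

```shell
# Hypothetical helper: print "<file name> <record count>" for every
# .fasta file found in the directory given as first argument.
count_fasta_records() {
    for f in "$1"/*.fasta; do
        # FASTA header lines start with ">", so counting them counts the records.
        printf '%s %s\n' "$(basename "$f")" "$(grep -c '^>' "$f")"
    done
}

# Demo on two tiny, made-up files in a scratch directory.
workdir=$(mktemp -d)
printf '>seq1\nACGT\n>seq2\nGGCC\n' > "$workdir/sample_a.fasta"
printf '>seq1\nTTAA\n' > "$workdir/sample_b.fasta"
count_fasta_records "$workdir"
rm -r "$workdir"
```

The same loop works unchanged whether the directory holds two files or two thousand, which is the practical benefit of scripting a command rather than typing it once per file.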
Some are “general use” tools, as they can help manipulate data formats for example, and are “old-time” Unix/Linux shell commands (grep). Other tools are more specific to a branch of research, for example tools for “Next Generation DNA Sequencing” (bedtools). There is a choice of beginner (e.g. Terminal Basics) and intermediate (e.g. Data exploration with awk) tutorials.
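To give a flavor of what such “general use” commands do, here is a minimal sketch combining grep and awk on a small table. The data are made up, the three-column layout (chromosome, start, end, separated by tabs) is an assumed BED-like format, and the helper names select_chrom and interval_lengths are hypothetical; none of this is taken from the tutorials themselves.

```shell
# Hypothetical helper: keep only the lines that mention the given chromosome.
# -w matches whole words, so "chr1" does not also select "chr10".
select_chrom() {
    grep -w "$1"
}

# Hypothetical helper: for each tab-separated line, print the chromosome
# and the interval length (third column minus second column).
interval_lengths() {
    awk -F '\t' '{ print $1, $3 - $2 }'
}

# Made-up, BED-like intervals: chromosome, start, end.
printf 'chr1\t100\t250\nchr2\t40\t90\nchr1\t300\t320\n' |
    select_chrom chr1 |
    interval_lengths
```

Here grep filters the lines and awk computes a value from the columns, and the pipe symbol chains the two so that the output of one becomes the input of the next.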
A sandbox is a test environment that allows users to execute programs or open files without impacting the application, system, or platform on which they are running. This is usually an “isolated” environment, therefore not able to affect other software or systems on the computer where the test is run.
In the context of sandbox.bio, the advantage is that users can try the proposed software explorations without needing to install anything on their own computer.