Saturday, 29 August 2015

DIA-Umpire Pipeline Using BioDocker containers.


The complexity of some bioinformatics software is well known and has been discussed in various papers and blog posts. This is especially true of software that depends on many components and tools, which can make it nearly impossible for a new user to try it for the first time. @BioDocker aims to simplify the process of testing, compiling and deploying bioinformatics software. Our previous post shows how to use the TPP software from the Systems Biology team.

Recently, Data Independent Acquisition (DIA) methods, especially SWATH, have been receiving a lot of attention from the proteomics community. In this example, I will demonstrate the value of Docker through a complex and powerful pipeline called DIA-Umpire, showing how to download it, run it and obtain its results.

Monday, 24 August 2015

Moving Bioinformatics to the Cloud

We constantly witness new technologies being developed and released to the public. For researchers who work in molecular biology, the ones that get our attention normally come from new laboratory methodologies or instruments. In this post, we are going to talk about a different situation, one that is catching the attention of researchers in molecular biology, and more specifically bioinformatics, in a different way. I’m writing about a technological innovation that comes from the computational field and can have a great impact on how we do biological analysis with bioinformatics software.


A few years ago a cloud startup called dotCloud developed a new piece of software called Docker, initially for internal use only. The software was so successful that just two years after its public release, the newly renamed company, Docker, has an estimated worth of $1 bn.


What is Docker and Why Does it Matter?


Docker can be employed in several ways in different environments. Basically, it provides the user with isolated, containerized software that runs apart from the rest of the host operating system. It is very similar to what a virtual machine does; the difference is that there is no guest operating system. Containers share the host's kernel, bundle the system libraries they need and apply some abstraction layers to the execution of the software inside. In the end you have an isolated environment with custom software inside that can be shared.


What Does This Have to Do with Bioinformatics?


Imagine that you are a senior researcher, or even a recently accepted student, trying to learn how to do some analysis. You are a lab specialist, but computers are not your thing. Now imagine that the software you are trying to run needs a Linux operating system with gcc compiler version 4.9.3 and some libraries like GD. Sounds bad, right? That’s where Docker comes in. Docker allows developers to ship software inside a container, that is, a custom environment with all the necessary tools and configuration to run a specific program. All you have to do is download the container and execute the program inside. Running a Docker container is as simple as running a program on the command line.
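To make "as simple as running a program" a bit more concrete, here is a minimal Python sketch of how a script could assemble and launch a `docker run` call. The image name `example/bioinfo-tool:1.0`, the tool name and its arguments are hypothetical placeholders, not real BioDocker images; only the standard `docker run` options `--rm`, `-v` and `-w` are used.

```python
import shlex
import subprocess
from pathlib import Path


def build_docker_command(image, tool_args, data_dir, mount_point="/data"):
    """Return the argument list for running a tool inside a container.

    image      -- name of the container image (hypothetical in this sketch)
    tool_args  -- the program and its arguments, as run inside the container
    data_dir   -- host directory with the input files
    """
    host_dir = Path(data_dir).resolve()  # Docker needs an absolute host path
    return [
        "docker", "run",
        "--rm",                              # remove the container when done
        "-v", f"{host_dir}:{mount_point}",   # share the data directory
        "-w", mount_point,                   # start inside the shared directory
        image,
        *tool_args,
    ]


def run_in_container(image, tool_args, data_dir):
    """Run the tool; requires Docker to be installed on the host."""
    command = build_docker_command(image, tool_args, data_dir)
    print("Running:", shlex.join(command))
    return subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical image and tool, shown only to illustrate the call shape.
    example = build_docker_command(
        "example/bioinfo-tool:1.0", ["tool", "--input", "sample.fasta"], "."
    )
    print(shlex.join(example))
```

The user never installs gcc or GD on the host: everything the tool needs lives in the image, and only the data directory is shared with the container.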


Benefits for Bioinformatics


For a bioinformatician this brings several other benefits. Something that is getting attention today is how to achieve reproducible research in the bioinformatics field. Computers with different configurations, libraries and software versions can produce different results, which makes it hard to compare the output of different software. If we could turn the environment from a variable into a constant, that problem would be greatly reduced.
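One way to picture the environment as a variable is to write it down and fingerprint it. The sketch below is only an illustration, not part of Docker or BioDocker: the function names are made up for this post, and the package versions passed in are example values. Two machines that produce different fingerprints are not running the same environment, which is exactly what a container image is meant to avoid.

```python
import hashlib
import json
import platform


def environment_record(package_versions):
    """Collect details of the environment that can influence results.

    package_versions -- mapping of tool/library name to version string,
                        supplied by the caller (example values below).
    """
    return {
        "python": platform.python_version(),
        "os": platform.system(),
        "machine": platform.machine(),
        "packages": dict(sorted(package_versions.items())),
    }


def fingerprint(record):
    """Hash a record in a canonical form so equal records give equal hashes."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Example values only; the libgd version is an arbitrary placeholder.
    lab_a = environment_record({"gcc": "4.9.3", "libgd": "2.1.0"})
    lab_b = environment_record({"gcc": "5.1.0", "libgd": "2.1.0"})
    print("same environment:", fingerprint(lab_a) == fingerprint(lab_b))
```

Storing such a fingerprint next to the results gives a quick check of whether two analyses were really run under the same conditions.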


The BioDocker Project


In 2014, a new project called BioDocker was founded. Recently, the project adopted a community-driven policy: the main idea is to get feedback from the community and to benefit from the expertise of each member. The goal is to provide containerized bioinformatics tools to the general public. For bioinformatics developers, the project also provides specifications, settings and guidelines on how to produce your own BioDocker containers. By defining guidelines like these, we hope that the use of Docker becomes more common, helping people to deal more easily with different software and reducing the problems with reproducible research.


Wrapping up


Docker is a new technology that is gaining a lot of ground, and slowly it is gaining ground in the bioinformatics field as well. It is definitely worth taking some time to learn how to work with it.

Thursday, 13 August 2015

The future of Proteomics: The Consensus


After the big Nature papers about the Human Proteome [1][2], the proteomics community has been divided over the same well-known topics that genomics faced before: same reasons, same discussions [3-7]. No one is discussing the technical issues, the instrument settings, the sample processing, or even the analytical method (most of both projects are "common" bottom-up experiments). The main issues are data-analysis problems and the still-open challenges of computational proteomics.

Monday, 27 July 2015

One big lesson I just learned

I come from a small country with no resources, no big industries or capital (Cuba), but with a great tradition of friendship and solidarity. In my previous institute (surprisingly, a big biotech company) we shared all of our ideas openly and discussed our results and thoughts openly, without thinking about competition, plagiarism, or the possibility that someone from a collaborating group could take our ideas and results to sell them to others or pass them off as their own.

The picture completely changed. After one year abroad, the one big thing I have learned is that outside my farm and my small country, time, ideas and contacts are gold. In science there are people you can work and collaborate with, because they are open by nature (not only because their source code is on GitHub) but also because they share, they help, they support, and they give their ideas without concern. These are people who like to talk about science and who encourage young researchers without fear of others, without fear of being open.

But there are other people, people who are always looking for competition, who steal what is not theirs, looking for ideas to be recognised for, looking for contacts, looking for papers, to get citations. The good thing is that I have learned, and I can recognise them. I can give them my ideas and my time, because they need them more than I do. In the end, the friendly ones, the collaborative ones, the ones who share, open up, help and support: we are the majority, and we are not only the ones who have their code on GitHub.

Monday, 8 June 2015

Sunday, 31 May 2015

I love technical notes and short manuscripts

One of my first papers, in 2012 (here), was about support vector machines (SVM). It was a simple algorithm that improved the method for computing the isoelectric point of peptides using SVM. The first time I presented the results to my colleagues, one of them asked me: "Are you planning to publish this?" One of the senior co-authors said, "We can write a big research manuscript, explaining other algorithms, comparing them, using other datasets, etc." Another (a computer scientist) said, "We can explore other features of peptides, including topological indexes, and write a full research manuscript about it."
I was very clear from the very beginning: "We will write a Technical Note or Letter."