Wednesday, 8 January 2014

My Formula as a Bioinformatician

Every day, I enjoy reading about bioinformatics on blogs, LinkedIn and Twitter, apart from my daily reading of journal manuscripts. I strongly believe that the future of scientific publishing will move closer and closer to the open access style and to this emerging way of publishing your ideas faster and more briefly in your own space. Some of my old co-workers don't understand this way of keeping in touch with science through informal environments rather than refereed/supervised spaces; I just tell them that we make the future, not the past. While reading the popular post “A guide for the lonely bioinformatician”, I was thinking about the last three years and how I have built my own formula to survive as a lonely bioinformatician in a small country, with a lousy internet connection and without a bioinformatics environment.

All the bioinformaticians that I have met during these three years can be categorized into three major groups according to their original background:

1)    MDs, Biologists, Biochemists, Chemists
2)    Physicists, Mathematicians, Computer Scientists, Software Engineers, Software
3)    Philosophers, *

Because this is an embryonic and growing field, the diversity is huge, so it is quite hard to capture all of its behavior in one model or formula. Here I will summarize some of the variables of my formula, which are strongly correlated with the suggestions of the original post:

1. Define yourself as The Bioinformatician

When I started to solve problems in my lab (mainly statistics and data handling), all the PhD students and postdocs referred to me as the guy in charge of the computational stuff, the nerd of the team who can give you some statistics about your data quickly and nicely. In the worst cases, I never heard again about the conclusions of the research or its next steps. After some months, I decided to rename myself the bioinformatician of the team. This step may seem trivial, but it is important: you are not the sysadmin, you are not the developer, you are not the biologist with a computational background, you are not the computer guy, you are not the Excel guru. You are the BIOINFORMATICIAN.

2. Learn as much as possible from your lab questions

You can find two basic and opposing opinions: (i) researchers who think it is important to spend time in the lab, dealing with lab problems and the origin of the data, and (ii) researchers who think it is not necessary at all. From my experience, you should learn as much as possible from your lab without spending time in it. What I do: I participate in my lab's discussions, invite lab co-workers for a coffee break, and discuss and listen to their lab problems and the possible solutions; then I go back to my PC and read about them. As a bioinformatician you should care about the data itself, not about how to obtain better-quality data.

The second issue is that you should spend time learning how to detect data problems, inconsistencies and errors before finally performing the bioinformatics analysis. However, I like to be part of my lab team and I enjoy our technical discussions. As a bioinformatician, you are expected (by them and by yourself) to propose integral solutions for your lab rather than partial answers to individual problems and questions. This topic is strongly related to the first and the last suggestions of this post. You should also learn from the questions of other labs, and it is good practice to accept invitations to review manuscripts, starting from the bottom (journals that are not high-impact).

3. Learn batch and a Programming Language

Perhaps your lab is Windows oriented, but you shouldn’t be. If your daily work is in Windows, I can guarantee that you are part of about 0.1% of the bioinformatics community. Unix/Linux is the environment of choice for bioinformatics, and its key component is the command line interface. The term ‘shell’, or ‘UNIX shell’, refers to a command line interpreter for the UNIX/Linux operating system. Microsoft provides a command line interface for Windows, but it is not commonly used in bioinformatics. During these years I have worked on cheminformatics, proteomics and some genomics; in all of those fields, command line skills are essential to handle data, submit your jobs to distributed systems, process big files, and interact with databases and services on the internet. As for a programming language, I will write some lines about it soon; meanwhile you can take a look at this poll. The most important thing is that you should be good enough to write your own scripts, programs and, in the end, tools.
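As an illustration of the kind of small script meant here, below is a minimal sketch in Python. The language, the FASTA format and the function names (read_fasta, summarize) are assumptions chosen only for this example; the post does not prescribe any of them. The sketch reads sequence records one line at a time, so a big file never has to fit in memory.

```python
import io
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines, one record at a time."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    # Emit the last record in the input.
    if header is not None:
        yield header, "".join(chunks)


def summarize(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (number of sequences, total number of residues)."""
    count = 0
    total = 0
    for _, sequence in read_fasta(lines):
        count += 1
        total += len(sequence)
    return count, total


if __name__ == "__main__":
    # Hypothetical in-memory example; a real script would pass an open file handle.
    example = io.StringIO(">seq1 hypothetical protein\nMKT\nAAL\n>seq2\nGG\n")
    n_sequences, n_residues = summarize(example)
    print(n_sequences, n_residues)
```

Because summarize accepts any iterable of lines, the same function works on an open file, which is what makes this style suitable for large inputs.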

4. Make friends with other bioinformatics groups

As the original post says: “Develop electronic relationships with people and groups on the Internet. Develop a support group who will be able to help you with the kind of problems your lab-based group cannot”. This advice is crucial to grow as a bioinformatician, or as a scientist in general, and we are in a better position compared to other experimental fields. I normally use Twitter, LinkedIn and Skype to get in touch with my online co-workers, talk about data and programs, and share ideas. I have more than five research manuscripts with people that I have never met in person, but with whom I have spent hours sharing ideas, data and code. I have posted on this blog several lists of possible contacts in the fields of Computational Proteomics and Bioinformatics.

5. Develop your own research

If you have reached this point and applied at least 10% of the previous advice, you will be able to start your own research. If you learn from the basic questions of your lab and think about possible and complete solutions, you are ready to generate your own questions, tools and studies. If you have the ideas, the tools to implement them, and some contacts to learn from and to move faster in the right direction, you will be able to conduct and develop your own research. If some of these ideas can be turned into manuscripts, it is good practice to publish your own research in addition to your usual contributions to other papers from your lab.

From my point of view, you can’t find the perfect model or formula, but you can learn, as I did, from different experiences and careers. I’m still learning.

Some References: 
  1 - So you want to be a computational biologist?
  2 - A guide for the lonely bioinformatician  
  3 - Collection of published “guides” for bioinformaticians