Wednesday, 6 March 2013

#HavanaBioinfo2012: A Hard but Awesome Experience

From the 8th to the 11th of last December, the "I Bioinformatics for Biotechnology Applications" workshop (#HavanaBioinfo2012) was held in the hotel “Occidental Miramar”, located in an elegant area of Havana, Cuba. Putting on a bioinformatics workshop takes a lot of different pieces (small ones and big ones). You need to write a lot of emails to invited speakers, and you need a nice and comfortable place with lots of tables and chairs, good food and drinks.

I started in October, writing the first emails to EBI friends, advisers and professors, and to be completely honest it was fantastic because most of them accepted from the very beginning. Thanks to Henning Hermjakob and Alex Bateman (@Alexbateman1) for the support. In the end the workshop was fully subscribed, with more than 45 attendees and sixteen speakers, participating in two poster sessions and a panel discussion. Speakers from EBI, Ghent University (Belgium), Matrix Science (Mascot, UK) and Bioinformatics Solutions (Canada) accepted the invitation to come here and give one or two lectures about proteomics, genomics and bioinformatics.

Three of the most commonly used MS/MS search engines were represented at the conference: Mascot (David Creasy, Matrix Science), MaxQuant-Andromeda (Jürgen Cox, Mann Lab) and PEAKS (Paul Shan, Bioinformatics Solutions).

The workshop was organized in three sessions dedicated to “System biology resources”, “Protein identification and quantitation” and “Molecular drug design”. On the first day, I gave a brief introduction and welcomed the invited speakers and the students, and Dr. Gerardo Guillen (research director at CIGB) described the history and current developments of Cuban biotechnology. All invited speakers were surprised by the impact of Cuban products on the health care system and by the current product pipeline of CIGB, particularly the Heberprot-P results.

Henning Hermjakob (EBI) explained the EBI resources, from molecular interactions (IntAct) via curated human pathways (Reactome) to systems biology models (BioModels). In particular, the Proteomics Standards Initiative (PSI) Common Query Interface (PSICQUIC) prompted an interesting discussion about remote access to resources. To close the systems biology topic, Henning Hermjakob and Marco Punta described the UniProt and Pfam resources in detail.

Baozhen Paul Shan (Bioinformatics Solutions Inc.) described the history, theory and practice of the de novo identification strategy. He demonstrated the actual scoring algorithm in PEAKS and explained the fundamentals without losing the non-mathematicians in the audience. Continuing the de novo theme, Felipe Leprevost (Fiocruz, Brazil) explained the PepExplorer application, an integrated system to organize and statistically filter de novo sequencing results. Combining the database search strategy and the de novo algorithm PepNovo in one workflow increases the number of peptides and proteins identified.

In the afternoon, Lennart Martens (Ghent University and VIB) talked about the “CompOmics toolsuite”. During the last twelve years Lennart's group has developed a broad set of Java tools for proteomics data analysis. The source code, documentation and a complete set of examples for the main code library are freely available. Closing the first day, Klemens Vierlinger (Health & Environment Department, AIT, Vienna, Austria) described the current challenges in meta-analysis and data integration in biomarker discovery, especially in human fibrotic disease. After the afternoon coffee break, the poster session included the discussion of eleven posters by students from Cuba, Mexico and Colombia.

The second day was dedicated entirely to protein identification strategies and tools. The opportunity to discuss scoring systems and platform fundamentals with David Creasy (Matrix Science, Mascot search engine) and Jürgen Cox (Max Planck Institute, Martinsried, MaxQuant-Andromeda software) ensured a productive session.

David Creasy (Matrix Science) described the history, theory and practice of the Mascot search engine and tools. David pointed out some of the parameters in Mascot that may cause problems if not properly employed. For example, doing a no-enzyme search in Mascot is not a good idea unless a very high level of non-specific peptides is expected in the sample. Semi-trypsin is almost always a better choice if the peptides came from a tryptic digest. David also explained that one of the most promising future directions is the inclusion of spectral library searching in current proteomics workflows, as is already available through SpectraST or X!Hunter.
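To make the enzyme-specificity point concrete, here is a minimal Python sketch that counts how many candidate peptides one protein produces under fully tryptic, semi-tryptic and no-enzyme settings. It is only an illustration and not how Mascot works internally: the function names, the length limits (6 to 30 residues), the unlimited missed cleavages and the example sequence are all my assumptions. The only rule taken as given is the usual trypsin rule (cut after K or R, but not before P).

```python
def tryptic_sites(protein):
    """Positions where trypsin may cut: after K or R, unless followed by P."""
    sites = {0, len(protein)}  # protein termini count as valid peptide ends
    for i in range(len(protein) - 1):
        if protein[i] in "KR" and protein[i + 1] != "P":
            sites.add(i + 1)
    return sites


def candidate_peptides(protein, mode, min_len=6, max_len=30):
    """Enumerate (start, end) spans allowed under a given enzyme setting.

    mode: "tryptic" (both ends at cleavage sites), "semi" (at least one end)
    or "none" (no constraint). Missed cleavages are not limited here.
    """
    required = {"tryptic": 2, "semi": 1, "none": 0}[mode]
    sites = tryptic_sites(protein)
    spans = set()
    for start in range(len(protein)):
        last_end = min(len(protein), start + max_len)
        for end in range(start + min_len, last_end + 1):
            # Count how many of the two peptide ends sit on a cleavage site.
            specific_ends = (start in sites) + (end in sites)
            if specific_ends >= required:
                spans.add((start, end))
    return spans


# Arbitrary example sequence, used only for illustration.
PROTEIN = "MKWVTFISLLLLFSSAYSRGVFRRDTHKSEIAHRFKDLGEEHFKGLVLIAFSQYLQQCPFDEHVK"

counts = {mode: len(candidate_peptides(PROTEIN, mode))
          for mode in ("tryptic", "semi", "none")}
for mode, n in counts.items():
    print(f"{mode:8s} {n} candidate peptides")
```

Each setting is a superset of the previous one, so the no-enzyme search always has the most candidates to score against every spectrum. A larger search space generally means longer searches and more chance matches, which is consistent with David's advice to prefer semi-trypsin for tryptic digests.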

The ensuing coffee break was dominated by discussions about Mascot. Some of the unanswered questions were: Why is Mascot successful and extensively used even with the existence of freely available tools such as X!Tandem, OMSSA and Andromeda? How can the Mascot scoring system be powerful yet simple at the same time? Why don't popular search engines consider signal intensity in their scoring systems? The organizers decided to extend the coffee break by 10 minutes just to keep this dynamic and enthusiastic discussion going.

Jürgen Cox (Max-Planck Institute for Biochemistry, Munich, Germany) introduced the MaxQuant platform for high-resolution mass spectrometry experiments. Recent revolutionary advances in high-accuracy mass spectrometry-based proteomics are providing a new basis for data-driven systems biology. Jürgen described the algorithms and complete workflows for mass spectrometry data analysis, from intelligent data-driven acquisition, via algorithms for identification and quantification of proteins, to the statistical analysis of the final expression data for proteins and posttranslational modifications in the context of other omics and pathway data.

Before lunchtime, Henning Hermjakob described the current status of the proteomics repository services at the European Bioinformatics Institute. The PRoteomics IDEntifications database (PRIDE) started in 2005 and, as of its latest update, contains 11,629,064 identifications and 338,501,793 spectra, supporting the most common spectrum and identification file formats.

The last day was entirely dedicated to molecular drug design and chemoinformatics. The opening lecture, entitled “Rational design of peptide inhibitors against Dengue virus”, was given by Glay Chinea (CIGB). It began with an overview of the Dengue virus, its prevalence and typical clinical outcomes.

Violeta Perez-Nueno from the Orpailleur Team (INRIA Nancy) presented several approaches that can be used to model molecular interactions and, in more depth, a new 3D shape-based approach for predicting and quantifying drug promiscuity by correlating Gaussian clusters of ligand spherical harmonic shapes.

The presentation entitled “Epitope-based vaccines: From high-throughput data to individualized therapies” by Oliver Kohlbacher triggered an enthusiastic exchange of ideas. Epitope-based vaccines (EVs) have recently been attracting growing interest. The success of an EV is determined by the choice of epitopes used as a basis.

After lunch, a lecture entitled “Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions” was given by Marco Punta. Sequence alignment programs may miss or misidentify homologous relationships between proteins because of different factors, including homologous overextension and convergent evolution (as observed in compositionally biased amino acid regions). He presented a study in which the Pfam collection of manually curated profile hidden Markov models is used to test the accuracy with which the alignment program HMMER3 assigns protein sequences to homologous families.

During the workshop we visited several historic and tourist sites in Havana and Pinar del Rio. Some pictures:

Pinar del Rio
A nice introduction to how Cuban farmers make cigars (Tobacco House)

Bodeguita del Medio

An interesting discussion (not about search engine performance) about Old Havana architecture.
An old American car: our usual taxi
Occidental Miramar Venue after dinner

Pinar del Rio