
Bioinformatics Unit


  • Hugo Naya (PhD, Head)
  • Martín Graña (PhD, Associate Researcher)
  • Natalia Rego (Technical Assistant, MSc student in Zoology)
  • Lucía Spangenberg (PhD)
  • María Inés Fariello (PhD, Biostatistics and population genomics)
  • Tamara Fernandez (PhD student in biology, Research assistant)
  • Gregorio Iraola (PhD student in biology, Research assistant)
  • Pablo Fresia (PhD)
  • Sebastián Valenzuela (MSc student in bioinformatics)
  • Daniela Megrian (MSc student in bioinformatics)
  • Ignacio Ferrés (MSc student in bioinformatics)
  • Daniela Costa (PhD student in biology)
  • Gonzalo Collazo (Undergraduate student)




  1. NGS and microarrays data analysis.
  2. Sequence alignment and phylogenetic inference software.
  3. Sequence analysis software.
  4. 3D molecular modeling software.
  5. Tools for complex systems analysis.
  6. Advice on basic biostatistics and the use of specific software.
  7. Software development.


Over the past 20 years, the development of new technologies has led to remarkable discoveries in biology. In particular, nanotechnology, automation and computer science have enabled a series of high-throughput analyses in molecular and cell biology that completely changed the existing paradigm. However, these new instruments also unexpectedly changed how research is conceived. The promise of hypothesis-free data has led, in several cases, to careless experimental design that prevents full exploitation of the results, increasing experimental turnover and the accumulation of unused data in repositories. Technology evolves extremely fast, but analytical methods are not yet sufficiently automated, leading to the well-known “next-generation gap”. The gap is widening now (with second-generation sequencing) and will be enormous with third-generation technologies. In fact, analysis teams simply cannot analyze each dataset exhaustively before a new one arrives; they merely scratch the surface and send tons of data to the warehouse (or even to the garbage).

In this context, any methodological effort towards better usage of data should be viewed as benefiting the scientific community. Our research, although diverse, is united by this underlying goal and combines the methodological strengths of bioinformatics, statistics, evolutionary genomics and quantitative genetics.

We recently proposed a method that uses the phylogenetic mixed model to identify associations between phenotypic data and amino acid changes at potentially significant sites in an alignment, taking several amino acid properties into account (Spangenberg et al., 2011). The phylogenetic mixed model accounts for the dependency among the observations (organisms). Previous studies have shown that the pathogenicity of many organisms may be associated with a single amino acid change, or just a few, with a strong structural and/or functional impact on the protein. Discovering these sites is a big step towards understanding pathogenicity. Our method discovered such sites in proteins (RpoS) associated with the pathogenic character of a group of bacteria, highlighting several sites with significant differences in biologically relevant regions. In addition, we developed a freely available R package named “bcool”. In the near future, we plan to apply this strategy to search for differences in biofilm-related genes.

We also addressed the question of how bacteria cause disease in humans from another perspective. Our motivation was to provide integrative information about general genome-encoded signatures that explain pathogenicity for all bacterial pathogens, not just particular taxa. In this case, we worked from the hypothesis that pathogenicity is caused by the presence of a reduced set of virulence-related genes. To test it, we explored the presence/absence patterns of virulence genes in all available genomes of pathogenic and non-pathogenic strains. This information was then used to build a Support Vector Machine model that, once trained, can predict whether a newly sequenced genome belongs to a human pathogen or not. This model has an average accuracy of 95% and, to the best of our knowledge, is the most accurate statistical model reported so far for this purpose. Moreover, our method can classify bacterial genomes independently of their taxonomic context, in contrast with similar approaches that cover only part of bacterial diversity and are therefore useful only for classifying specific taxa. Our statistical learning approach is grounded in the biological meaning of the selected genes and supports the idea that bacterial pathogenicity can be explained by the presence or absence of a set of specific genes that code for virulence determinants. Based on this, we developed “BacFier”, a freely available software tool that may be useful for practical purposes. Beyond implementing our model in a program that accurately classifies bacteria as human pathogens or non-pathogens, we determined and discussed the biological significance of the core set of genes that best explains the pathogenic phenotype in bacteria. Finally, we showed which functional categories of virulence genes (e.g., toxins, motility proteins) were likely pathogenicity signatures within each taxonomic division (e.g., Actinobacteria, Gammaproteobacteria, Firmicutes), which appears to be a completely new kind of information and could lead to important evolutionary conclusions. We are currently working on improving model sensitivity and exploring the possibility of developing a multiclass classifier that could predict pathogenicity in hosts other than humans, such as cattle, plants or fish.
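
The classification idea described above (a linear decision rule learned from gene presence/absence vectors) can be sketched roughly as follows. This is only a toy illustration and not the BacFier implementation: the gene names, genomes and labels are invented, and the model is a plain linear SVM trained by subgradient descent on the hinge loss rather than with a dedicated SVM library.

```python
# Toy linear SVM on gene presence/absence vectors (illustration only).
# Gene names, genomes and labels are invented; this is NOT the BacFier model.

GENES = ["toxin_A", "adhesin_B", "flagellin_C", "ribosomal_D"]  # hypothetical names

# Each row records presence (1) or absence (0) of the genes above in one genome.
GENOMES = [
    [1, 1, 0, 1], [1, 0, 1, 1], [1, 1, 1, 1], [1, 0, 0, 1],  # labeled pathogenic
    [0, 0, 1, 1], [0, 1, 0, 1], [0, 0, 0, 1], [0, 1, 1, 1],  # labeled non-pathogenic
]
LABELS = [1, 1, 1, 1, -1, -1, -1, -1]  # +1 = pathogen, -1 = non-pathogen


def decision_value(w, b, x):
    """Signed score of genome x under the linear model (w, b)."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Minimise hinge loss plus an L2 penalty on w by subgradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * decision_value(w, b, xi)
            # Gradient step for the L2 penalty (shrinks the weights slightly).
            w = [wj - lr * lam * wj for wj in w]
            # Hinge-loss step: only samples inside the margin move the model.
            if margin < 1:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w, b, x):
    """Return +1 (predicted pathogen) or -1 (predicted non-pathogen)."""
    return 1 if decision_value(w, b, x) >= 0 else -1


w, b = train_linear_svm(GENOMES, LABELS)

if __name__ == "__main__":
    # Learned weight per gene: larger values push towards "pathogen".
    for gene, weight in zip(GENES, w):
        print(f"{gene}: {weight:+.2f}")
    print("training predictions:", [predict(w, b, x) for x in GENOMES])
```

In this invented dataset only the first gene separates the two classes, so the learned weights concentrate on it; real presence/absence matrices are far larger and noisier.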

As part of our general interest in bacterial pathogenicity, we are involved in a more specific problem: the study of the determinants of biofilm formation in Leptospira. This genus includes animal pathogens and re-emerging human pathogens, as well as non-pathogenic strains. Despite its importance for human health and animal production, the genetic features that determine pathogenic phenotypes in Leptospira have proved elusive. Recently, the capacity to form biofilms has been suggested as a key factor in the pathogenesis of leptospirosis but, as mentioned above, its genetic basis is poorly understood. On this basis, we are carrying out comparative genomics analyses to find orthologous genes with functions associated with biofilm formation. Moreover, in the near future we plan to perform transcriptome analyses that could reveal expression patterns of genes involved in biofilm formation, providing a new kind of information that could help to understand the pathogenesis mechanisms of these bacteria.
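
One common heuristic for proposing orthologous gene pairs between two genomes is the reciprocal best hit criterion. The sketch below illustrates it under stated assumptions: the text does not say which orthology method is used for Leptospira, and the gene identifiers and similarity scores are invented (in practice the scores would come from a sequence similarity search).

```python
# Reciprocal best hits (RBH): a simple heuristic for candidate orthologs.
# Gene identifiers and scores are invented for illustration.


def best_hits(scores):
    """Map each query gene to its highest-scoring subject gene.

    scores: dict mapping (query, subject) -> similarity score.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical similarity scores, genome A searched against genome B and back.
A_VS_B = {
    ("geneA1", "geneB1"): 250.0,
    ("geneA1", "geneB2"): 80.0,
    ("geneA2", "geneB2"): 190.0,
    ("geneA3", "geneB2"): 120.0,
}
B_VS_A = {
    ("geneB1", "geneA1"): 245.0,
    ("geneB2", "geneA2"): 185.0,
    ("geneB2", "geneA3"): 110.0,
}

pairs = reciprocal_best_hits(A_VS_B, B_VS_A)

if __name__ == "__main__":
    for gene_a, gene_b in pairs:
        print(gene_a, "<->", gene_b)
```

Here geneA3 is left unpaired because its best hit, geneB2, matches geneA2 better in the reverse search.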


We are currently involved in several teaching activities, mainly on bioinformatics-related topics. The recently created MSc in Bioinformatics is currently highly demanding, with course design and teaching in the hands of the Faculty of Sciences, the School of Engineering, and our group at Pasteur. We also participate occasionally in several PEDECIBA courses, covering topics in bioinformatics and quantitative genetics.

Human resources are clearly needed in this relatively new research domain; this calls for our effort in such teaching activities, as well as for maximizing the number of graduate and undergraduate students in our lab (eight people at the moment).


  1. “Investigação dos Mecanismos Genéticos e Moleculares em Biofilmes de Leptospira”. Funded by CAPES – Brazil 2012/2015. A Schnadelbach/P Ristow. Special Visiting Professor H Naya. Granted R$ 100000.
  2. “Análisis transcripcional en Leptospiras formadoras de biofilms”. Funded by ANII 2013/2015. H Naya. Granted U$S 20000.



  1. Naya DE, Naya H, Lessa EP (2016). “Brain size and thermoregulation during the evolution of the genus Homo”. Comp Biochem Physiol A Mol Integr Physiol. 191:66-73. doi:10.1016/j.cbpa.2015.09.017.
  2. Fernández-Calero T, Cabrera-Cabrera F, Ehrlich R, Marín M (2016). “Silent Polymorphisms: Can the tRNA Population Explain Changes in Protein Properties?” Life (Basel) 6(1). pii: E9. Review. PMID: 26901226.
  3. Sanchez AL, Urioste JI, Peñagaricano F, Neimaur K, Sienra I, Naya H, Kremer R (2016). “Genetic parameters of objectionable fibers and of their associations with fleece traits in Corriedale sheep”. Journal of Animal Science 94(1):13-20. doi: 10.2527/jas.2015-9619.


  1. Gutiérrez V, Rego N, Naya H, García G (2015). “First complete mitochondrial genome of the South American annual fish Austrolebias charrua (Cyprinodontiformes: Rivulidae): peculiar features among cyprinodontiforms mitogenomes”. BMC Genomics 16(1):879. doi: 10.1186/s12864-015-2090-3.
  2. Lasserre M, Berná L, Greif G, Díaz-Viraqué F, Iraola G, Naya H, Castro-Ramos M, Juambeltz A, Robello C (2015). “Whole-Genome Sequences of Mycobacterium bovis Strain MbURU-001, Isolated from Fresh Bovine Infected Samples”. Genome Announc 3(6). pii: e01237-15. doi: 10.1128/genomeA.01237-15.
  3. Gianola D, de Los Campos G, Toro MA, Naya H, Schön CC, Sorensen D (2015). “Do Molecular Markers Inform About Pleiotropy?” Genetics pii:genetics.115.179978.
  4. Fernandez-Calero T, Garcia-Silva R, Pena A, Robello C, Persson H, Rovira C, Naya H, Cayota A (2015). “Profiling of small RNA cargo of extracellular vesicles shed by Trypanosoma cruzi reveals a specific extracellular signature”. Mol Biochem Parasitol. pii: S0166-6851(15)00013-4. doi:10.1016/j.molbiopara.2015.03.003.


  1. Bozinovic F, Ferri-Yáñez F, Naya H, Araújo MB, Naya DE (2014). “Thermal tolerances in rodents: species that evolved in cold climates exhibit a wider thermoneutral zone”. Evolutionary Ecology Research 16: 1–10.
  2. Iraola G, Pérez R, Naya H, Paolicchi F, Pastor E, Valenzuela S, Calleros L, Velilla A, Hernández M, Morsella C (2014). “Genomic evidences for the emergence and evolution of pathogenicity and niche preferences in the genus Campylobacter”. Genome Biology and Evolution.
  3. Berná L, Iraola G, Greif G, Coitinho C, Rivas C, Naya H, Robello C (2014) “Whole Genome Sequencing of an isoniazid resistant clinical isolate of Mycobacterium tuberculosis Strain MtURU-002 from Uruguay”. Genome Announc 2(4). pii: e00655-14. doi: 10.1128/genomeA.00655-14.
  4. Zych J, Spangenberg L, Stimamiglio MA, Abud APR, Shigunov P, Marchini F, Kuligovski C, Cofré AR, Schittini AV, Aguiar AM, Senegaglia A, Brofman PRS, Goldenberg S, Dallagiovanna B, Naya H, Correa A (2014) “Polysome profiling shows the identity of human adipose-derived stromal/stem cells in detail and clearly distinguishes them from dermal fibroblasts”. Stem Cells and Development.
  5. Urioste JI, Peñagaricano F, López-Correa R, Naya H, Kremer R (2014) “Incidence and relationships of black skin spots in the fleece area and pigmentation traits in commercial Corriedale flocks”. Small Ruminant Research.
  6. Palácios F, Abreu C, Prieto D, Morande P, Ruiz S, Fernandez-Calero T, Naya H, Libisch G, Robello C, Landoni A, Gabus R, Dighiero G, Oppezzo P (2014) “Activation of the PI3K/AKT pathway by microRNA-22 results in CLL B-cell proliferation”. Leukemia. doi:10.1038/leu.2014.158.
  7. Tosar JP, Rovira C, Naya H, Cayota A (2014) “Mining of public sequencing datasets supports a non-dietary origin for putative foreign miRNAs: underestimated effects of contamination in NGS”. RNA 46(4):138-47.


  1. Greif G, Iraola G, Berna L, Coitinho C, Rivas C, Naya H, Robello C (2013) “Complete genome sequence of Mycobacterium tuberculosis strain MtURU-001, isolated from a rapidly progressing outbreak in Uruguay”. Genome Announc 2(1) pii:e01220-13. doi:10.1128/genomeA.01220-13.
  2. Laporta J, Rosa G, Naya H, Carriquiry M (2013) “Liver functional genomics in beef cows on grazing systems: novel genes and pathways revealed”. Physiological Genomics 46(4):138-47 doi:10.1152/physiolgenomics.00120.2013.
  3. Garcia-Silva MR, Cura das Neves RF, Cabrera-Cabrera F, Sanguinetti J, Medeiros LC, Robello C, Naya H, Fernandez-Calero T, Souto-Padron T, de Souza W, Cayota A (2013) “Extracellular vesicles shed by Trypanosoma cruzi are linked to small RNA pathways, life cycle regulation, and susceptibility to infection of mammalian cells”. Parasitol Res 113(1):285-304 doi:10.1007/s00436-013-3655-1.
  4. Spangenberg L, Correa A, Dallagiovanna B, Naya H (2013) “Role of alternative polyadenylation during adipogenic differentiation: an in silico approach”. PLoS ONE 8(10):e75578 doi:10.1371/journal.pone.0075578.
  5. Iraola G, Pérez R, Naya H, Paolicchi F, Harris D, Lawley TD, Rego N, Hernández M, Calleros L, Carretto L, Velilla A, Morsella C, Méndez A, Gioffre A (2013) “Complete Genome Sequence of Campylobacter fetus subsp. venerealis Biovar Intermedius, Isolated from the Prepuce of a Bull”. Genome Announc 1(4) pii:e00526-13. doi:10.1128/genomeA.00526-13.
  6. Naya DE, Spangenberg L, Naya H, Bozinovic F (2013) “Thermal conductance and basal metabolic rate are part of a coordinated system for heat transfer regulation”. Proc R Soc Lond B 280(1767):20131629.
  7. Spangenberg L, Shigunov P, Abud AP, Cofré AR, Stimamiglio MA, Kuligovski C, Zych J, Schittini AV, Costa AD, Rebelatto CK, Brofman PR, Goldenberg S, Correa A, Naya H, Dallagiovanna B (2013) “Polysome profiling shows extensive posttranscriptional regulation during human adipocyte stem cell differentiation into adipocytes”. Stem Cell Res 11(2):902-912.
  8. Fariello MI, Boitard S, Naya H, San Cristobal M, Servin B (2013) “Detecting Signatures of Selection Through Haplotype Differentiation Among Hierarchically Structured Populations”. Genetics 193(3):929-41.
  9. Espasandín AC, Urioste JI, Naya H, Alencar MM (2013) “Genotype x Production Environment Interaction for Weaning Weight in Angus Populations of Brazil and Uruguay”. Livestock science 151(2): 264-270.


  1. Naya DE, Spangenberg L, Naya H, Bozinovic F (2012) “How does evolutionary variation in basal metabolic rates arise? A statistical assessment and a mechanistic model”. Evolution 67(5):1463-76.
  2. Iraola G, Vazquez G, Spangenberg L, Naya H. (2012) “Reduced set of virulence genes allows high accuracy prediction of bacterial pathogenicity in humans”. PLoS ONE 7(8):e42144.
  3. Iriarte A, Sanguinetti M, Fernández-Calero T, Naya H, Ramón A, Musto H. (2012) “Translational selection on codon usage in the genus Aspergillus”. Gene 506(1):98-105.
  4. Trujillo AI, Peñagaricano F, Grignola MP, Nicolini P, Casal A, Espasandín AC, Naya H, Carriquiry M, Chilibroste P. (2012) “Using high resolution melting analysis to identify variation of NPY, LEP and IGF-1 genes in Angus cattle”. Livestock Science 146:193-198.
  5. Rego N, Bianchi S, Moreno P, Persson H, Kvist A, Pena A, Oppezzo P, Naya H, Rovira C, Dighiero G, Pritsch O. (2012) “Search for an aetiological virus candidate in chronic lymphocytic leukaemia by extensive transcriptome analysis”. Br J Haematol 157(6):709-17.
  6. Naya DE, Spangenberg L, Naya H, Bozinovic F. (2012) “Latitudinal patterns in rodent metabolic flexibility” Am Nat 179(6):E172-9.


  1. Gascue C, Tan PL, Cardenas-Rodriguez M, Libisch G, Fernandez-Calero T, Liu YP, Astrada S, Robello C, Naya H, Katsanis N, Badano JL. (2011) “A Direct Role of Bardet-Biedl Syndrome Proteins in Transcriptional Regulation” J Cell Science 125(Pt2):362-75.
  2. Peñagaricano F, Zorrilla P, Naya H, Robello C, Urioste JI. (2011) “Gene expression analysis identifies new candidate genes associated with the development of black skin spots in Corriedale sheep”. J Appl Genet 53(1):99-106.
  3. Spangenberg L, Battke F, Graña M, Nieselt K, Naya H. (2011) “Identifying associations between amino acid changes and meta information in alignments”. Bioinformatics 27(20):2782-9.
  4. Duhagon MA, Smircich P, Forteza D, Naya H, Williams N, Garat B. (2011). “Comparative genomic analysis of dinucleotide repeats in Tritryps”. Gene 487(1):29-37.


  1. Persson H, Kvist A, Rego N, Staaf J, Vallon-Christersson J, Luts L, Loman N, Jönsson G, Naya H, Höglund M, Borg A, Rovira C. (2010). “Identification of new microRNAs in paired normal and tumor breast tissue reveals a dual role for the ERBB2/Her2 gene”. Cancer Res 71(1):78-86.
  2. Bianchi S, Moreno P, Landoni AI, Naya H, Oppezzo P, Dighiero G, Gabus R, Pritsch O. (2010). “IGHV-D-J gene rearrangement and mutational status in Uruguayan patients with chronic lymphocytic leukemia”. Leukemia & Lymphoma 51(11):2070-8.
  3. Peñagaricano F, Urioste JI, Naya H, de los Campos G, Gianola D. (2010). “Assessment of Poisson, Probit and linear models for genetic analysis of presence and number of black spots in Corriedale sheep”. J Anim Breed Genet 128(2):105-13.
  4. González-Recio O, Weigel KA, Gianola D, Naya H, Rosa GJM. (2010). “L2-Boosting algorithm applied to high dimensional problems in genomic selection”. Genet Res 92(3):227-
  5. Hamon T, Graña M, Raggio V, Grabar N, Naya H. (2010). “Identification of relations between risk factors and their pathologies or health conditions by mining scientific literature”. Stud Health Technol Inform 160:964-8.
  6. Sabbia V, Romero H, Musto H, Naya H. (2009). “Composition profile of the human genome at the chromosome level”. J Biomol Struct Dyn 27(3):361-70.


  1. Weigel KA, de los Campos G, González-Recio O, Naya H, Wu XL, Long N, Rosa GJM, and Gianola D. (2009). “Predictive ability of direct genomic values for Lifetime Net Merit of Holstein sires using selected subsets of single nucleotide polymorphism markers”. J Dairy Sci 92(10):5248-57.
  2. Romero H, Pereira E, Naya H, Musto H. (2009). “Oxygen and GC profiles in marine environments”. J Mol Evol 69(2):203-6.
  3. de Los Campos G, Naya H, Gianola D, Crossa J, Legarra A, Manfredi E, Weigel K, Cotes JM. (2009). “Predicting Quantitative Traits with Regression Models for Dense Molecular Markers and Pedigrees”. Genetics 182(1):375-85.
  4. Garcia JM, Gao A, He PL, Choi J, Tang W, Bruzzone R, Schwartz O, Naya H, Nan FJ, Li J, Altmeyer R, Zuo JP. (2009). “High-throughput screening using pseudotyped lentiviral particles: a strategy for the identification of HIV-1 inhibitors in a cell-based assay”. Antiviral Res 81(3):239-47.


  1. Naya H, Urioste JI, Chang YM, Rodrigues-Motta M, Kremer R, Gianola D. (2008). “A comparison between Poisson and Zero-inflated Poisson regression models with an application to number of black spots in Corriedale sheep”. Gen Sel Evol 40:379-394.
  2. Rego N, Naya H, Lamolle G, Álvarez-Valin F (2008). “Evolutionary and comparative genomics of Leptospira”. RECIIS 1(2 Supl): 321-328. (Not peer-reviewed.)
  3. Jubany S, Tomasco I, Ponce de León I, Medina K, Carrau F, Arrambide N, Naya H, Gaggero C. (2008). “Toward a global database for the molecular typing of Saccharomyces cerevisiae strains”. FEMS Yeast Res 8:472-84.
  4. Marton S, Garcia MR, Robello C, Persson H, Trajtenberg F, Pritsch O, Rovira C, Naya H, Dighiero H, Cayota A. (2008). “Small RNAs analysis in CLL reveals a deregulation of miRNA expression and novel miRNA candidates of putative relevance in CLL pathogenesis”. Leukemia 22:330-8.


  1. Urioste JI, Chang YM, Naya H, Gianola D. (2007). “Genetic variability in calving success in Aberdeen Angus cows under extensive recording”. Animal 1:1081-8.
  2. Sabbia V, Piovani R, Naya H, Rodriguez-Maseda H, Romero H, Musto H. (2007). “Trends of amino acid usage in the proteins from the human genome”. J Biomol Struct Dyn 25:55-9.


  1. Musto H, Naya H, Zavala A, Romero H, Alvarez-Valín F, Bernardi G. (2006). “Genomic GC level, optimal growth temperature and genome size in prokaryotes”. Biochem Biophys Res Commun 347:1-3.
  2. Naya H, Gianola D, Romero H, Urioste JI, Musto H. (2006). “Inferring parameters shaping amino-acid usage in prokaryotic genomes via Bayesian MCMC methods”. Mol Biol Evol 23:203-11.


  1. Zavala A, Naya H, Romero H, Sabbia V, Piovani R, Musto H. (2005). “Genomic GC content prediction in prokaryotes from a sample of genes”. Gene 357:137-43.
  2. Musto H, Naya H, Zavala A, Romero H, Alvarez-Valin F, Bernardi G. (2005). “The correlation between genomic G+C and optimal growth temperature of prokaryotes is robust: A reply to Marashi and Ghalanbor”. Biochem Biophys Res Commun 330:357-60.


  1. Naya H, Zavala A, Romero H, Rodriguez-Maseda H, Musto H. (2004). “Correspondence analysis of amino acid usage within the family Bacillaceae”. Biochem Biophys Res Commun 325:1252-7.
  2. Musto H, Naya H, Zavala A, Romero H, Alvarez-Valín F, Bernardi G. (2004). “Correlations between genomic GC levels and optimal growth temperatures in prokaryotes”. FEBS Letters 573(1-3):73-7.


  1. Naya H, Romero H, Zavala A, Alvarez B, Musto H. (2002). “Aerobiosis Increase Genomic GC% in Prokaryotes”. J Mol Evol 55(3): 260-4.
  2. Zavala A, Naya H, Romero H, Musto H. (2002). “Trends in Codon and Amino Acid Usage in Thermotoga maritima”. J Mol Evol 54(5): 563-8.
  3. Naya H, Romero H, Carels N, Zavala A, Musto H. (2001). “Translational selection shapes codon usage in the GC-rich genome of Chlamydomonas reinhardtii”. FEBS Letters 501(2-3): 127-130.