Novel methods included in SpolLineages tool for fast and precise prediction of Mycobacterium tuberculosis complex spoligotype families

Abstract : Bioinformatic tools are currently being developed to better understand the Mycobacterium tuberculosis complex (MTBC). Several approaches already exist for identifying MTBC lineages from classical genotyping methods, such as mycobacterial interspersed repetitive units-variable number of tandem DNA repeats (MIRU-VNTR) typing and spoligotyping-based families. In the recently released SITVIT2 proprietary database of the Institut Pasteur de la Guadeloupe, a large number of spoligotype families were assigned either by manual expert curation or by an in-house algorithm. In this study, we present two complementary data-driven approaches allowing fast and precise family prediction from spoligotyping patterns. The first is based on data transformation and the use of decision tree classifiers. The second, in contrast, searches for a set of simple rules using binary masks through a specifically designed evolutionary algorithm. A comparison with the three main approaches in the field highlighted the good performance of our contributions and a significant runtime gain. Finally, we propose the 'SpolLineages' software tool (https://github.com/dcouvin/SpolLineages), which implements these approaches for the identification of MTBC spoligotype families.
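
To make the binary-mask idea in the abstract concrete, the following is a minimal sketch of how one such rule could be tested against a spoligotype pattern. It is not taken from SpolLineages: the function names (`to_bits`, `make_rule`, `matches`), the rule encoding as a (mask, expected) pair, and the example rule itself are all hypothetical and chosen only for illustration. It assumes the classical 43-spacer binary format, written as a string of '1' (spacer present) and '0' (spacer absent).

```python
SPACERS = 43  # classical spoligotyping records presence/absence of 43 spacers


def to_bits(pattern: str) -> int:
    """Convert a 43-character '1'/'0' string into an integer (spacer 1 = leftmost bit)."""
    if len(pattern) != SPACERS or set(pattern) - {"0", "1"}:
        raise ValueError("expected a string of 43 characters, each '0' or '1'")
    return int(pattern, 2)


def make_rule(present, absent):
    """Build a hypothetical rule as (mask, expected) from 1-based spacer positions.

    The mask selects the spacers the rule cares about; expected holds the
    values those spacers must take (1 = present, 0 = absent).
    """
    mask = 0
    expected = 0
    for pos in present:
        bit = 1 << (SPACERS - pos)
        mask |= bit
        expected |= bit
    for pos in absent:
        mask |= 1 << (SPACERS - pos)
    return mask, expected


def matches(pattern_bits: int, rule) -> bool:
    """Return True if the pattern agrees with the rule on every masked spacer."""
    mask, expected = rule
    return (pattern_bits & mask) == expected


# Illustrative rule only, not a real family definition:
# spacer 31 present and spacers 33 to 36 absent.
toy_rule = make_rule(present=[31], absent=[33, 34, 35, 36])

example = "1" * 32 + "0" * 4 + "1" * 7   # spacers 33 to 36 missing
all_present = "1" * SPACERS              # every spacer present

example_hit = matches(to_bits(example), toy_rule)
all_present_hit = matches(to_bits(all_present), toy_rule)
```

Encoding a rule as two integers keeps each test to a single bitwise AND and comparison, which is one plausible reason mask-based rules can be evaluated quickly; how the evolutionary algorithm in the paper actually searches for and scores its rules is not described in this record.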
Document type :
Journal articles

https://hal-riip.archives-ouvertes.fr/pasteur-03092701
Contributor : Nalin Rastogi
Submitted on : Saturday, January 2, 2021 - 5:54:59 PM
Last modification on : Wednesday, January 20, 2021 - 3:15:10 AM
Long-term archiving on : Saturday, April 3, 2021 - 8:34:23 PM

Licence


Distributed under a Creative Commons Attribution 4.0 International License

Citation

David Couvin, Wilfried Segretier, Erick Stattner, Nalin Rastogi. Novel methods included in SpolLineages tool for fast and precise prediction of Mycobacterium tuberculosis complex spoligotype families. Database - The journal of Biological Databases and Curation, Oxford University Press, 2020, 2020, pp.baaa108. ⟨10.1093/database/baaa108⟩. ⟨pasteur-03092701⟩
