TY - JOUR
T1 - Hla-mapper: An application to optimize the mapping of HLA sequences produced by massively parallel sequencing procedures
AU - Castelli, Erick C.
AU - Almeida da Paz, Michelle
AU - Souza, Andreia S.
AU - Ramalho, Jaqueline
AU - Teixeira Mendes-Junior, Celso
PY - 2018
Y1 - 2018
N2 - Achieving reliable read mapping is challenging when more than one HLA gene is evaluated together by second-generation sequencing. The polymorphic and repetitive nature of HLA genes can bias the read mapping process, usually underestimating variability at highly polymorphic segments or overestimating it at other segments. To overcome this issue, we developed hla-mapper, which applies a scoring system based on HLA sequences from the IPD-IMGT/HLA database and on unpublished HLA sequences. The scoring system evaluates each read pair and assigns it to the HLA gene from which it was most likely derived. Hla-mapper provides a reliable map of HLA sequences, allowing accurate downstream analyses such as variant calling, haplotype inference, and allele typing. Moreover, hla-mapper supports whole-genome, exome, and targeted sequencing data. To assess the software's performance against traditional mapping algorithms, we compared the results obtained with hla-mapper, BWA MEM, and Bowtie2 on three simulated datasets. Overall, hla-mapper performed better, mainly for the classical HLA class I genes, minimizing the incorrect mapping and cross-mapping typically observed when BWA MEM or Bowtie2 is used with a single reference genome.
KW - MHC
KW - HLA
KW - Next Generation Sequencing (NGS)
KW - Second Generation Sequencing
KW - Variability
KW - Polymorphisms
KW - Typing
KW - Aligners
KW - Mapping tool
U2 - 10.1016/j.humimm.2018.06.010
DO - 10.1016/j.humimm.2018.06.010
M3 - Article
SN - 0198-8859
VL - 79
SP - 678
EP - 684
JO - Human Immunology
JF - Human Immunology
IS - 9
ER -