TY - JOUR
T1 - Refined Pichia pastoris reference genome sequence
AU - Sturmberger, Lukas
AU - Chappell, Thomas
AU - Geier, Martina
AU - Krainer, Florian
AU - Day, Kasey J.
AU - Vide, Ursa
AU - Trstenjak, Sara
AU - Schiefer, Anja
AU - Richardson, Toby
AU - Soriaga, Leah
AU - Darnhofer, Barbara
AU - Birner-Gruenberger, Ruth
AU - Glick, Benjamin S.
AU - Tolstorukov, Ilya
AU - Cregg, James
AU - Madden, Knut
AU - Glieder, Anton
N1 - Copyright © 2016. Published by Elsevier B.V.
PY - 2016/10/10
Y1 - 2016/10/10
AB - Strains of the species Komagataella phaffii are the most frequently used “Pichia pastoris” strains employed for recombinant protein production as well as studies on peroxisome biogenesis, autophagy and secretory pathway analyses. Genome sequencing of several different P. pastoris strains has provided the foundation for understanding these cellular functions in recent genomics, transcriptomics and proteomics experiments. This experimentation has identified mistakes, gaps and incorrectly annotated open reading frames in the previously published draft genome sequences. Here, a refined reference genome is presented, generated with genome and transcriptome sequencing data from multiple P. pastoris strains. Twelve major sequence gaps from 20 to 6000 base pairs were closed and 5111 out of 5256 putative open reading frames were manually curated and confirmed by RNA-seq and published LC–MS/MS data, including the addition of new open reading frames (ORFs) and a reduction in the number of spliced genes from 797 to 571. One chromosomal fragment of 76 kbp between two previous gaps on chromosome 1 and another 134 kbp fragment at the end of chromosome 4, as well as several shorter fragments needed re-orientation. In total more than 500 positions in the genome have been corrected. This reference genome is presented with new chromosomal numbering, positioning ribosomal repeats at the distal ends of the four chromosomes, and includes predicted chromosomal centromeres as well as the sequence of two linear cytoplasmic plasmids of 13.1 and 9.5 kbp found in some strains of P. pastoris.
KW - Journal Article
KW - Review
UR - http://www.scopus.com/inward/record.url?scp=84969921373&partnerID=8YFLogxK
U2 - 10.1016/j.jbiotec.2016.04.023
DO - 10.1016/j.jbiotec.2016.04.023
M3 - Review article
C2 - 27084056
SN - 0168-1656
VL - 235
SP - 121
EP - 131
JO - Journal of Biotechnology
JF - Journal of Biotechnology
ER -