TY - JOUR
T1 - Comprehensive analysis of the N-glycan biosynthetic pathway using bioinformatics to generate UniCorn
T2 - A theoretical N-glycan structure database
AU - Akune, Yukie
AU - Lin, Chi Hung
AU - Abrahams, Jodie L.
AU - Zhang, Jingyu
AU - Packer, Nicolle H.
AU - Aoki-Kinoshita, Kiyoko F.
AU - Campbell, Matthew P.
PY - 2016/8/5
Y1 - 2016/8/5
N2 - Glycan structures attached to proteins are composed of diverse monosaccharide sequences and linkages that are produced from precursor nucleotide-sugars by a series of glycosyltransferases. Databases of these structures are an essential resource for the interpretation of analytical data and the development of bioinformatics tools. However, with no template to predict which structures are possible, human glycan structure databases are incomplete and rely heavily on the curation of published, experimentally determined glycan structure data. In this work, a library of 45 human glycosyltransferases was used to generate a theoretical database of N-glycan structures composed of 15 or fewer monosaccharide residues. Enzyme specificities were sourced from major online databases including Kyoto Encyclopedia of Genes and Genomes (KEGG) Glycan, Consortium for Functional Glycomics (CFG), Carbohydrate-Active enZymes (CAZy), GlycoGene DataBase (GGDB) and BRENDA. Based on the known activities, more than 1.1 million theoretical structures and 4.7 million synthetic reactions were generated and stored in our database called UniCorn. Furthermore, we analyzed the differences between the predicted glycan structures in UniCorn and those contained in UniCarbKB (www.unicarbkb.org), a database that stores experimentally described glycan structures reported in the literature, and demonstrated that UniCorn can be used to aid in the assignment of ambiguous structures whilst also serving as a discovery database.
AB - Glycan structures attached to proteins are composed of diverse monosaccharide sequences and linkages that are produced from precursor nucleotide-sugars by a series of glycosyltransferases. Databases of these structures are an essential resource for the interpretation of analytical data and the development of bioinformatics tools. However, with no template to predict which structures are possible, human glycan structure databases are incomplete and rely heavily on the curation of published, experimentally determined glycan structure data. In this work, a library of 45 human glycosyltransferases was used to generate a theoretical database of N-glycan structures composed of 15 or fewer monosaccharide residues. Enzyme specificities were sourced from major online databases including Kyoto Encyclopedia of Genes and Genomes (KEGG) Glycan, Consortium for Functional Glycomics (CFG), Carbohydrate-Active enZymes (CAZy), GlycoGene DataBase (GGDB) and BRENDA. Based on the known activities, more than 1.1 million theoretical structures and 4.7 million synthetic reactions were generated and stored in our database called UniCorn. Furthermore, we analyzed the differences between the predicted glycan structures in UniCorn and those contained in UniCarbKB (www.unicarbkb.org), a database that stores experimentally described glycan structures reported in the literature, and demonstrated that UniCorn can be used to aid in the assignment of ambiguous structures whilst also serving as a discovery database.
KW - Glycoinformatics
KW - N-glycan synthetic pathway
KW - Human glycosyltransferases
UR - http://www.scopus.com/inward/record.url?scp=84974727237&partnerID=8YFLogxK
U2 - 10.1016/j.carres.2016.05.012
DO - 10.1016/j.carres.2016.05.012
M3 - Article
C2 - 27318307
AN - SCOPUS:84974727237
SN - 0008-6215
VL - 431
SP - 56
EP - 63
JO - Carbohydrate Research
JF - Carbohydrate Research
ER -