Xpro: database of eukaryotic protein-encoding genes

Vivek Gopalan, Tin Wee Tan, Bernett T K Lee, Shoba Ranganathan*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

22 Citations (Scopus)


Xpro is a relational database that contains all the eukaryotic protein-encoding DNA sequences in GenBank, together with the associated data required for the analysis of eukaryotic gene architecture. In addition to the information found in the GenBank records, which includes properties such as the sequence, position, length and description of introns, exons and protein-coding regions, Xpro provides annotations on the splice sites and intron phases. Furthermore, Xpro validates intron positions using alignments between the record's sequence and EST sequences found in dbEST. Alternative splicing information obtained during this validation is also stored in the database. The intron-containing genes in Xpro are classified as experimental or predicted, based on the intron position validation and on specific keywords in the GenBank records that are characteristic of predicted genes. An Entrez-like query system, which is familiar to most biologists, is provided for accessing the information in the database. A non-redundant set of Xpro database contents is also obtained by cross-referencing to the Swiss-Prot/TrEMBL and Pfam databases. The database currently contains information for 493 983 genes: 351 918 intron-containing genes and 142 065 intron-less genes. Xpro is updated for each new GenBank release and is freely available via the internet at http://origin.bic.nus.edu.sg/xpro.
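The intron phase annotation mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard definition of intron phase (the number of coding bases preceding the intron, modulo 3) and is not taken from the Xpro implementation; the function name `intron_phases` and the input format (a list of coding-exon lengths in transcript order) are hypothetical.

```python
from typing import List


def intron_phases(cds_exon_lengths: List[int]) -> List[int]:
    """Return the phase of each intron in a gene.

    cds_exon_lengths: lengths (in bases) of the coding portion of each
    exon, in 5' to 3' order (hypothetical input format).

    Phase 0: the intron falls between two codons.
    Phase 1: the intron falls after the first base of a codon.
    Phase 2: the intron falls after the second base of a codon.
    """
    phases = []
    coding_bases = 0
    # An intron follows every exon except the last one.
    for length in cds_exon_lengths[:-1]:
        coding_bases += length
        phases.append(coding_bases % 3)
    return phases


if __name__ == "__main__":
    # Three coding exons of 100, 50 and 30 bases give two introns.
    print(intron_phases([100, 50, 30]))
```

A gene with a single coding exon has no introns, so the function returns an empty list for it.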

Original language: English
Pages (from-to): D59-D63
Number of pages: 5
Journal: Nucleic Acids Research
Issue number: Database issue
Publication status: Published - 1 Jan 2004
Externally published: Yes

