TY - CHAP
T1 - Improving neural network promoter prediction by exploiting the lengths of coding and non-coding sequences
AU - Caldwell, Rachel
AU - Dai, Yun
AU - Srivastava, Sheenal
AU - Lin, Yan Xia
AU - Zhang, Ren
PY - 2008
Y1 - 2008
N2 - Since the release of the first draft of the entire human DNA sequence in 2001, researchers have been inspired to continue with sequencing many other organisms. This has led to the creation of comprehensive sequencing libraries which are available for intensive study. The most recent genome to be sequenced has been the gray, short-tailed opossum (Monodelphis domestica). This metatherian ("marsupial") species, which is the first of its type to be sequenced, may not only offer researchers insight into the evolution of mammalian genomes with respect to architecture and functional organization, but may also offer an understanding of the human genome [12]. Much attention within computational biology research has focused on identifying gene products and locations from experimentally obtained DNA sequences. Predicting promoter sequences and the positions of transcription start sites can facilitate the process of gene finding in DNA sequences. This can be more beneficial if the organisms of interest are higher eukaryotes, where the coding regions of the genes are situated in an expanse of non-coding DNA. With the genomes of numerous organisms now completely sequenced, there is a potential to gain invaluable biological information from these sequences. Computational prediction of promoters from the nucleotide sequences is one of the most attractive topics in sequence analysis today. Current promoter prediction algorithms employ several gene features for prediction. These attributes include homology with known promoters, the presence of particular motifs within the sequence, DNA structural characteristics and the relative signatures of different regions in the sequence.
AB - Since the release of the first draft of the entire human DNA sequence in 2001, researchers have been inspired to continue with sequencing many other organisms. This has led to the creation of comprehensive sequencing libraries which are available for intensive study. The most recent genome to be sequenced has been the gray, short-tailed opossum (Monodelphis domestica). This metatherian ("marsupial") species, which is the first of its type to be sequenced, may not only offer researchers insight into the evolution of mammalian genomes with respect to architecture and functional organization, but may also offer an understanding of the human genome [12]. Much attention within computational biology research has focused on identifying gene products and locations from experimentally obtained DNA sequences. Predicting promoter sequences and the positions of transcription start sites can facilitate the process of gene finding in DNA sequences. This can be more beneficial if the organisms of interest are higher eukaryotes, where the coding regions of the genes are situated in an expanse of non-coding DNA. With the genomes of numerous organisms now completely sequenced, there is a potential to gain invaluable biological information from these sequences. Computational prediction of promoters from the nucleotide sequences is one of the most attractive topics in sequence analysis today. Current promoter prediction algorithms employ several gene features for prediction. These attributes include homology with known promoters, the presence of particular motifs within the sequence, DNA structural characteristics and the relative signatures of different regions in the sequence.
UR - http://www.scopus.com/inward/record.url?scp=44649168257&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-78297-1_10
DO - 10.1007/978-3-540-78297-1_10
M3 - Chapter
SN - 9783540782964
VL - 116
T3 - Studies in Computational Intelligence
SP - 213
EP - 230
BT - Advances of Computational Intelligence in Industrial Systems
ER -