Posted on 2004-01-01, 00:00. Authored by S Mann, Yi-Ping Phoebe Chen, L Eaton
Identifying nucleic acid sub-sequences within larger background sequences is a fundamental need of the biology community, with applications in the search for homologous regions, in diagnostics and in many related activities. This paper details the approaches taken to sub-sequence identification using hidden Markov models and associated scoring optimisations. Techniques for locating conserved basal promoter elements bear directly on promoter identification, and thus on gene identification. The case study centres on the TATA box basal promoter element: the background is a gene sequence and the TATA box is the target. The research highlights generic algorithms for sub-sequence identification, and these generic processes can be transposed to any case study in which a target sequence must be identified. Work extending from this investigation has led to the development of a generic framework for the future application of hidden Markov models to computational biological sequence analysis.
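The idea of scoring a target against a background can be illustrated with a minimal sketch. It is not the method of the paper: it assumes a simplified, match-only chain of states (a hidden Markov model with no insert or delete states), a hypothetical consensus "TATAAA", invented emission probabilities and a uniform background. The names CONSENSUS, MATCH_P, score_window and best_hit are illustrative only.

```python
import math

# All values below are assumptions for illustration, not taken from the paper.
CONSENSUS = "TATAAA"        # hypothetical TATA-like target, one match state per base
MATCH_P = 0.85              # assumed probability that a state emits its consensus base
OTHER_P = (1 - MATCH_P) / 3 # remaining probability shared by the other three bases
BACKGROUND_P = 0.25         # assumed uniform background over A, C, G, T


def emission_log_odds(state, base):
    """Log2 odds of `base` under match state `state` versus the background."""
    p = MATCH_P if base == CONSENSUS[state] else OTHER_P
    return math.log2(p / BACKGROUND_P)


def score_window(window):
    """Sum the per-position log-odds for a window as long as the consensus."""
    return sum(emission_log_odds(i, b) for i, b in enumerate(window))


def best_hit(sequence):
    """Slide over the background sequence and return (start, score) of the best window.

    Returns (-1, -inf) when the sequence is shorter than the consensus.
    """
    k = len(CONSENSUS)
    best_pos, best_score = -1, float("-inf")
    for start in range(len(sequence) - k + 1):
        s = score_window(sequence[start:start + k])
        if s > best_score:  # strict comparison keeps the leftmost best window
            best_pos, best_score = start, s
    return best_pos, best_score


if __name__ == "__main__":
    pos, score = best_hit("GCGCGCTATAAAGGCGC")
    print(pos, round(score, 2))
```

A positive score means the window is better explained by the target model than by the background. A full hidden Markov model would add transition probabilities and typically use the Viterbi or forward algorithm in place of this fixed-length scan.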
Pagination
467 - 474
Location
Taichung, Taiwan
Open access
Yes
Start date
2004-05-19
End date
2004-05-21
ISBN-13
9780769521732
ISBN-10
0769521738
Language
eng
Publication classification
E1 Full written paper - refereed
Copyright notice
2004 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.