SOURCE MODELS FOR NATURAL LANGUAGE

dc.contributor.author [eng]: Bell, Timothy C.
dc.contributor.author [eng]: Witten, Ian H.
dc.date.accessioned: 2008-02-27T22:28:19Z
dc.date.available: 2008-02-27T22:28:19Z
dc.date.computerscience [eng]: 1999-05-27
dc.date.issued [eng]: 1988-10-01
dc.description.abstract [eng]: A model of natural language is a collection of information that approximates the statistics and structure of the language being modeled. The purpose of the model may be to give insight into rules which govern how text is generated, or to predict properties of future samples of the language. This paper studies models of natural language from three different, but related, viewpoints. First, we examine the statistical regularities that are found empirically, based on the natural units of words and letters. Second, we study theoretical models of language, including simple random generative models of letters and words whose output, like genuine natural language, obeys Zipf's law. Innovation in text is also considered by modeling the appearance of previously unseen words as a Poisson process. Finally, we review experiments that estimate the information content inherent in natural text.
dc.description.notes [eng]: We are currently acquiring citations for the work deposited into this collection. We recognize the distribution rights of this item may have been assigned to another entity, other than the author(s) of the work. If you can provide the citation for this work or you think you own the distribution rights to this work please contact the Institutional Repository Administrator at digitize@ucalgary.ca
dc.identifier.department [eng]: 1988-326-38
dc.identifier.doi: http://dx.doi.org/10.11575/PRISM/31193
dc.identifier.uri: http://hdl.handle.net/1880/46172
dc.language.iso [eng]: Eng
dc.publisher.corporate [eng]: University of Calgary
dc.publisher.faculty [eng]: Science
dc.subject [eng]: Computer Science
dc.title [eng]: SOURCE MODELS FOR NATURAL LANGUAGE
dc.type: unknown
thesis.degree.discipline [eng]: Computer Science
Files
Original bundle
Name:
1988-326-38.pdf
Size:
5.26 MB
Format:
Adobe Portable Document Format