Perpustakaan Unika Atma Jaya

Detail

Some Techniques Used for Processing Bengali Corpus to Meet New Demands of Linguistics and Language Technology

By:

Dash, Niladri Sekhar

Type: Article from Journal
In collection: SKASE: Journal of Theoretical Linguistics vol. 4 no. 2 (Feb. 2007), pages 12-31.
Topics: frequency counts; concordance; collocation; key-word-in-context; corpus; local-word-grouping; lemmatization; parsing; language technology; applied linguistics; Bengali
Fulltext: Niladri Sekhar Dash.pdf (372.7KB)

Article content: The utility of a language corpus is drastically enhanced when it is properly processed in various ways for retrieving relevant linguistic information to be used in language description and analysis, as well as in various applications related to applied linguistics and language technology. Unfortunately, the text corpora developed for the Indian languages are not yet processed properly enough to make them useful for tasks related to both mainstream linguistics and natural language processing. Keeping this in mind, I present here in brief a few techniques of Bengali text corpus processing, which we use for various linguistic activities. These techniques, however, become far more complicated due to the orthographic, morphological, and lexicological complexities involved in the language.
