SusTEInability of Linguistic Resources Through Feature Structures
By: Witt, Andreas; Rehm, Georg; Hinrichs, Erhard; Lehmberg, Timms; Stegmann, Jens
Type: Article from Journal - e-Journal
In collection: Literary and Linguistic Computing vol. 24 no. 3 (Sep. 2009), page 363-372.
Fulltext: Vol 25, 3, p 363-372.pdf (212.07KB)
Article content
This article shows that the TEI tag set for feature structures can be adopted to represent a heterogeneous set of linguistic corpora. Most corpora are annotated using markup languages based on the Annotation Graph framework or the upcoming Linguistic Annotation Format ISO standard, or using tag sets defined by or based upon the TEI guidelines. A unified representation separates the conceptually different annotation layers contained in the original corpus data (e.g. syntax, phonology, and semantics) into multiple XML files. These annotation layers are linked to each other implicitly by the identical textual content of all files. A suitable data structure for representing these annotations is a multi-rooted tree, which in turn can be represented by the TEI and ISO tag set for feature structures. The article discusses the mapping process and representational issues, as well as the advantages and drawbacks of using the TEI tag set for feature structures as a storage and exchange format for linguistically annotated data.
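The idea of layers linked only by their shared text can be illustrated with a minimal Python sketch. It is not the article's own mapping procedure: the two sample layers, the helper names (`primary_text`, `spans`, `to_fs`), and the choice to store offsets and attributes as `<string>` values are all assumptions made for illustration. Only the element names `<fs>`, `<f>`, and `<string>` are taken from the TEI feature structure tag set.

```python
import xml.etree.ElementTree as ET

# Two hypothetical annotation layers over the same primary text.
# Each layer would normally live in its own XML file.
SYNTAX_LAYER = "<s><np>The dog</np> <vp>barks</vp></s>"
MORPH_LAYER = (
    "<text><w pos='DET'>The</w> <w pos='N'>dog</w> "
    "<w pos='V'>barks</w></text>"
)


def primary_text(root):
    """Return the concatenated character data of one layer."""
    return "".join(root.itertext())


def spans(root):
    """Return (tag, start, end, attributes) for every element.

    Offsets are character positions in the primary text, so spans
    from different layers can be compared without explicit links.
    """
    result = []

    def walk(elem, offset):
        start = offset
        offset += len(elem.text or "")
        for child in elem:
            offset = walk(child, offset)
            offset += len(child.tail or "")
        result.append((elem.tag, start, offset, dict(elem.attrib)))
        return offset

    walk(root, 0)
    # Outer elements first when two spans start at the same offset.
    return sorted(result, key=lambda s: (s[1], -s[2]))


def to_fs(tag, start, end, attrs):
    """Encode one span as a TEI-style <fs> element (assumed layout)."""
    fs = ET.Element("fs", {"type": tag})
    features = [("start", str(start)), ("end", str(end)), *attrs.items()]
    for name, value in features:
        f = ET.SubElement(fs, "f", {"name": name})
        ET.SubElement(f, "string").text = value
    return fs


if __name__ == "__main__":
    syntax = ET.fromstring(SYNTAX_LAYER)
    morph = ET.fromstring(MORPH_LAYER)
    # The layers are only comparable if their text is identical.
    assert primary_text(syntax) == primary_text(morph)
    for span in spans(syntax) + spans(morph):
        print(ET.tostring(to_fs(*span), encoding="unicode"))
```

Because both layers yield the same character string, every span from either layer can be placed on one shared offset axis, which is what allows several independent trees to be treated as a single multi-rooted structure.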