Book Chapter

Sharing data in small and endangered languages: Cataloging and metadata, formats, and encodings

N Thieberger, M Jacobson

Language Documentation: Practice and values | Published: 2010


Speakers of small or 'under-resourced' languages often first contact the world of Information Technology via the effort of field linguists. Good practices in linguistic data management include the separation of structure and content and of data and metadata formats. Primary outputs of field research (lexicon, transcripts and interlinear glossed text collections, and their associated media) need to be coded and preserved. Long-term access to these data is addressed by the establishment of archives that also act as the locus for training and advocacy for well-formed data. In this paper we discuss two such archives, one in Australia, the Pacific and Regional Archive for Digital Sources in Endan..

