Conference Proceedings

Word classes in Indonesian: A linguistic reality or a convenient fallacy in natural language processing?

M Mistica, T Baldwin, IW Arka

PACLIC 25 - Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation | Published: 2011


This paper looks at Indonesian (Bahasa Indonesia) and the claim that there is no noun-verb distinction within the language as it is spoken in regions such as Riau and Jakarta. We test this claim for the language as it is written by a variety of Indonesian speakers, using empirical methods traditionally used in part-of-speech induction. In this study, we use only morphological patterns that we generate from a pre-existing morphological analyser. We find that once the distribution of the data points in our experiments matches the distribution of the text from which we gather our data, we obtain significant results that show a distinction between the class of nouns and the class of verbs in Indone..
