Revistes Catalanes amb Accés Obert (RACO)

Making a dialect dictionary from a sentence based corpus

Yumi Nakajima


I describe a project to make a dialect dictionary (Tokunoshima, Amami archipelago, Japan) from a
sentence-based corpus, on which my colleagues Motoei Sawaki, Chistsuko Fukushima and I have been
working for the past 13 years. The Tokunoshima dialect is in a highly critical situation, so it is
urgent that we describe the local speech of the island, which has until now lacked a full-scale
dictionary. We have a well-trained informant, Takahiro Okamura, who worked with us to complete his
own translation of “Two Thousand Sentences of Japanese” by Shigeo Kawamoto into his Asama dialect.
The resulting “Two Thousand Sentences of Tokunoshima dialect” reflected local life so vividly that
we decided to make full use of the data as a digital dictionary of sentences with various search
functions, on which we have previously reported at the International Society for Dialectologists
and Geolinguists (SIDG) conferences. Here I sketch some episodes from our cooperative work with
Okamura, describing how we started the project and what problems we encountered.

Full text: PDF (English)