Document Type : Articles


Department of Computer Engineering, Imam Khomeini International University, Qazvin, Iran


Persian has a concatenative morphology: new words are formed by chaining morphemes together. Whenever two morphemes combine, the syllables at the morpheme boundary may undergo structural change. This study suggests that these syllabic alterations may be captured using a finite-state approach. It further argues that syllabification may be incorporated into the process of lexicon building, so that when a lexicon is built with finite-state methods, the syllabification rules are encoded as part of the lexical knowledge. The rules captured here can also help process syllabic alterations at word boundaries, which is particularly useful for analyzing meter in Persian poetry.
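To illustrate the kind of boundary alteration described above, the following is a minimal sketch in plain Python, not the finite-state implementation the study proposes. It assumes that Persian syllables take the shapes CV, CVC, or CVCC with an obligatory single-consonant onset, and it uses an ad hoc ASCII romanization (with `A` standing for the long vowel ā). The function name `syllabify` and the vowel inventory are illustrative assumptions. The scan reads the input once with one symbol of lookahead, which is the kind of behavior a finite-state transducer could encode.

```python
# Assumed ad hoc romanization: short vowels a, e, o; long vowels A (for ā), i, u.
VOWELS = set("aeoAiu")


def syllabify(phonemes: str) -> list[str]:
    """Split a romanized string into syllables, assuming CV, CVC and CVCC shapes.

    Every syllable is assumed to begin with exactly one consonant, so a
    consonant that directly precedes a vowel always opens a new syllable.
    Vowel-initial input and vowel hiatus are not handled in this sketch.
    """
    syllables = []
    current = ""
    for i, ch in enumerate(phonemes):
        nxt = phonemes[i + 1] if i + 1 < len(phonemes) else ""
        # Close the running syllable when an onset consonant is reached.
        if ch not in VOWELS and nxt in VOWELS and current:
            syllables.append(current)
            current = ""
        current += ch
    if current:
        syllables.append(current)
    return syllables


if __name__ == "__main__":
    # The stem alone ends in a closed syllable.
    print(syllabify("ketAb"))         # ['ke', 'tAb']
    # Adding a vowel-initial suffix moves the final consonant into the onset
    # of the next syllable: the boundary syllable changes from CVC to CV.
    print(syllabify("ketAb" + "am"))  # ['ke', 'tA', 'bam']
    # A CVCC stem loses its second coda consonant to the following syllable.
    print(syllabify("dust" + "i"))    # ['dus', 'ti']
```

Syllabifying the concatenated string rather than each morpheme separately is what exposes the resyllabification: the coda of the stem-final syllable becomes the onset of the suffix syllable. The same scan applied across a word boundary would model the word-level alterations that matter for poetic meter, under the same assumptions.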
