Jong-Bok Kim (Project Leader), School of English, Kyung Hee University
Jaehyung Yang, School of Computer Engineering, Kangnam University
Key-Sun Choi, Department of Computer Science, KAIST
Jae-Woong Choe, Department of Linguistics, Korea University
Incheol Choi, Kyung Hee Language Institute, Kyung Hee University
Yasunari Harada (Project Leader), School of Law, Waseda University
Michiko Nakano, School of Education, Waseda University
Junichi Tsujii, Department of Computer Science, Faculty of Information Science and Technology, University of Tokyo
Hiroshi Masuichi, Corporate Research Group, Fuji Xerox Co., Ltd.
Tomoko Ohkuma, Corporate Research Group, Fuji Xerox Co., Ltd.
Francis Bond, Machine Translation Research Group, NTT
A rapid and continued increase in trade and traffic among East Asian countries, including Korea and Japan, is expected as part of the healthy economic growth of the region. The information infrastructure that supports this increased interaction among people of different languages and cultures should incorporate intelligent servers and agents that reduce the language barriers among participants in such cross-cultural exchange. In Korea and Japan, efforts to build systems for machine translation of text, speech-to-speech translation, and bilingual and multilingual search, together with the technology and linguistic resources that enable those applications, have mostly concentrated on English-Korean and English-Japanese pairs, while efforts to apply such technology to Korean-Japanese / Japanese-Korean systems have been relatively marginal, if not non-existent.
In the development of bilingual machine translation systems, current research trends emphasize statistical approaches based on bilingual corpora. However, no very large-scale Korean-Japanese bilingual corpus is available, in electronic form or otherwise. An alternative approach is to employ a common grammatical framework and language-processing engine, develop grammars and lexicons for both Korean and Japanese, and develop a mediating system between them. Researchers at Stanford University and affiliated institutions in Europe and Japan have employed the HPSG (Head-Driven Phrase Structure Grammar) formalism and the LKB (Linguistic Knowledge Builder) framework in developing a translation system among English, German, French and Japanese. Similarly, researchers at PARC (formerly the Xerox Palo Alto Research Center) and affiliated institutions have been developing grammars for English, German, French, Norwegian, Japanese and Urdu based on the LFG (Lexical Functional Grammar) formalism and XLE (Xerox Linguistic Environment).
In this project, we aim to bring together the experience and expertise of Korean researchers in developing a Korean grammar for HPSG/LKB and of Japanese researchers in developing a Japanese grammar for LFG/XLE, and to examine the feasibility of rapidly building an HPSG/LKB Japanese grammar based on the Korean grammar and an LFG/XLE Korean grammar based on the Japanese grammar. By the end of the proposed project period, we will have learned to what extent such an approach is feasible and effective, and what additional research and effort are necessary to actually build the kind of bilingual support system described above.
We hope that successful completion of this project will bring the following results:
- Better and more precise grammars of Korean and Japanese with wider coverage of actually occurring sentences. Although the existing Korean HPSG/LKB grammar and Japanese LFG/XLE grammar already achieve many of the desired goals, a mutual exchange of ideas and theoretical insights is expected to improve both grammars substantially.
- A theoretically grounded understanding of how similar Korean and Japanese are and where they differ. Theoretical and applied linguistic research has suggested that Korean and Japanese are much more similar in their syntax than a comparison of their vocabularies would lead one to expect. This intuition is widely shared among theoretical linguists, but it has not been fully exploited in applications such as NLP grammar development. Such grounding would boost research on computational implementations of Korean and Japanese grammars.
- One or two pairs of Korean and Japanese grammars within the same formalism and processing system, sharing semantic representations. Such a pair would make it possible to design and implement bilingual machine translation systems, dialog systems, query-response systems, etc. without much additional linguistic work.
- Closer cooperation among researchers in the field in the two countries.