Charles Muller, Toyo Gakuen University
During the past five years, we have witnessed the beginnings of what will eventually turn into sweeping and dramatic changes in the way that students of East Asian literature, history, philosophy, art, and religion do their research. The availability of computers that can store, search, and display digital texts is bringing with it transformations that will be far greater than those of previous media-paradigm shifts, such as the invention of the printing press.
We can now place our objects of research on our screen for analysis, and when we want to find the meaning of a word or phrase, we can search for it instantaneously in a digital dictionary. We can also quickly determine its location in a textual corpus, and examine its original context. These kinds of capabilities are especially valuable for those, such as myself, who are translating pre-modern and classical texts. Once a text is on the computer screen, the rapidity and ease with which we can find relevant information is such that old-fashioned methods of using paper reference works cannot but rapidly fall by the wayside.
All of this, however, assumes optimum conditions. In fact, despite the valiant efforts of a number of individuals and organizations, only a small portion of the literary corpora for the various areas of East Asian cultural studies, and especially pre-modern cultural studies, is actually available in digital format. At present, an important segment of the Chinese literary corpus is available from Academia Sinica[1]; the Tripitaka Koreana (Seoul)[2], CBETA (Taipei) and SAT (Tokyo)[3] are rapidly making available various versions of the Chinese Buddhist canon. But when one considers the totality of religious, philosophical, historical, and literary texts contained in the entire classical East Asian corpus, what has been properly digitized to date is only a minuscule portion, a mere beginning. Thus, when I want to work with a text in digital format and no digital version yet exists, I am faced with the dilemma of either going back to the "old way" of working with a hard copy on my desk, or going ahead and digitizing the text myself, either by OCR or by direct typed input.
I recently came across a perfect example of this case when I decided to
translate a late Koryŏ Neo-Confucian anti-Buddhist polemical work by Chŏng
Tojŏn entitled Pulssi chappyŏn (Arguments Against the Buddhists).
After searching around a bit and finding out that a digital version did not
exist, I found myself faced with the option of either going back to the old way
of translation (thereby cutting myself off from the possibility of
using the dictionaries that I have labored to develop, and all the other digital
tools presently available) or inputting the text myself. Since the version I
have was written in a late Koryŏ script style, I could not scan and OCR it,
and so I was faced with the task of typing in 15 pages of Chinese from scratch.
Two years ago, I would probably not have attempted the task. Common characters are easily entered by pronunciation, but many older and more difficult characters, whose pronunciation I either do not know or know but have not registered in my IME user dictionary, would simply take too much time to enter--enough to make the task untenable. Even worse would be the job of searching for extremely rare characters, which might not be contained in Unicode, or perhaps not even in Morohashi.
However, I did input the entire text in about two weeks, with only a
single unidentified character left over. How was this possible? Because of Konjaku
Mojikyo.
Using the Mojikyo, I was able to look up the difficult characters quickly, by
borrowing components from characters for which I already knew the pronunciation.
The Mojikyo lookup applet allows one to paste in known characters, and then
dissect them into their components--gradually, or all at once. These components
can then be used as the basis for a new search, after which the newly located
character can be pasted into the text. And since Mojikyo keeps a record of
recently searched characters, I don't need to do another search if I come across
the same difficult character two hours later. I can just go to the "Search
History" window, and retrieve the character again.
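The lookup procedure just described can be sketched in a few lines of Python. This is only an illustration of the general idea of component-based search, not Mojikyo's actual data or interface: the small decomposition table, the function names, and the history list are all assumptions made for the example.

```python
# Toy decomposition table: each character maps to the components it contains.
# Hypothetical sample data for illustration only; not taken from Mojikyo.
DECOMPOSITION = {
    "明": {"日", "月"},
    "時": {"日", "寺"},
    "持": {"扌", "寺"},
    "詩": {"言", "寺"},
    "語": {"言", "吾"},
}


def components_of(chars):
    """Dissect known characters and return the union of their components."""
    parts = set()
    for ch in chars:
        parts |= DECOMPOSITION.get(ch, set())
    return parts


def find_by_components(wanted):
    """Return every character whose decomposition contains all wanted components."""
    wanted = set(wanted)
    return sorted(ch for ch, parts in DECOMPOSITION.items() if wanted <= parts)


# Most recently found characters first, so a repeat lookup is not needed.
history = []


def remember(ch):
    """Move a character to the front of the search history, without duplicates."""
    if ch in history:
        history.remove(ch)
    history.insert(0, ch)


if __name__ == "__main__":
    # Suppose the readings of 時 and 語 are known, but not that of the target.
    borrowed = components_of("時語")
    print("components available:", sorted(borrowed))
    # Pick the two components seen in the source text and search with them.
    hits = find_by_components({"言", "寺"})
    for ch in hits:
        remember(ch)
    print("candidates:", hits)
    print("search history:", history)
```

A real system would work from a far larger decomposition table and would let the user dissect components step by step, but the principle of borrowing parts from characters one can already type is the same.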
The most critical advantage for me in using Mojikyo for this particular job, however, was in locating extremely rare characters not contained in any of the lexicons available to me. By plugging in the components that I could determine from the source text, Mojikyo could offer a broad selection of graphically related characters, from among which I could almost invariably find the character that I was looking for. In most cases it turned out that the character I was looking for was, after all, not that unusual--it was just an idiosyncratic variant of a rather well-known character. Furthermore, since Mojikyo comes with a complete TrueType font set, I can actually display these variants in my text, and will be able to print them out later, when I publish my translation.
We have also been using Mojikyo extensively in our work of digitizing the indexes of East Asian Buddhist reference works[4], in basically the same fashion--using Mojikyo to locate difficult characters. An added facet of Mojikyo that has been helpful is the attitude of eager cooperation demonstrated by the Mojikyo team. During the process of digitization of a few major dictionary indexes, we came across fifty or so characters that were not yet included in the Mojikyo font set. When we forwarded the graphic and phonetic information on these characters to Mojikyo, they quickly incorporated these characters into their system, making new fonts. The Mojikyo team has been assisting the CBETA and SAT projects in the same way.
In these ways then, the Mojikyo project is providing an invaluable service to the scholarly community, with an attitude of open sharing and cooperation. They deserve our fullest support.