Knowledge representation and information retrieval are important aspects of
any intelligent system. The efficiency of knowledge representation depends on the language
in which information is fed into the system and on how well that language supports
semantic extraction. The English language has been analyzed and processed by
researchers, and work on Indian languages is being carried out at research centers such as
IIIT (Hyderabad), CDAC, and JNU (Delhi). Tools such as semantic nets, conceptual
dependency, and frames are used for knowledge representation with English as the
input language. No such tools are available for other languages; they need to
be developed. One effort in this direction is the morphological analysis of Sanskrit with
respect to a linguistic model (Nilson, 2002). This research work uses the Sanskrit language
for knowledge representation because of its well-defined grammatical structure. It has also
been shown that the Panini Grammar Framework (PGF) can be used to develop a
suitable computational grammar for free word order languages and can be applied successfully
to Indian languages (Akshar and Rajeev, 1993). The relation of Panini grammar to
free word order languages and Context Free Grammar (CFG) has also been described in
work establishing the relation between PGF and Western computational
frameworks (Akshar et al., 1995). Sanskrit has been compared to set theory, with rules
and meta-rules in if-then-else form, which underlines how remarkable Panini's work in 500 BC
was (Narsingh Rao, 2005). In this work, the use of PGF for
knowledge representation is emphasized, and a method for extracting the suffix using
a transition network is described. The entire problem can be stated as follows:
Given a sentence S in Sanskrit, identify each word W, extract its suffix Sx, and, using case-ending analysis, identify the word's role in the sentence as agent, object,
recipient, instrument, etc. The vibhakti-karka mapping is used to identify the vibhakti and hence the karka, which gives the word's role as agent, object, etc. The outline of the algorithm is as follows:
Each of these suffixes in Sanskrit identifies the word not only as a noun,
adjective, etc., but also by its thematic role in the sentence. In the previous system (Smita and
Jyoti, 2007a), a complete search of the database was recommended, but the same result can
also be achieved through a transition network, thereby reducing the search time
and the need to maintain a complete database, as shown in the following section.
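
As a rough illustration of the idea, the Python sketch below walks a small transition network backward from the end of a word to find its case ending, and then looks up the role through a vibhakti-karka table. It is not the authors' implementation. The ending table is an assumption that covers only singular masculine a-stem nouns in Harvard-Kyoto transliteration, the mapping is the simplified default (active voice) correspondence, and the names `CASE_ENDINGS`, `VIBHAKTI_TO_KARKA`, `build_network`, `extract_suffix`, and `analyse` are hypothetical.

```python
# Toy table of case endings for singular masculine a-stem nouns, written in
# Harvard-Kyoto transliteration (A = long a, H = visarga). Illustrative only:
# a real analyser needs every declension class, number, and sandhi variant.
CASE_ENDINGS = {
    "aH": "prathama",
    "am": "dvitiya",
    "ena": "tritiya",
    "Aya": "chaturthi",
    "At": "panchami",
    "asya": "shashthi",
    "e": "saptami",
}

# Simplified default vibhakti-to-karka mapping (active voice assumed).
VIBHAKTI_TO_KARKA = {
    "prathama": "agent",
    "dvitiya": "object",
    "tritiya": "instrument",
    "chaturthi": "recipient",
    "panchami": "source",
    "shashthi": "relation (not a karka)",
    "saptami": "locus",
}


def build_network(endings):
    """Build a transition network that reads a word from its last character backward."""
    start = {"next": {}, "vibhakti": None}
    for ending, vibhakti in endings.items():
        state = start
        for ch in reversed(ending):
            state = state["next"].setdefault(ch, {"next": {}, "vibhakti": None})
        state["vibhakti"] = vibhakti  # mark this state as accepting
    return start


def extract_suffix(word, network):
    """Return (remainder, suffix, vibhakti) for the longest accepted ending, or None."""
    state = network
    best = None
    for i, ch in enumerate(reversed(word), start=1):
        state = state["next"].get(ch)
        if state is None:  # no transition on this character: stop the walk
            break
        if state["vibhakti"] is not None:
            best = (word[:-i], word[-i:], state["vibhakti"])
    return best


def analyse(word, network):
    """Map a word to its suffix, vibhakti, and role; None if no ending is recognised."""
    hit = extract_suffix(word, network)
    if hit is None:
        return None
    _, suffix, vibhakti = hit
    return {
        "word": word,
        "suffix": suffix,
        "vibhakti": vibhakti,
        "role": VIBHAKTI_TO_KARKA[vibhakti],
    }


NETWORK = build_network(CASE_ENDINGS)

for w in ["bAlaH", "bAlam", "bAlena", "bAlAya"]:
    print(analyse(w, NETWORK))
```

Because the walk stops at the first character with no outgoing transition, the work done per word in this sketch depends on the length of the ending rather than on the number of word forms stored, which is the kind of saving over a complete database search that the text describes.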