Computer Sciences Journal | Summarization of Scientific Papers Through Extraction Technique

The IUP Journal of Computer Sciences

Summarization of Scientific Papers Through Extraction Technique

Article Details

Pub. Date	:	April, 2011
Product Name	:	The IUP Journal of Computer Sciences
Product Type	:	Article
Product Code	:	IJCS51104
Author Name	:	Shanmugasundaram Hariharan and Bhaskaran Raman
Availability	:	YES
Subject/Domain	:	Management
Download Format	:	PDF Format
No. of Pages	:	15

Price

For delivery in electronic format: Rs. 50; For delivery through courier (within India): Rs. 50 + Rs. 25 for Shipping & Handling Charges

Abstract

Summarization has long been a steady subject of research interest, and that interest is likely to grow as more information becomes available online. With the rapid growth in research output, summarizing electronic web documents (research papers) has gained considerable importance. Researchers and academicians regularly browse the literature to keep their knowledge up to date. In recent years, however, this practice has declined sharply because scientific papers are lengthy and difficult to read. Reluctance to read an entire paper leads to poor knowledge sharing and leaves readers unaware of current developments. Hence, an effective summary of each cited article is needed to address this issue. This paper focuses on summarizing cited research articles, thereby producing a ready reference for the user. Several approaches to scoring sentences using the extraction technique are also proposed.

Description

Text summarization is a technique in which a computer automatically creates an abstract or summary of one or more texts. Interest in automatic summarization began in the late 1950s and 1960s in American research libraries (Luhn, 1958; and Edmundson, 1969). As the amount of online information increases, more and more effort is dedicated to creating automatic summarization systems. Since automatic text summarization is largely a language-specific task, efficient algorithms need to be developed for it. A summary can be loosely defined as a text that is produced from one or more texts, conveys the important information in the original text, and is no longer than half of the original text. In other words, the main goal of a summary is to present the main ideas of a document in less space. Automatic text summarization is a multifaceted endeavor that typically branches out in several dimensions (Sparck Jones, 1999).
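
As a rough illustration of the idea, the following Python sketch scores sentences by the average frequency of their content words and keeps the highest-scoring ones, in the spirit of early frequency-based extraction (Luhn, 1958). It is not the scoring scheme proposed in this paper. The function names, the tiny stop-word list, and the max_ratio parameter are assumptions made for the example, and the half-length bound in the definition above is approximated here by sentence count rather than by text length.

```python
import re
from collections import Counter

# Hypothetical stop-word list, kept tiny for illustration only.
STOP_WORDS = {"a", "an", "and", "are", "as", "in", "is", "it",
              "of", "on", "that", "the", "to"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(sentence):
    """Lowercased alphabetic tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower())
            if w not in STOP_WORDS]


def summarize(text, max_ratio=0.5):
    """Return the top-scoring sentences in their original order.

    At most max_ratio of the sentences are kept (always at least one).
    """
    sentences = split_sentences(text)
    if not sentences:
        return []

    # Document-wide frequency of each content word.
    freq = Counter(w for s in sentences for w in content_words(s))

    def score(sentence):
        words = content_words(sentence)
        # Average frequency, so long sentences are not favored automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    keep = max(1, int(len(sentences) * max_ratio))
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    # Restore document order for readability.
    return [sentences[i] for i in sorted(ranked[:keep])]


if __name__ == "__main__":
    sample = ("Summaries save time. Cats sleep. "
              "Automatic summaries of papers save readers time. "
              "Dogs bark loudly.")
    for sentence in summarize(sample):
        print(sentence)
```

Returning the selected sentences in document order, rather than in score order, keeps the extract readable. A practical system would need a proper sentence splitter and a language-specific stop-word list.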

The Internet provides us with new perspectives, making the exchange of information not only easier than ever but also virtually unrestricted. A person who wishes to follow an event over the Internet surfs a number of news sites. Mostly, such a reader spends a lot of time reading different reports that carry the same information presented in different ways. Similarly, a researcher or academician tries to keep up to date by reading the literature published in different venues. However, it is not possible to read everything completely, as new scientific papers appear daily. Millions of documents are available on the web, in either new or repeated forms. It is very difficult for researchers to read an entire document line by line to extract the important points, since doing so is time-consuming and the material is hard to understand. The user has to spend a lot of time reading, which may result in errors such as missing important points. The user may even leave the entire document unread and wish for a more simplified version.

Keywords

Computer Sciences Journal, Scientific Papers, Electronic Web Documents, Automatic Summarization Systems, Text Summarization, Natural Language Processing, ExtraGen, Text Analysis Systems, Keyword-based Scoring, Monolingual Systems, Multilingual Scientific Summaries, Format-based Scoring.