Generation Of Extractive Summary Based On Document Semantics
In recent years, significant research contributions and progress have been observed in developing methods for machines to understand concepts within documents. For a machine, a document represents language-based information that consists of meaningful units known as data patterns or document units. These document units are the language’s verbs, adverbs, nouns, prepositions, etc. that contribute towards building the document. Current research in this field is not limited to picking keywords to understand document concepts; it aims to gain a precise understanding of the concepts by correlating words and extracting sentences to obtain summaries. This helps in retrieving meaningful information and reduces the effort of going through the whole document to get its main insight. In our application, we use the Latent Semantic Analysis (LSA) algorithm for text summarization. The algorithm is applied to the dataset and a matrix is generated. This matrix gives us the correlation of words within documents. LSA uses singular value decomposition (SVD) to capture the correlations latent within a document by modelling relationships among words and sentences within the text.
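As an illustration of the LSA step described above, the following is a minimal sketch and not the authors' implementation. It assumes a raw term-count matrix with one column per sentence, a naive regex sentence splitter, and a sentence score equal to the length of each sentence vector in the top-k singular-value-weighted topic space; the function names `split_sentences`, `term_sentence_matrix`, and `lsa_summarize`, and the parameters `n_sentences` and `n_topics`, are hypothetical choices made for this example.

```python
import re
import numpy as np


def split_sentences(text):
    # Naive splitter (an assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def term_sentence_matrix(sentences):
    # Rows are vocabulary terms, columns are sentences, entries are raw counts.
    tokenized = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    vocab = sorted({w for toks in tokenized for w in toks})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(sentences)))
    for j, toks in enumerate(tokenized):
        for w in toks:
            A[index[w], j] += 1.0
    return A, vocab


def lsa_summarize(text, n_sentences=2, n_topics=2):
    sentences = split_sentences(text)
    if len(sentences) <= n_sentences:
        return sentences
    A, vocab = term_sentence_matrix(sentences)
    if not vocab:
        # No alphabetic tokens to analyse; fall back to the leading sentences.
        return sentences[:n_sentences]
    # SVD: A = U @ diag(s) @ Vt; column j of Vt describes sentence j by topic.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(n_topics, len(s))
    # Score each sentence by the norm of its singular-value-weighted topic vector.
    scores = np.sqrt(((s[:k, None] * Vt[:k, :]) ** 2).sum(axis=0))
    # Keep the highest-scoring sentences, restored to document order.
    top = sorted(int(i) for i in np.argsort(-scores)[:n_sentences])
    return [sentences[i] for i in top]


if __name__ == "__main__":
    doc = ("Cats chase mice. Dogs chase cats. "
           "The weather is nice today. Mice fear cats and dogs.")
    for line in lsa_summarize(doc, n_sentences=2, n_topics=2):
        print(line)
```

Raw counts keep the sketch short; a weighted matrix (for example TF-IDF) and stop-word removal are common alternatives, and the number of retained topics would normally be tuned to the corpus.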