Keywords
|
multi-document extractive summarization, optimization problem, harmony search algorithm, sentence expansion, conceptual density tuning, language-independent approach, word embedding, graph-based ranking, objective function learning, text clustering
|
Abstract
|
Automated extractive text summarization is now one of the most common techniques for organizing information. In extractive summarization, the most appropriate sentences are selected from the text and assembled into a representative summary, so finding the best sentences is the fundamental task.
This paper treats extractive summarization as a multi-objective optimization problem and proposes a language-independent, semantic-aware approach that applies the harmony search algorithm to generate multi-document summaries. The approach learns the objective function from an additional set of reference summaries and then generates the best summaries according to the trained function. The system also performs several supplementary steps to improve its results. It expands sentences using a novel approach that tunes their conceptual densities toward important topics. Furthermore, we introduce a new clustering method for identifying important topics and reducing redundancy. Finally, a sentence placement policy based on the Hamiltonian shortest path is introduced to produce readable summaries.
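To illustrate the general idea of selecting sentences with harmony search, the following is a minimal Python sketch, not the paper's implementation. Everything in it is an assumption made for illustration: the function names `harmony_search` and `toy_objective`, the parameter values, the toy word sets, and above all the hand-written objective (word coverage minus a redundancy penalty), which stands in for the objective function that the paper learns from reference summaries.

```python
import random


def harmony_search(objective, n_items, budget, memory_size=10,
                   hmcr=0.9, par=0.3, iterations=500, seed=0):
    """Pick `budget` item indices out of `n_items` that maximize `objective`.

    hmcr: harmony memory considering rate; par: pitch adjusting rate.
    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Harmony memory: a pool of candidate selections (sets of indices).
    memory = [frozenset(rng.sample(range(n_items), budget))
              for _ in range(memory_size)]
    for _ in range(iterations):
        chosen = set()
        while len(chosen) < budget:
            if rng.random() < hmcr:
                # Memory consideration: reuse an index from a stored harmony.
                candidate = rng.choice(sorted(rng.choice(memory)))
                if rng.random() < par:
                    # Pitch adjustment: move to a neighbouring index.
                    candidate = (candidate + rng.choice((-1, 1))) % n_items
            else:
                # Random consideration: explore a fresh index.
                candidate = rng.randrange(n_items)
            chosen.add(candidate)
        new_harmony = frozenset(chosen)
        worst = min(memory, key=objective)
        # Replace the worst stored harmony if the new one scores higher.
        if new_harmony not in memory and objective(new_harmony) > objective(worst):
            memory[memory.index(worst)] = new_harmony
    return max(memory, key=objective)


# Toy data (hypothetical): each sentence is reduced to a set of content words.
SENTENCE_WORDS = [
    {"summary", "extractive", "sentence"},
    {"summary", "extractive", "selection"},
    {"harmony", "search", "optimization"},
    {"cluster", "topic", "redundancy"},
    {"harmony", "search", "memory"},
    {"sentence", "order", "readability"},
]


def toy_objective(selection):
    """Hand-written stand-in objective: reward coverage, penalize repeats."""
    covered = set().union(*(SENTENCE_WORDS[i] for i in selection))
    total = sum(len(SENTENCE_WORDS[i]) for i in selection)
    return len(covered) - 0.5 * (total - len(covered))


best = harmony_search(toy_objective, n_items=len(SENTENCE_WORDS), budget=3)
print(sorted(best), toy_objective(best))
```

In this sketch a harmony is a fixed-size set of sentence indices, which keeps the summary length constant. A length limit in words, or a weighted combination of several objectives, would need a different encoding and scoring function.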
The experiments were conducted on the DUC2002, DUC2006, and DUC2007 datasets. The results show that the proposed framework assists the summarization process and yields better performance, and that it generally outperforms the other cited summarization systems.
|