CAT.INIST

Titre du document / Document title

Résumé automatique de texte avec un algorithme d'ordonnancement = Automatic text summarization with a scheduling algorithm

Auteur(s) / Author(s)

USUNIER Nicolas (1) ; AMINI Massih-Reza (1) ; GALLINARI Patrick (1) ;

Affiliation(s) du ou des auteurs / Author(s) Affiliation(s)

(1) Laboratoire d'Informatique de Paris 6, 8 rue du Capitaine Scott, 75015 Paris, FRANCE

Résumé / Abstract

This paper investigates a new approach to automatic text summarization based on a Machine Learning (ML) ranking algorithm. The goal is to extract the subset of a document's sentences that best reflects its content. Previous ML approaches defined a set of features, used them to produce a vector of scores for each sentence in a given document, and trained a classifier to combine these scores globally. However, recent theoretical results suggest that the classification criterion may be suboptimal for learning scoring functions. We therefore propose to use ranking algorithms, which also combine the scores of different features but optimize a criterion that tends to reduce the relative misordering of sentences within a document. The features we use are either drawn from the state of the art or built upon word clusters. These clusters are groups of words that often co-occur, and they can serve to expand a query or to enrich the representation of the sentences of the documents. We show empirically that both the features used and the ranking algorithms outperform state-of-the-art approaches on two distinct datasets.
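
The ranking idea in the abstract can be illustrated with a minimal sketch. It assumes a linear scoring function over per-sentence feature scores and a pairwise logistic loss over (summary sentence, non-summary sentence) pairs taken from the same document. The loss, the function names (`train_pairwise_ranker`, `summarize`), the learning rate, and the toy feature values are all illustrative assumptions; this record does not say which ranking algorithm or features the authors actually used.

```python
import math


def score(w, x):
    """Linear combination of the feature scores of one sentence."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_pairwise_ranker(docs, n_features, lr=0.1, epochs=200):
    """Learn weights that order summary sentences above the others.

    docs: list of documents; each document is a list of
          (feature_vector, label) pairs, label 1 = belongs in the summary.
    Assumption: pairwise logistic loss, plain batch gradient descent.
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        grad = [0.0] * n_features
        n_pairs = 0
        for sentences in docs:
            pos = [x for x, y in sentences if y == 1]
            neg = [x for x, y in sentences if y == 0]
            # Pairs are formed only within a document, never across documents.
            for xp in pos:
                for xn in neg:
                    margin = score(w, xp) - score(w, xn)
                    # d/dmargin of log(1 + exp(-margin)); the margin is
                    # capped at 50 only to keep exp() from overflowing.
                    g = -1.0 / (1.0 + math.exp(min(margin, 50.0)))
                    for i in range(n_features):
                        grad[i] += g * (xp[i] - xn[i])
                    n_pairs += 1
        if n_pairs == 0:
            break
        w = [wi - lr * gi / n_pairs for wi, gi in zip(w, grad)]
    return w


def summarize(w, feature_vectors, k):
    """Return the indices of the k top-scoring sentences, in document order."""
    ranked = sorted(range(len(feature_vectors)),
                    key=lambda i: score(w, feature_vectors[i]),
                    reverse=True)
    return sorted(ranked[:k])


# Toy data (hypothetical): two features per sentence, two training documents.
train_docs = [
    [([0.9, 0.1], 1), ([0.2, 0.8], 0), ([0.1, 0.3], 0)],
    [([0.8, 0.2], 1), ([0.3, 0.9], 0)],
]
weights = train_pairwise_ranker(train_docs, n_features=2)
new_document = [[0.1, 0.9], [0.95, 0.05], [0.4, 0.4]]
summary_indices = summarize(weights, new_document, k=1)
```

Unlike a classifier, which is penalized for each sentence's predicted label, this loss depends only on score differences within a document, which is the "relative misordering" criterion the abstract refers to.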

Revue / Journal Title

Ingénierie des systèmes d'information   ISSN 1633-1311 

Source / Source

2006, vol. 11, no. 2 (107 p.) [Document: 21 p.] (1 p.1/4), pp. 71-91 [21 page(s) (article)]

Langue / Language

French

Editeur / Publisher

Lavoisier, Paris, FRANCE (2001) (Journal)

Mots-clés anglais / English Keywords

Learning algorithm ; Scheduling ; Sentence ; Hierarchical classification ; Abstract ; Database query ; Artificial intelligence ; Text ; Information system ;

Mots-clés français / French Keywords

Algorithme apprentissage ; Ordonnancement ; Phrase ; Classification hiérarchique ; Résumé ; Interrogation base donnée ; Intelligence artificielle ; Texte ; Système information ;

Mots-clés espagnols / Spanish Keywords

Algoritmo aprendizaje ; Reglamento ; Frase ; Clasificación jerarquizada ; Resumen ; Interrogación base datos ; Inteligencia artificial ; Texto ; Sistema información ;

Localisation / Location

INIST-CNRS, INIST shelf number: 26729, 35400014262316.0040

Refdoc record no. (ud4): 17782331
