RefDoc

Titre du document / Document title

A formal language model for parsing SGML

Auteur(s) / Author(s)

MATZEN R. W. (1); GEORGE K. M. (1); HEDRICK G. E. (1)

Affiliation(s) du ou des auteurs / Author(s) Affiliation(s)

(1) Computer Science Department, Oklahoma State University, Stillwater, Oklahoma, USA

Résumé / Abstract

The Standard Generalized Markup Language (SGML) is an international standard for document definition (ISO 8879) that was adopted in 1986 and is rapidly gaining acceptance in industry and government. It is a meta-language system for document design rather than a specific scheme for document processing; almost any kind of document can be described using SGML. Productions called element declarations are used to define arbitrary elements of documents and the context in which they can occur. A finite set of element declarations called a document type definition (DTD) defines the high-level syntax of a set of documents. DTDs are similar to context-free grammars, but the productions are more complex. The standard does not describe a formal language model for SGML, and there is little work in the literature on this topic. This article defines a formal language model for SGML: systems of finite automata constructed from systems of regular expressions. This model is applied in two ways: a parser is constructed for DTDs, and methods are shown for automatically constructing parsers for the documents defined by a DTD. These methods for parsing SGML are new, and they include features of DTDs that have not previously been included in a static language model. The model applies directly to the syntactic constructs of SGML, and thus the methods shown in this article have distinct advantages for parsing SGML over traditional context-free parsing methods.
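
As a rough illustration of the idea in the abstract, the sketch below treats a toy DTD as a system of regular expressions, one per element declaration, and checks a document tree against it. It is not the authors' construction: the `DTD` contents, the `(name, children)` tree encoding, and the `is_valid` helper are all hypothetical, and Python's `re` matcher stands in for a finite automaton built from each expression.

```python
import re

# Hypothetical toy DTD (an assumption, not taken from the article).
# Each element name maps to a content model written as a regular expression
# over child element names, with every name followed by one space.
DTD = {
    "memo": r"to (cc )*body ",
    "to": r"",
    "cc": r"",
    "body": r"(para )+",
    "para": r"",
}

# One compiled matcher per element declaration. Python's `re` engine stands in
# here for the finite automaton that would be built from each expression.
AUTOMATA = {name: re.compile(model) for name, model in DTD.items()}


def is_valid(node):
    """Check a (name, children) tree against the toy DTD, top-down."""
    name, children = node
    if name not in AUTOMATA:
        return False
    # Encode the sequence of child element names as one string of tokens.
    child_names = "".join(child[0] + " " for child in children)
    if AUTOMATA[name].fullmatch(child_names) is None:
        return False
    # Each child is then checked against its own declaration.
    return all(is_valid(child) for child in children)


good = ("memo", [("to", []), ("cc", []), ("body", [("para", [])])])
missing_to = ("memo", [("body", [("para", [])])])
print(is_valid(good), is_valid(missing_to))
```

Each element is checked only against its own expression, and the expressions refer to one another by element name; that cross-referencing is what makes the set a system rather than a single regular expression.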

Revue / Journal Title

The Journal of systems and software; ISSN 0164-1212; CODEN JSSODM

Source / Source

1997, vol. 36, no. 2, pp. 147-166 (17 ref.)

Langue / Language

English

Editeur / Publisher

Elsevier, New York, NY, USA (1979) (Journal)

Mots-clés anglais / English Keywords

Software engineering; Programming language; Formal language; Finite automaton; Grammar; Information system; Document processing

Mots-clés français / French Keywords

Génie logiciel; Langage programmation; Langage formel; Automate fini; Grammaire; Système information; Traitement document; Langage SGML

Mots-clés espagnols / Spanish Keywords

Ingeniería logiciel; Lenguaje programación; Lenguaje formal; Autómata estado finito; Gramática; Sistema información; Tratamiento documento

Localisation / Location

INIST-CNRS, INIST shelf number: 18071, 35400006331566.0040

RefDoc record no. (ud4): 2575525


