Project author: akulisek

Project description:
Simple parser for a modified basicDTD LL(1) grammar, written in Python
Language: Python
Repository: git://github.com/akulisek/basic_DTD_parser.git
Created: 2016-11-11T13:43:35Z
Project page: https://github.com/akulisek/basic_DTD_parser

basic_DTD_parser

BNF (Backus-Naur Form) of basicDTD:

  1. dtddocument ::= declaration {declaration} .
  2. declaration ::= attrdecl | elemdecl .
  3. elemdecl ::= '<!ELEMENT' name ('EMPTY' | 'ANY' | '(#PCDATA)' | elemchild) '>' .
  4. elemchild ::= '(' (choice | seq)['?' | '*' | '+'] ')' .
  5. choice ::= '(' cp ['|' cp] ')' .
  6. seq ::= '(' cp {',' cp} ')' .
  7. cp ::= (name | choice | seq) ['?' | '*' | '+'] .
  8. attrdecl ::= '<!ATTLIST' name {name attrtype defaultdecl} '>' .
  9. attrtype ::= 'CDATA' | 'NMTOKEN' | 'IDREF' | '(' word ['|' word] ')' .
  10. defaultdecl ::= '#REQUIRED' | '#IMPLIED' | (['#FIXED'] '"' word {word} '"' ) .
  11. name ::= (letter | '_' | ':') {namechar} .
  12. namechar ::= letter | digit | '.' | '-' | '_' | ':' .
  13. letter ::= 'A' | .. | 'Z' | 'a' | .. | 'z' .
  14. number ::= digit {digit} .
  15. digit ::= '0' | .. | '9' .
  16. word ::= char {char} .
  17. char ::= letter | digit | '%' | '&' | '^' .
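
The lexical rules at the end of the list (name, namechar, word, char, number) are regular, so they can be checked with regular expressions. The following is a minimal sketch of that idea, not code from the project; the helper names `is_name`, `is_word` and `is_number` are hypothetical.

```python
import re

# name ::= (letter | '_' | ':') {namechar}
# namechar ::= letter | digit | '.' | '-' | '_' | ':'
NAME_RE = re.compile(r"[A-Za-z_:][A-Za-z0-9._:-]*")

# word ::= char {char}, char ::= letter | digit | '%' | '&' | '^'
WORD_RE = re.compile(r"[A-Za-z0-9%&^]+")

# number ::= digit {digit}
NUMBER_RE = re.compile(r"[0-9]+")


def is_name(text):
    """Return True if the whole string matches the name rule."""
    return NAME_RE.fullmatch(text) is not None


def is_word(text):
    """Return True if the whole string matches the word rule."""
    return WORD_RE.fullmatch(text) is not None


def is_number(text):
    """Return True if the whole string matches the number rule."""
    return NUMBER_RE.fullmatch(text) is not None


print(is_name("xml:lang"))  # True: starts with a letter, ':' is a namechar
print(is_name("9lives"))    # False: a name may not start with a digit
print(is_word("50%"))       # True: digits and '%' are valid chars
print(is_word("a-b"))       # False: '-' is a namechar, not a char
```

Only ASCII letters are accepted here, because the letter rule lists 'A'..'Z' and 'a'..'z' explicitly.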

The BNF grammar transformed into an LL(1) grammar (written as production rules):

  1. DTDOC -> '<' DECLARATION L
  2. L -> DTDOC | ε
  3. DECLARATION -> ATTLIST WORD Z '>' | ELEMENT WORD X '>'
  4. X -> EMPTY | ANY | PCDATA | '(' F ')'
  5. F -> '(' CP K Y
  6. Y -> '?' | '*' | '+' | ε
  7. H -> ',' CP H | ε
  8. K -> ')' | '|' CP ')' | ',' CP H ')'
  9. CP -> WORD Y | '(' CP K Y
  10. Z -> WORD ATTRTYPE DEFAULTDECL Z | ε
  11. ATTRTYPE -> CDATA | NMTOKEN | IDREF | '(' WORD E ')'
  12. E -> '|' WORD | ε
  13. DEFAULTDECL -> REQUIRED | IMPLIED | J '"' WORD B '"'
  14. J -> FIXED | ε
  15. B -> WORD B | ε
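
Because every alternative in these rules can be chosen by looking at one token, the grammar maps directly onto a recursive-descent recognizer. The sketch below is one way to do that and is not taken from the project's parser.py, which may well be table-driven instead. It assumes the input is already a list of terminal names as they appear in the rules (for example `'<'`, `'ELEMENT'`, `'WORD'`); the names `Parser`, `ParseError` and `accepts` are hypothetical. The helper rules L, H, E, J and B are folded into their callers as `if` and `while` statements.

```python
class ParseError(Exception):
    """Raised when the token stream does not match the grammar."""


class Parser:
    """Recursive-descent recognizer using one token of lookahead."""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" marks end of input
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def eat(self, *terminals):
        # Consume each expected terminal in order, or fail.
        for terminal in terminals:
            if self.peek() != terminal:
                raise ParseError(
                    f"expected {terminal!r}, found {self.peek()!r} at token {self.pos}")
            self.pos += 1

    def dtdoc(self):  # DTDOC -> '<' DECLARATION L ; L -> DTDOC | ε
        self.eat("<")
        self.declaration()
        if self.peek() == "<":
            self.dtdoc()

    def declaration(self):  # ATTLIST WORD Z '>' | ELEMENT WORD X '>'
        if self.peek() == "ATTLIST":
            self.eat("ATTLIST", "WORD")
            self.z()
        else:
            self.eat("ELEMENT", "WORD")
            self.x()
        self.eat(">")

    def x(self):  # X -> EMPTY | ANY | PCDATA | '(' F ')'
        if self.peek() in ("EMPTY", "ANY", "PCDATA"):
            self.pos += 1
        else:
            self.eat("(")
            self.group()
            self.eat(")")

    def group(self):  # F -> '(' CP K Y, also the second alternative of CP
        self.eat("(")
        self.cp()
        self.k()
        self.y()

    def y(self):  # Y -> '?' | '*' | '+' | ε
        if self.peek() in ("?", "*", "+"):
            self.pos += 1

    def k(self):  # K -> ')' | '|' CP ')' | ',' CP H ')' ; H -> ',' CP H | ε
        if self.peek() == "|":
            self.eat("|")
            self.cp()
        else:
            while self.peek() == ",":
                self.eat(",")
                self.cp()
        self.eat(")")

    def cp(self):  # CP -> WORD Y | '(' CP K Y
        if self.peek() == "WORD":
            self.eat("WORD")
            self.y()
        else:
            self.group()

    def z(self):  # Z -> WORD ATTRTYPE DEFAULTDECL Z | ε
        while self.peek() == "WORD":
            self.eat("WORD")
            self.attrtype()
            self.defaultdecl()

    def attrtype(self):  # CDATA | NMTOKEN | IDREF | '(' WORD E ')'
        if self.peek() in ("CDATA", "NMTOKEN", "IDREF"):
            self.pos += 1
        else:
            self.eat("(", "WORD")
            if self.peek() == "|":  # E -> '|' WORD | ε
                self.eat("|", "WORD")
            self.eat(")")

    def defaultdecl(self):  # REQUIRED | IMPLIED | J '"' WORD B '"'
        if self.peek() in ("REQUIRED", "IMPLIED"):
            self.pos += 1
            return
        if self.peek() == "FIXED":  # J -> FIXED | ε
            self.eat("FIXED")
        self.eat('"', "WORD")
        while self.peek() == "WORD":  # B -> WORD B | ε
            self.eat("WORD")
        self.eat('"')


def accepts(tokens):
    """Return True if the token list is a sentence of the grammar."""
    parser = Parser(tokens)
    try:
        parser.dtdoc()
        parser.eat("$")
    except ParseError:
        return False
    return True
```

For example, the token list for `<!ELEMENT integer ((bool,string,float)*)>`, which is `['<', 'ELEMENT', 'WORD', '(', '(', 'WORD', ',', 'WORD', ',', 'WORD', ')', '*', ')', '>']`, is accepted. Under these rules a content model with a single pair of parentheses, such as `(bool)`, is rejected, because X requires an outer `'('` and F requires an inner one.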

Note that the BNF grammar had to be modified slightly to make it LL(1); for example, the name and word rules were merged into a single WORD terminal.
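
To illustrate the merged WORD terminal, here is a minimal tokenizer sketch that turns an input string into the terminal names used by the LL(1) rules. It is not the project's lexer, and it rests on several assumptions: that WORD accepts the union of the namechar and char character sets, that `(#PCDATA)` is read as one PCDATA token, and that EMPTY, ANY, CDATA, NMTOKEN and IDREF are reserved and therefore never returned as WORD. The names `TOKEN_SPEC`, `KEYWORDS` and `tokenize` are hypothetical.

```python
import re

# Reserved words that share the WORD character set (assumed to be reserved).
KEYWORDS = {"EMPTY", "ANY", "CDATA", "NMTOKEN", "IDREF"}

# Order matters: the first alternative that matches at a position wins,
# so "(#PCDATA)" is tried before the single "(" in PUNCT.
TOKEN_SPEC = [
    ("ELEMENT", r"!ELEMENT"),
    ("ATTLIST", r"!ATTLIST"),
    ("PCDATA", r"\(#PCDATA\)"),
    ("REQUIRED", r"#REQUIRED"),
    ("IMPLIED", r"#IMPLIED"),
    ("FIXED", r"#FIXED"),
    ("WORD", r"[A-Za-z0-9._:%&^-]+"),  # assumed union of namechar and char
    ("PUNCT", r'[<>()|,?*+"]'),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return the list of terminal names for the given input string."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        pos = match.end()
        kind, lexeme = match.lastgroup, match.group()
        if kind == "SKIP":
            continue  # whitespace only separates tokens
        if kind == "PUNCT" or (kind == "WORD" and lexeme in KEYWORDS):
            tokens.append(lexeme)  # punctuation and keywords stand for themselves
        else:
            tokens.append(kind)
    return tokens


print(tokenize("<!ELEMENT integer ((bool,string,float)*)>"))
# ['<', 'ELEMENT', 'WORD', '(', '(', 'WORD', ',', 'WORD', ',', 'WORD', ')', '*', ')', '>']
```

One consequence of reserving the keywords is that an element or attribute literally named `CDATA` or `ANY` cannot be expressed; a lexer that needs to allow this would have to decide from context instead.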

Usage

  1. Single sentence: python parser.py '<!ELEMENT integer ((bool,string,float)*)>'
  2. Multiple sentences: python parser.py '<!ELEMENT integer ((bool,string,float)*)> <!ELEMENT integer ((bool,string,float)*)>'
  3. Help: python parser.py -h