package ebnf
Provides classes for dealing with context-free grammars in EBNF form.
Provides a org.benknoble.ebnf.Grammar model, complete with scala DSL for creating complex grammars in code, and an org.benknoble.ebnf.EbnfParser to generate Grammars from a String.
Overview
The main class is org.benknoble.ebnf.Grammar:
scala> val grammar = Grammar( | new Production( | Nonterminal('A), | Alternation(Nonterminal('A), | Sequence(Repetition(Terminal("abc")), Option(Terminal("def")))))) grammar: org.benknoble.ebnf.Grammar = Grammar(List(Production(Nonterminal('A), Alternation(Nonterminal('A),Sequence(Repetition(Terminal(abc)),Option(Terminal(def))))))) scala> val s = grammar.format s: String = <A> ::= <A>|{'abc'}['def'] ;
If you include org.benknoble.ebnf.ExprImplicits and take advantage of org.benknoble.ebnf.Expr syntax, it looks a lot cleaner:
scala> val grammar = Grammar('A ::= 'A || Terminal("abc").* ~ "def".?) grammar: org.benknoble.ebnf.Grammar = Grammar(List(Production(Nonterminal('A), Alternation(Nonterminal('A),Sequence(Repetition(Terminal(abc)),Option(Terminal(def))))))) scala> val s = grammar.format s: String = <A> ::= <A>|{'abc'}['def'] ;
Finally, you can take advantage of org.benknoble.ebnf.EbnfParser:
scala> val grammar = EbnfParser("<A> ::= <A>|{'abc'}['def'] ;") grammar: Either[String,org.benknoble.ebnf.Grammar] = Right(Grammar(List(Production(Nonterminal('A), Alternation(Nonterminal('A),Sequence(Repetition(Terminal(abc)),Option(Terminal(def)))))))) scala> val msg = grammar.fold(s => s, g => g.format) msg: String = <A> ::= <A>|{'abc'}['def'] ;
- Alphabetic
- By Inheritance
- ebnf
- AnyRef
- Any
- Hide All
- Show All
- Public
- All
Type Members
-
case class
Alternation(left: Expr, right: Expr) extends Expr with Product with Serializable
A choice of expressions in the tree
A choice of expressions in the tree
Alternations are just their elements separated by '|':
scala> import org.benknoble.ebnf._ import org.benknoble.ebnf._ scala> val a = Alternation(Nonterminal('A), Terminal("a")).format a: String = <A>|'a'
- left
the left side of the Alternation
- right
the right side in the Alternation
-
abstract
class
Expr extends AnyRef
Base class for expressions
Base class for expressions
An expression is the right hand side of an EBNF rule, made of sequences, branches, repetitions, and options of terminal and nonterminal symbols. Each of these is represented by a concrete case class.
Provides methods common to all expressions.
-
class
Grammar extends AnyRef
A (context-free) grammar of productions
A (context-free) grammar of productions
The preferred construction method is via the companion object's apply method, which is variadic.
A Grammar is represented by ;-delimited, newline-separated Productions:
scala> val g = new Grammar(Seq(new Production(Nonterminal('A), Alternation(Terminal("abc"), Option(Nonterminal('B)))))).format g: String = <A> ::= 'abc'|[<B>] ;
- See also
-
case class
Nonterminal(name: Symbol) extends Word with Product with Serializable
A nonterminal symbol of the grammar
A nonterminal symbol of the grammar
Nonterminals look like
<Name>
:scala> import org.benknoble.ebnf._ import org.benknoble.ebnf._ scala> val n = Nonterminal('A).format n: String = <A>
- name
the name of the Nonterminal
-
case class
Option(expr: Expr) extends Expr with Product with Serializable
An optional expression.
An optional expression.
Equivalent to
Alternation(expr, ε)
. Surrounded by square brackets:scala> var o = Option(Sequence(Nonterminal('A), Terminal("abc"))).format o: String = [<A>'abc'] scala> o = Sequence(Nonterminal('A), Terminal("abc")).?.format o: String = [<A>'abc']
- expr
the optional expression
-
class
Production extends AnyRef
A production from a nonterminal to a rule
A production from a nonterminal to a rule
A production is represented as
<Nonterminal> ::= expr
:scala> val e = new Production(Nonterminal('A),Terminal("abc")).format e: String = <A> ::= 'abc'
Contains a naïve comparison for equality based on the nonterminal and expression (structural equality:
'A ::= "a"||"b"
does NOT equal'A ::= "b"||"a"
)- See also
org.benknoble.ebnf.Nonterminal's ::= syntax
-
case class
Repetition(expr: Expr) extends Expr with Product with Serializable
The Kleene-star closure of an expression
The Kleene-star closure of an expression
Repetitions are an expression surrounded by curly braces:
scala> val r = Repetition(Sequence(Nonterminal('A), Terminal("abc"))).format r: String = {<A>'abc'} scala> val r2 = Sequence(Nonterminal('A), Terminal("abc")).*.format r2: String = {<A>'abc'}
- expr
the expression to repeat
-
case class
Sequence(left: Expr, right: Expr) extends Expr with Product with Serializable
A sequence of expressions in the tree
A sequence of expressions in the tree
Sequences are just their elements concatenated. They take special care to group org.benknoble.ebnf.Alternations in parens:
scala> import org.benknoble.ebnf._ import org.benknoble.ebnf._ scala> val s = Sequence(Nonterminal('A), Terminal("a")).format s: String = <A>'a' scala> val sa = Sequence(Nonterminal('A), Alternation(Terminal("a"), Terminal("b"))).format sa: String = <A>('a'|'b')
- left
the left side of the sequence
- right
the right side in the sequence
-
case class
Terminal(s: String) extends Word with Product with Serializable
A terminal symbol of the grammar
A terminal symbol of the grammar
Terminals look like single-quoted versions of themselves:
scala> import org.benknoble.ebnf._ import org.benknoble.ebnf._ scala> val t = Terminal("abc").format t: String = 'abc'
- s
the String representing the terminal symbol
- abstract class Word extends Expr
Value Members
-
object
EbnfParser extends RegexParsers
Parser for EBNF specifications
Parser for EBNF specifications
Intended usage is via the
apply
method to Strings, or via standard parser combinator mechanisms.apply
returns either the error message or the org.benknoble.ebnf.Grammar in an Either.From Matt Might's specification
BNF
When describing languages, Backus-Naur form (BNF) is a formal notation for encoding grammars intended for human consumption.
Many programming languages, protocols or formats have a BNF description in their specification.
Every rule in Backus-Naur form has the following structure:
name ::= expansion
The symbol
::=
means "may expand into" and "may be replaced with."In some texts, a name is also called a non-terminal symbol.
Every name in Backus-Naur form is surrounded by angle brackets,
<
>
, whether it appears on the left- or right-hand side of the rule.An expansion is an expression containing terminal symbols and non-terminal symbols, joined together by sequencing and choice.
A terminal symbol is a literal like ("+" or "function") or a class of literals (like integer). It must be quoted using single quotes. Any metacharacters may appear within the quotes.
Simply juxtaposing expressions indicates sequencing.
A vertical bar
|
indicates choice.For example, in BNF, the classic expression grammar is:
<expr> ::= <term> "+" <expr> | <term> <term> ::= <factor> "*" <term> | <factor> <factor> ::= "(" <expr> ")" | <const> <const> ::= integer
EBNF
Extended Backus-Naur form (EBNF) is a collection of extensions to Backus-Naur form.
More important than the minor syntactic differences between the forms of EBNF are the additional operations it allows in expansions.
Option
In EBNF, square brackets around an expansion,
[ expansion ]
, indicates that this expansion is optional.For example, the rule:
<term> ::= [ "-" ] <factor>
allows factors to be negated.
Repetition
In EBNF, curly braces indicate that the expression may be repeated zero or more times.
For example, the rule:
<args> ::= <arg> { "," <arg> }
defines a conventional comma-separated argument list.
Grouping
To indicate precedence, EBNF grammars may use parentheses,
()
, to explictly define the order of expansion.For example, the rule:
<expr> ::= <term> ("+" | "-") <expr>
defines an expression form that allows both addition and subtraction.
Extensions
This version of EBNF admits a "comment-like" extension syntax: a
#
escapes the rest of the line, such that it is not parsed at all. Indeed, it is not kept in the grammar object at all, so the result of formatting loses the comments (considered a feature: it cleans the output for consumption by another tool).The following grammar from the tests is thus valid:
# this is a comment <A> ::= 'a' # a values | 'b' # b values | 'c' # c values ;
- Annotations
- @JSExportTopLevel( "EbnfParser" )
-
object
Expr
Provides helper methods for expressions
Provides helper methods for expressions
The most notable is
reduceTree
, which helps convert a Seq of org.benknoble.ebnf.Exprs to an org.benknoble.ebnf.Expr subtype representing an expression tree.Simple names are provided for the most common tree reductions.
Used primarly by org.benknoble.ebnf.EbnfParser to map parser results into trees.
-
object
ExprImplicits
Implicit conversions for Expr
Implicit conversions for Expr
Creates a DSL-like system, with Strings automatically converted to org.benknoble.ebnf.Terminals and Symbols converted to org.benknoble.ebnf.Nonterminal:
scala> import org.benknoble.ebnf._ import org.benknoble.ebnf._ scala> import ExprImplicits._ import ExprImplicits._ scala> val p = 'A ::= ("abc" || 'B.?) ~ 'C.* p: org.benknoble.ebnf.Production = Production(Nonterminal('A), Sequence(Alternation(Terminal(abc),Option(Nonterminal('B))),Repetition(Nonterminal('C)))) scala> val f = p.format f: String = <A> ::= ('abc'|[<B>]){<C>}
Do note that
val r: Expr = "abc".*
will not work, due to ambiguity:scala> val r: Expr = "abc".* <console>:17: error: type mismatch; found : String("abc") required: ?{def *: ?} Note that implicit conversions are not applicable because they are ambiguous: both method augmentString in object Predef of type (x: String)scala.collection.immutable.StringOps and method charToTerminal in object ExprImplicits of type (s: String)org.benknoble.ebnf.Terminal are possible conversion functions from String("abc") to ?{def *: ?} val r: Expr = "abc".* ^ <console>:17: error: value * is not a member of String val r: Expr = "abc".* ^
-
object
Grammar
Factory for org.benknoble.ebnf.Grammar instances.
- object Main
-
object
ε extends Word with Product with Serializable
The empty expression (pronounced "epsilon")