how not to terminate parser prematurely


humbert.tony
Hi,

I'm new to writing parsers, but the parboiled examples seemed accessible enough that I decided to give it a try.

I want to parse input like this:  val input = List("Jan12", "Cal12", "Jan12-Apr12", "Cal11-Cal12")  

Here is what I have ...

import org.parboiled.scala._

class TermParser extends Parser {
  abstract class AstNode
  abstract class SimpleNode extends AstNode
  case class MonthNode(mon: MonCode, yy: YY) extends SimpleNode
  case class CalNode(yy: YY) extends SimpleNode
  abstract class RangeNode extends AstNode
  case class MonthRangeNode(from: MonthNode, to: MonthNode) extends RangeNode
  case class CalRangeNode(from: CalNode, to: CalNode) extends RangeNode
  case class MonCode(mon: String) extends SimpleNode
  case class YY(value: String) extends SimpleNode  
 
  def Input = rule { Expression ~ EOI }
  def Expression: Rule1[AstNode] = rule { SimpleTerm | RangeTerm } //~ optional("-" ~ SimpleTerm) }  ???
  def SimpleTerm: Rule1[SimpleNode] = rule { MonthTerm | CalTerm }
  def RangeTerm: Rule1[RangeNode] = rule { MonthRange | CalRange }
  def CalTerm: Rule1[CalNode] = rule { "Cal" ~ Year2 ~~> CalNode}
  def MonthTerm: Rule1[MonthNode] = rule { MthCode ~ Year2 ~~> ((mon, yy) => MonthNode(mon, yy)) }  
  def MonthRange = rule {MonthTerm ~ "-" ~ MonthTerm ~~> MonthRangeNode}
  def CalRange = rule {CalTerm ~ "-" ~ CalTerm ~~> CalRangeNode}
  def MthCode = rule {("Jan" | "Feb" | "Mar" | "Apr" | "May" | "Jun" |
                       "Jul" | "Aug" | "Sep" | "Oct" | "Nov" | "Dec") ~> MonCode }
  def Year2 = rule { nTimes(2, Digit) ~> YY}
  def Digit = rule { "0" - "9" }
}

object DemoSimpleTerm extends App {

  val parser = new TermParser {
    override val buildParseTree = true
  }

  val input = List("Jan12", "Cal12", "Jan12-Apr12", "Cal11-Cal12")
  val idx = 2

  val parserResult = ReportingParseRunner(parser.Input).run(input(idx))
  println(parserResult.matched)   // true, if successful
  println(parserResult.result)
  println(parserResult.valueStack)

  val parseTree = org.parboiled.support.ParseTreeUtils.printNodeTree(parserResult)
  println(parseTree)

  TracingParseRunner(parser.Input).run(input(idx))  // for debugging

}

When I parse "Jan12-Apr12", the parser fails like this:

Input/Expression/SimpleTerm/MonthTerm/MthCode/FirstOf, matched, cursor at 1:4 after "Jan"
..(4)../MthCode/MthCodeAction1, matched, cursor at 1:4 after "Jan"
..(4)../MthCode, matched, cursor at 1:4 after "Jan"
..(3)../MonthTerm/Year2/2-times/Digit, matched, cursor at 1:5 after "Jan1"
..(5)../2-times/Digit, matched, cursor at 1:6 after "Jan12"
..(5)../2-times, matched, cursor at 1:6 after "Jan12"
..(4)../Year2/Year2Action1, matched, cursor at 1:6 after "Jan12"
..(4)../Year2, matched, cursor at 1:6 after "Jan12"
..(3)../MonthTerm/MonthTermAction1, matched, cursor at 1:6 after "Jan12"
..(3)../MonthTerm, matched, cursor at 1:6 after "Jan12"
..(2)../SimpleTerm, matched, cursor at 1:6 after "Jan12"
..(1)../Expression, matched, cursor at 1:6 after "Jan12"
Input/EOI, failed, cursor at 1:6 after "Jan12"
Input, failed, cursor at 1:6 after "Jan12"

I understand why it fails: it finds a SimpleTerm instead of a RangeTerm.  How can I make it look ahead and see that a "-" is coming?  Something like "Jul12-Cal13" should be invalid.

I appreciate any help.  Thank you very much,
Tony

Also, adding more simple parsers like this one to the examples would really help people to figure things out.  



Re: how not to terminate parser prematurely

mathias
Administrator
Tony,

> I understand why it fails, it finds a SimpleTerm instead of a RangeTerm.
> How to make it look and see that a "-" is coming?  Something like
> "Jul12-Cal13" should be invalid.

Your grammar suffers from a problem people frequently run into with PEGs.
The thing to remember is that the "choice" operator (i.e. `|` in parboiled) is "dumb": it simply tries its alternatives *in order* and commits to the first one that matches,
i.e. the order matters!

In your case it tries `SimpleTerm` first, so once that matches the parser never tries the second alternative, `RangeTerm`, even though the enclosing rule then fails at `EOI`.
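
To make the "commits to the first match" behaviour concrete, here is a toy model of ordered choice in plain Scala. It does not use parboiled at all; `Matcher`, `lit`, `seq`, `firstOf` and `eoi` are made-up helpers for illustration only, and the two terms are hard-wired to the strings from the question.

```scala
object OrderedChoiceDemo {
  // A toy matcher: takes the remaining input and returns what is left
  // after a successful match, or None on failure.
  type Matcher = String => Option[String]

  // Match a literal string at the start of the input.
  def lit(s: String): Matcher =
    in => if (in.startsWith(s)) Some(in.drop(s.length)) else None

  // Match all parts one after another; fail as soon as one part fails.
  def seq(parts: Matcher*): Matcher =
    in => parts.foldLeft(Option(in))((rest, m) => rest.flatMap(m))

  // PEG ordered choice: the first alternative that matches wins.
  // The second alternative is only tried if the first one fails.
  def firstOf(a: Matcher, b: Matcher): Matcher =
    in => a(in).orElse(b(in))

  // Succeeds only when no input is left.
  val eoi: Matcher = in => if (in.isEmpty) Some(in) else None

  val simpleTerm: Matcher = lit("Jan12")
  val rangeTerm: Matcher  = seq(simpleTerm, lit("-"), lit("Apr12"))

  val simpleFirst: Matcher = firstOf(simpleTerm, rangeTerm) // order from the question
  val rangeFirst: Matcher  = firstOf(rangeTerm, simpleTerm) // longer alternative first

  // Equivalent of `Expression ~ EOI`.
  def matches(expr: Matcher, input: String): Boolean =
    seq(expr, eoi)(input).isDefined

  def main(args: Array[String]): Unit = {
    println(matches(simpleFirst, "Jan12-Apr12")) // false: "-Apr12" is left over
    println(matches(rangeFirst, "Jan12-Apr12"))  // true
    println(matches(rangeFirst, "Jan12"))        // true: falls back to simpleTerm
  }
}
```

Note that when `eoi` fails after `simpleFirst`, nothing goes back into the choice to try `rangeTerm`; that is exactly what the trace in the question shows.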

You need to avoid having an earlier alternative match a prefix of a later one. If it does, the later one will *never* match. Here that means trying `RangeTerm` before `SimpleTerm`.
Theoretically parboiled could detect such a grammar mistake; we'll see whether we get that done in parboiled2.
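
Applied to the grammar from the question, the change is a one-word swap in `Expression`. The following is a sketch only, assuming the parboiled 1.x Scala API exactly as used in the original post; the AST classes are moved to the top level for brevity, and `DemoTermParser` is a made-up name.

```scala
import org.parboiled.scala._

// AST from the question, moved out of the parser class for brevity.
sealed trait AstNode
sealed trait SimpleNode extends AstNode
sealed trait RangeNode extends AstNode
case class MonCode(mon: String)
case class YY(value: String)
case class MonthNode(mon: MonCode, yy: YY) extends SimpleNode
case class CalNode(yy: YY) extends SimpleNode
case class MonthRangeNode(from: MonthNode, to: MonthNode) extends RangeNode
case class CalRangeNode(from: CalNode, to: CalNode) extends RangeNode

class TermParser extends Parser {
  def Input: Rule1[AstNode] = rule { Expression ~ EOI }

  // The only functional change: RangeTerm is tried before SimpleTerm,
  // because every RangeTerm starts with something SimpleTerm would match.
  def Expression: Rule1[AstNode] = rule { RangeTerm | SimpleTerm }

  def SimpleTerm: Rule1[SimpleNode] = rule { MonthTerm | CalTerm }
  def RangeTerm: Rule1[RangeNode] = rule { MonthRange | CalRange }
  def CalTerm: Rule1[CalNode] = rule { "Cal" ~ Year2 ~~> CalNode }
  def MonthTerm: Rule1[MonthNode] = rule { MthCode ~ Year2 ~~> ((mon, yy) => MonthNode(mon, yy)) }
  def MonthRange = rule { MonthTerm ~ "-" ~ MonthTerm ~~> MonthRangeNode }
  def CalRange = rule { CalTerm ~ "-" ~ CalTerm ~~> CalRangeNode }
  def MthCode = rule { ("Jan" | "Feb" | "Mar" | "Apr" | "May" | "Jun" |
                        "Jul" | "Aug" | "Sep" | "Oct" | "Nov" | "Dec") ~> MonCode }
  def Year2 = rule { nTimes(2, Digit) ~> YY }
  def Digit = rule { "0" - "9" }
}

object DemoTermParser extends App {
  val parser = new TermParser
  // The last input mixes a month with a calendar year and should be rejected.
  for (input <- List("Jan12", "Cal12", "Jan12-Apr12", "Cal11-Cal12", "Jul12-Cal13")) {
    val result = ReportingParseRunner(parser.Input).run(input)
    println(input + " -> matched: " + result.matched + ", result: " + result.result)
  }
}
```

With this order, "Jul12-Cal13" should still be rejected: `MonthRange` fails at "Cal13", `CalRange` fails at "Jul", the fallback `SimpleTerm` matches only "Jul12", and `EOI` then fails on the remaining "-Cal13".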

Cheers,
Mathias

---
http://www.parboiled.org
