Tree Rewrites with PEGs, Parboiled structure streams

Eric L
Is there much in the way of supporting tree rewrites with PEGs like recent ANTLR 3.x tree rewrite support?

I've had a few emails with Roberto of LPEG fame, but he is not interested in anything other than character streams, it seems.

My goal is to have structure recognition, not just character structure recognition.

Would this be complete brain surgery for Parboiled? Not just input stream specialization?
We could rewrite AST with a string encoding, but this is inefficient and error prone.

Thanks!

Re: Tree Rewrites with PEGs, Parboiled structure streams

mathias
Administrator
Unfortunately there is no such thing for parboiled and it would indeed be quite some "brain surgery" to implement.
There is no abstraction over the atomic elements of an InputBuffer; they are characters, period.
Abstracting over that would greatly increase overhead and make parboiled significantly slower.

So parboiled cannot really help you with recognizing structural patterns in your AST... sorry.

Cheers,
Mathias

---
[hidden email]
http://www.parboiled.org

On 22.07.2011, at 17:38, Eric L [via parboiled users] wrote:

> Is there much in the way of supporting tree rewrites with PEGs like recent ANTLR 3.x tree rewrite support?
>
> I've had a few emails with Roberto of LPEG fame, but he is not interested in anything other than character streams, it seems.
>
> My goal is to have structure recognition, not just character structure recognition.
>
> Would this be complete brain surgery for Parboiled? Not just input stream specialization?
> We could rewrite AST with a string encoding, but this is inefficient and error prone.
>
> Thanks!


Re: Tree Rewrites with PEGs, Parboiled structure streams

Eric L

Pity. How is the UTF support?  Could be as simple as reserving a few control characters and relying on labels (all strings are integers anyways).  

def UP = rule { ch(0x2600.toChar) } // or whatever
def DOWN = rule { ch(0x2601.toChar) } // or whatever
// a recursive rule needs an explicit result type
def recognizeKids: Rule0 = rule { DOWN ~ "label" ~ zeroOrMore(recognizeKids) ~ UP }

Productions would need to obey this encoding as well. In-place tree rewrites would be an issue.
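
To make the reserved-character idea concrete, here is a minimal sketch in plain Scala that does not use parboiled at all. It flattens a labelled tree into a string delimited by the DOWN/UP markers and then recognizes that string with a hand-written recursive-descent routine shaped like the recognizeKids rule. The names TreeEncoding, Node, encode, recognize and matches are assumptions made up for this illustration, and the marker values are the same arbitrary placeholders as above.

```scala
object TreeEncoding {
  // Hypothetical marker characters, mirroring the "or whatever" values above.
  val Down: Char = 0x2601.toChar
  val Up: Char   = 0x2600.toChar

  // A labelled tree; labels are assumed not to contain the marker characters.
  final case class Node(label: String, kids: List[Node] = Nil)

  // Flatten a tree into: DOWN label kid* UP
  def encode(n: Node): String =
    Down.toString + n.label + n.kids.map(encode).mkString + Up

  // Recognizer for: node <- DOWN label node* UP
  // Returns the index just past the recognized node, or -1 on failure.
  def recognize(s: String, i: Int): Int =
    if (i >= s.length || s(i) != Down) -1
    else {
      var j = i + 1
      // Consume the label: everything up to the next marker.
      while (j < s.length && s(j) != Down && s(j) != Up) j += 1
      if (j == i + 1) -1 // an empty label is rejected
      else {
        // zeroOrMore(node): keep consuming children while they match.
        var next = recognize(s, j)
        while (next != -1) { j = next; next = recognize(s, j) }
        if (j < s.length && s(j) == Up) j + 1 else -1
      }
    }

  // The whole input must be exactly one encoded node.
  def matches(s: String): Boolean = recognize(s, 0) == s.length
}
```

For example, TreeEncoding.matches(TreeEncoding.encode(Node("a", List(Node("b"))))) holds, while an input with a missing UP marker is rejected. Rewriting the tree in place would still mean re-encoding the string, which is the issue noted above.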

Perhaps a new version of Parboiled where Character Type dependencies are replaced with a parameterized type, and better yet pull from and base on existing Scala tools (e.g. scala.tools.nsc.typechecker)
Also, there's no reason not to leverage type checking as part of the parser (excepting the complexity of leveraging it as such).

Re: Tree Rewrites with PEGs, Parboiled structure streams

mathias
Administrator
Unicode is supported to the same extent as single Java chars support it: UTF-16 without supplementary characters, i.e. without characters whose Unicode code point requires more than 16 bits (parboiled InputBuffers are just wrappers around Java char arrays).

In that regard you could certainly attach some special meaning to special Unicode characters or even non-characters (parboiled already does this itself for internally representing certain things, see https://github.com/sirthias/parboiled/blob/master/parboiled-core/src/main/java/org/parboiled/support/Chars.java).
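
As a small illustration of what "single Java chars" means here, the following sketch uses only the standard java.lang.Character and String APIs, not parboiled. It shows that a character from the Basic Multilingual Plane occupies one char, while a supplementary character occupies two (a surrogate pair), so a buffer indexed by char sees it as two units. The object and method names (CharUnits, charUnits, codePoints, isBmpNonCharacter) are assumptions chosen for this example; the non-character ranges are those defined by Unicode, not a statement about which values parboiled reserves.

```scala
object CharUnits {
  // U+2600 lies in the Basic Multilingual Plane: one code point, one char.
  val bmp: String = "\u2600"

  // U+1F600 is a supplementary character: one code point, two chars.
  val supplementary: String = new String(Character.toChars(0x1F600))

  // Number of 16-bit units, which is what a char-array-backed buffer indexes.
  def charUnits(s: String): Int = s.length

  // Number of Unicode code points in the string.
  def codePoints(s: String): Int = s.codePointCount(0, s.length)

  // Unicode non-characters in the BMP: U+FDD0..U+FDEF plus U+FFFE and U+FFFF.
  def isBmpNonCharacter(c: Char): Boolean =
    (c >= 0xFDD0 && c <= 0xFDEF) || c == 0xFFFE || c == 0xFFFF
}
```

Markers for tree structure would therefore have to be picked from values that fit in one char and that cannot occur in the labels themselves.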

> Perhaps a new version of Parboiled where Character Type dependencies are replaced with a parameterized type, and better yet pull from and base on existing Scala tools (e.g. scala.tools.nsc.typechecker)

Making the InputBuffer "atomic units" a parameterized type would make it impossible to use primitives like "char" (parboiled-core is implemented in Java and therefore cannot use @specialized). Without primitives, parsing speed would be at least an order of magnitude lower...
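
The trade-off can be sketched in plain Scala as follows. This is only an illustration of where boxing enters, under the assumption of two made-up buffer classes (CharBuffer and GenericBuffer are hypothetical names, not parboiled types); it does not measure anything and says nothing about the actual size of the slowdown.

```scala
object BufferSketch {
  // Char-specific buffer: at() returns a primitive char, no allocation per access.
  final class CharBuffer(chars: Array[Char]) {
    def length: Int = chars.length
    def at(i: Int): Char = chars(i)
  }

  // Generic buffer: T is erased, so primitive elements are handled as boxed
  // objects unless the class is specialized (Scala's @specialized annotation).
  final class GenericBuffer[T](items: IndexedSeq[T]) {
    def length: Int = items.length
    def at(i: Int): T = items(i)
  }

  // Tight loop over primitives.
  def countChar(buf: CharBuffer, c: Char): Int = {
    var n = 0
    var i = 0
    while (i < buf.length) { if (buf.at(i) == c) n += 1; i += 1 }
    n
  }

  // Same loop over the generic buffer; each comparison goes through Any.
  def countGeneric[T](buf: GenericBuffer[T], x: T): Int = {
    var n = 0
    var i = 0
    while (i < buf.length) { if (buf.at(i) == x) n += 1; i += 1 }
    n
  }
}
```

Both loops compute the same result; the difference is only in how each element is represented on the way through at().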

> Also, there's no reason not to leverage type checking as part of the parser (excepting the complexity of leveraging it as such).

I'm not quite sure I get what you mean by this...

Cheers,
Mathias

---
[hidden email]
http://www.parboiled.org

On 23.07.2011, at 16:40, Eric L [via parboiled users] wrote:

>
> Pity. How is the UTF support?  Could be as simple as reserving a few control characters and relying on labels (all strings are integers anyways).  
>
> def UP = rule { 0x2600.toChar } // or whatever
> def DOWN = rule { 0x2601.toChar } // or whatever
> def recognizeKids = rule { DOWN ~ "label" ~ zeroOrMore (recognizeKids) ~ UP }
>
> Productions would need to obey this encoding as well. In-place tree rewrites would be an issue.
>
> Perhaps a new version of Parboiled where Character Type dependencies are replaced with a parameterized type, and better yet pull from and base on existing Scala tools (e.g. scala.tools.nsc.typechecker)
> Also, there's no reason not to leverage type checking as part of the parser (excepting the complexity of leveraging it as such).


Re: Tree Rewrites with PEGs, Parboiled structure streams

Eric L
>Making the InputBuffer "atomic units" a parameterized type would make it impossible to use primitives (l
...
>type checker...

Sorry. This was predicated on the assumption of a new, pure-Scala parboiled-esque library that could thus leverage the scala.tools.nsc built-ins and apply the equivalence of parsing to type checking.

It's now OT, though, and I'll look at using reserved characters in parboiled or XText.

Thank you.