Actions problem

fge

Hello,

Still trying to parse CSS, and I am now on the Unicode range.

I want to parse an expression such as u+xxx-yyy, where xxx and yyy are each a run of 1 to 6 hexadecimal digits. So far I have this:

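    // Type parameters restored; the archive stripped the angle brackets,
    // so the name V is an assumption.
    static <V> Action<V> CheckMatchLen(final int min, final int max)
    {
        return new Action<V>()
        {
            @Override
            public boolean run(final Context<V> context)
            {
                if (max < min)
                    throw new IllegalStateException();

                // getMatch() is the text matched by the preceding subrule
                final int matchLen = context.getMatch().length();
                return matchLen >= min && matchLen <= max;
            }
        };
    }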

    Rule Digit()
    {
        return CharRange('0', '9');
    }

    Rule HexDigit()
    {
        return FirstOf(CharRange('a', 'f'), CharRange('A', 'F'), Digit());
    }

    Rule UnicodeCodePoint()
    {
        return Sequence(OneOrMore(HexDigit()), CheckMatchLen(1, 6));
    }

This works fine so far, but I have a problem with ranges. I tried:

    Rule UnicodeRange()
    {
        return Sequence(UnicodeCodePoint(), Optional('-', UnicodeCodePoint()));
    }

    // Type parameters restored; the archive stripped the angle brackets,
    // so the name V is an assumption.
    static <V> Action<V> CheckUnicodeRange()
    {
        return new Action<V>()
        {
            @Override
            public boolean run(final Context<V> context)
            {
                final String match = context.getMatch();
                if (!match.contains("-"))
                    return true;

                final String[] range = match.split("-");
                if (range[0].length() != range[1].length())
                    return false;

                // Plain ints avoid needless boxing
                final int first = Integer.parseInt(range[0], 16);
                final int second = Integer.parseInt(range[1], 16);
                return first <= second;
            }
        };
    }

    Rule UnicodeExpression()
    {
        return Sequence(
            IgnoreCase("u+"),
            FirstOf(
                UnicodeWildcard(),
                Sequence(UnicodeRange(), CheckUnicodeRange())
            )
        );
    }

The problem: if CheckUnicodeRange() returns true, all is well. But if it returns false for any reason, parboiled throws an exception:

Exception in thread "main" java.lang.IllegalStateException
	at org.parboiled.common.Preconditions.checkState(Preconditions.java:119)
	at org.parboiled.parserunners.RecoveringParseRunner.run(RecoveringParseRunner.java:149)
	at org.parboiled.parserunners.AbstractParseRunner.run(AbstractParseRunner.java:81)
	at org.parboiled.parserunners.AbstractParseRunner.run(AbstractParseRunner.java:76)

Any idea why?

Re: Actions problem

mathias
Administrator
The problem is that you are using the RecoveringParseRunner to fix syntactical problems in your input, but you also implement parts of the grammar manually (e.g. via the CheckMatchLen action). Since the RecoveringParseRunner cannot look into your syntactical predicates, your logic interferes with its error recovery.

You have two options:
1. Don't use the RecoveringParseRunner.
2. Get rid of the CheckMatchLen action (recommended).

The latter could be done like this:

        Rule UnicodeCodePoint() {
                // 1 to 6 hex digits: one required, then up to five optional
                return Sequence(
                        HexDigit(),
                        Optional(HexDigit(),
                        Optional(HexDigit(),
                        Optional(HexDigit(),
                        Optional(HexDigit(),
                        Optional(HexDigit())))))
                );
        }
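
To see why a fixed chain of optional digits can stand in for the length-checking action, here is the same bounded match written by hand in plain Java. This is only a sketch and not parboiled API; `HexRun` and `matchCodePoint` are hypothetical names made up for illustration.

```java
// Hypothetical helper, not part of parboiled: a hand-written equivalent of
// "one hex digit followed by up to five optional hex digits".
final class HexRun {
    private static final int MAX_DIGITS = 6;

    static boolean isHexDigit(char c) {
        return (c >= '0' && c <= '9')
            || (c >= 'a' && c <= 'f')
            || (c >= 'A' && c <= 'F');
    }

    /**
     * Returns how many hex digits were consumed starting at pos.
     * The result is 0 if the first (required) digit is missing,
     * otherwise a value between 1 and 6.
     */
    static int matchCodePoint(String input, int pos) {
        int end = pos;
        while (end < input.length()
                && end - pos < MAX_DIGITS
                && isHexDigit(input.charAt(end))) {
            end++;
        }
        return end - pos;
    }
}
```

The length limit is part of the matching itself, so nothing has to veto a match after the fact. A seventh digit is simply left in the input for whatever comes next to reject.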

Also, your CheckUnicodeRange logic doesn't really belong in the parser!
Checking range validity is not a syntactic but a semantic operation and should therefore be performed by a higher-level layer in your application, not in the parser!
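
As a rough sketch of what such a higher-level check could look like, the plain Java below validates a range after the parser has already accepted its syntax. `UnicodeRangeCheck` and `check` are hypothetical names, and the input is assumed to be the text following "u+" (either "xxx" or "xxx-yyy", hex digits only, at most six per side).

```java
import java.util.Optional;

// Hypothetical helper, not part of parboiled. It assumes the parser has
// already guaranteed the syntax, so it only looks at the meaning.
final class UnicodeRangeCheck {

    /**
     * Returns an error message if the range is inverted,
     * or an empty Optional if the range is fine.
     */
    static Optional<String> check(String rangeText) {
        String[] parts = rangeText.split("-");
        int first = Integer.parseInt(parts[0], 16);
        // A single code point is treated as a range of length one.
        int last = parts.length > 1 ? Integer.parseInt(parts[1], 16) : first;
        if (first > last) {
            return Optional.of("inverted range: " + rangeText);
        }
        return Optional.empty();
    }
}
```

Because the check runs on already-parsed text, it can return false or an error message without ever confusing the parse runner.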

Cheers,
Mathias

---
[hidden email]
http://www.parboiled.org

On 28.12.2011, at 19:46, fge [via parboiled users] wrote:

> Hello,
>
> Still trying to parse CSS, and I am now on the Unicode range.
>
> [...]
>
> Any idea why?


Re: Actions problem

fge
On Tue, Jan 24, 2012 at 10:38, mathias [via parboiled users]
<[hidden email]> wrote:

> [...]
>
> Also, your CheckUnicodeRange logic doesn't really belong into the parser!
> Checking range validity is not a syntactical but a semantical operation and
> should therefore be performed by a higher-level layer in your application,
> not in the parser!

OK, I don't understand everything that you said, I need to understand
the fundamentals more, it seems...

Since then I have solved the problem but don't even remember how, heh.

How would you go about implementing the logic check in an "upper
layer"? I have no intention to create objects of any sort since I'm
only concerned with the validity of the input... At least for the
moment.

--
Francis Galiegue, [hidden email]
"It seems obvious [...] that at least some 'business intelligence'
tools invest so much intelligence on the business side that they have
nothing left for generating SQL queries" (Stéphane Faroult, in "The
Art of SQL", ISBN 0-596-00894-5)

Re: Actions problem

mathias
Administrator
Francis,

> How would you go about implementing the logic check in an "upper
> layer"? I have no intention to create objects of any sort since I'm
> only concerned with the validity of the input... At least for the
> moment.

Well, if you want to check the validity of CSS you'll probably want to do this with regard to several aspects:
1. Syntax
2. Logical problems (like inverted range boundaries)
3. Undefined properties
...

Your checker might even report CSS settings that are superfluous, like in this snippet:

        H1 {
                float: left;
                float: right;
        }

Even though this is valid CSS, the first `float` setting will always be overridden by the second, so you might want to issue a warning here.

Clearly, putting everything into the parser layer will make your checker extremely hard to write, maintain and extend.
The better solution would be to keep the parser layer small and purely focused on syntax. It should create an AST, which is then processed by a number of "inspections" that check various aspects of validity. This would achieve a nice separation of concerns and allow you to enable/disable inspections as well as write new ones without any problems...
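
A minimal sketch of that layering in plain Java follows. All of the types (`Declaration`, `RuleSet`, `Inspection`, `DuplicatePropertyInspection`, `Checker`) are hypothetical and only stand in for whatever AST the parser would actually build.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical AST node: one "property: value;" pair.
final class Declaration {
    final String property;
    final String value;

    Declaration(String property, String value) {
        this.property = property;
        this.value = value;
    }
}

// Hypothetical AST node: a selector with its declarations, in source order.
final class RuleSet {
    final String selector;
    final List<Declaration> declarations;

    RuleSet(String selector, List<Declaration> declarations) {
        this.selector = selector;
        this.declarations = declarations;
    }
}

// Each inspection checks one aspect of validity and returns its findings.
interface Inspection {
    List<String> inspect(RuleSet ruleSet);
}

// Warns when a property is set more than once in the same rule set.
final class DuplicatePropertyInspection implements Inspection {
    @Override
    public List<String> inspect(RuleSet ruleSet) {
        List<String> warnings = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (Declaration d : ruleSet.declarations) {
            // Set.add returns false if the property was already present.
            if (!seen.add(d.property)) {
                warnings.add(ruleSet.selector + ": '" + d.property
                        + "' is set more than once");
            }
        }
        return warnings;
    }
}

// Runs whichever inspections are enabled and collects all findings.
final class Checker {
    static List<String> run(RuleSet ruleSet,
                            List<? extends Inspection> inspections) {
        List<String> findings = new ArrayList<>();
        for (Inspection inspection : inspections) {
            findings.addAll(inspection.inspect(ruleSet));
        }
        return findings;
    }
}
```

Enabling or disabling a check then amounts to adding or removing an entry in the list passed to `Checker.run`, and the grammar never has to change.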

Cheers,
Mathias


On 24.01.2012, at 10:45, fge [via parboiled users] wrote:

> OK, I don't understand everything that you said, I need to understand
> the fundamentals more, it seems...
>
> [...]
