Re: Best way to have sequence of optional tokens with separator inbetween?

mathias
Administrator
How about you go for something like this (in Scala for brevity):

    // all the legal variants of all types; types must be iterated in grammar order,
    // so an order-preserving map (e.g. ListMap) is assumed here
    val typeVariants: Map[String, List[String]] = ...

    // convert each type's List[String] of variants into a FirstOf over those variants
    val typeVariantRules: List[Rule] =
      typeVariants.values.map(variants => FirstOf(variants.toArray)).toList

    // walk through all typeVariantRules from left to right and combine
    typeVariantRules.reduceLeft { (left, right) =>
      FirstOf(
        Sequence(left, Optional(Sequence(separator, right))),
        right
      )
    }

I think this should do what you want, no?
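
To check how this fold behaves without pulling in parboiled, here is a minimal plain-Java sketch of the same idea. Everything in it (`OrderedOptionals`, `Rule`, `literal`, `firstOf`, `sequence`, `optional`, `build`, `accepts`) is a hypothetical stand-in written for illustration, not parboiled API. It also assumes that variants should be tried longest first, since an ordered choice would otherwise match "A" and never reach "A2".

```java
import java.util.List;

// Plain-Java stand-in for PEG rules; none of these names are parboiled API.
class OrderedOptionals {
    /** Returns the position after a successful match, or -1 on failure. */
    interface Rule { int match(String in, int pos); }

    static Rule literal(String s) {
        return (in, pos) -> in.startsWith(s, pos) ? pos + s.length() : -1;
    }

    /** Ordered choice: the first alternative that matches wins. */
    static Rule firstOf(Rule... alts) {
        return (in, pos) -> {
            for (Rule r : alts) {
                int end = r.match(in, pos);
                if (end >= 0) return end;
            }
            return -1;
        };
    }

    /** All parts must match one after another. */
    static Rule sequence(Rule... parts) {
        return (in, pos) -> {
            int p = pos;
            for (Rule r : parts) {
                p = r.match(in, p);
                if (p < 0) return -1;
            }
            return p;
        };
    }

    /** Matches r if possible, otherwise succeeds without consuming input. */
    static Rule optional(Rule r) {
        return (in, pos) -> {
            int end = r.match(in, pos);
            return end >= 0 ? end : pos;
        };
    }

    /** Folds the per-type rules left to right; assumes at least one type. */
    static Rule build(String separator, List<List<String>> typesInOrder) {
        Rule sep = literal(separator);
        Rule acc = null;
        for (List<String> variants : typesInOrder) {
            // Longest variant first, so that "A2" is tried before its prefix "A".
            Rule right = firstOf(variants.stream()
                    .sorted((a, b) -> b.length() - a.length())
                    .map(OrderedOptionals::literal)
                    .toArray(Rule[]::new));
            acc = (acc == null)
                    ? right
                    : firstOf(sequence(acc, optional(sequence(sep, right))), right);
        }
        return acc;
    }

    /** True if the rule consumes the whole input. */
    static boolean accepts(Rule rule, String input) {
        return rule.match(input, 0) == input.length();
    }
}
```

With the types [A, A2, A3], [B], [C], [D, D2, D3] and "/" as separator, `accepts` returns true for "A/B/C/D" and "A3/C" and false for "A/A2", because the second token can only be matched by a later type.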

Cheers,
Mathias

---
[hidden email]
http://www.parboiled.org

On 03.07.2012, at 08:45, slow [via parboiled users] wrote:

> I currently have a working solution, but I feel like there's a simpler/easier way of representing this.
>
> I need to be able to parse these valid examples:
> A/B/C/D
> A2/B/C/D
> A, B, C
> A3/C
> B/C/D
>
> Invalid:
> A/A2
> D, C, B, A
>
> So there's a known restricted vocabulary of:
> A,A2,A3,B,C,D,D2,D3,...(100s more)
> where numbers signify a simplified/acronymized variant of the same type.
>
> Note that at most one token of each type may appear, and the types must appear in order, separated by some string.
>
> Here's what I have working:
>
>   Rule SequenceOfSeparatedOptionals(String separator, Index tokens) {
>     Rule maybe = FirstOf(tokens.possible());
>     return Sequence(
>         Sequence(maybe, push(tokens.lookup(match(), 0))),
>         ZeroOrMore(
>             separator,
>             maybe,
>             push(tokens.lookup(match(), (int) pop() + 1)) && (int) peek() >= 0));
>   }
>
> class Index {
>   private final Map<String, Integer> index;
>
>   // e.g. { "A":1, "A2":1, "B":2, "C":3 }
>   Index(Map<String, Integer> index) {
>     this.index = index;
>   }
>
>   String[] possible() {
>     return index.keySet().toArray(new String[index.size()]);
>   }
>
>   int lookup(String value, int after) {
>     Integer maybe = index.get(value);
>     if (maybe != null && maybe > after) {
>       return maybe;
>     }
>     return -1;
>   }
> }
>
> Am I missing a better solution?


slow
That is exactly what I was looking for.  Thanks.

slow
I have another best practice question.

I now need to reuse the previous method, but in some contexts with a minimum number of tokens.
I got this working with:

  Rule SequenceMinNumberOfOptionalsSeparatedBy(
      int needAtLeast, String separator, Object... rules) {
    Rule curr = toRule(Sequence(rules[0], ACTION(increment())));
    for (int i = 1, s = rules.length; i < s; ++i) {
      Object next = rules[i];
      curr = FirstOf(
          Sequence(curr, Optional(separator, next, ACTION(increment()))),
          Sequence(next, ACTION(increment())));
    }
    return Sequence(push(0), curr, popAsInt() >= needAtLeast);
  }

  boolean increment() {
    return push(popAsInt() + 1);
  }

  int popAsInt() {
    return (int) pop();
  }

However, this seems like an abuse of the parsing layer to me.  Is it better to do this type of validation check during AST creation/traversal?

mathias
Administrator
Yes, usually the parsing layer should be used purely for syntactic analysis.
Anything that is more semantic in nature should usually be handled by a subsequent stage of your processing pipeline that operates on the AST.
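
As a rough illustration of that split, the minimum-count requirement could be checked after parsing rather than inside the grammar. The sketch below assumes the parser has already produced the matched tokens as a `List<String>`; the class `MinTokenCheck` and its `check` method are hypothetical names, not part of parboiled.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical post-parse validation step; the grammar itself stays purely syntactic.
class MinTokenCheck {
    /**
     * Returns an error message if fewer than needAtLeast tokens were parsed,
     * or an empty Optional if the requirement is met.
     */
    static Optional<String> check(List<String> tokens, int needAtLeast) {
        if (tokens.size() < needAtLeast) {
            return Optional.of("expected at least " + needAtLeast
                    + " tokens but found " + tokens.size() + ": " + tokens);
        }
        return Optional.empty();
    }
}
```

The same grammar can then be reused in every context, and each caller applies its own minimum to the parse result.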

Cheers,
Mathias

---
[hidden email]
http://www.parboiled.org

On 19.07.2012, at 05:35, slow [via parboiled users] wrote:

> However, this seems like an abuse of the parsing layer to me.  Is it better to do this type of validation check during AST creation/traversal?
