
Terminate a running parser


Terminate a running parser

Hernâni
Hello all. We're using parboiled through the pegdown Markdown processor. It works nicely and we're very pleased. However, the source Markdown is user input that we don't control, and quite often we have parsers running for ages. So we now run the parsers through a ThreadPoolExecutor with a timeout. The problem is that when we receive the TimeoutException, we have no way to interrupt the running task, at least none that we're aware of. For obvious reasons we can't use Thread.stop() or Thread.destroy().

We don't mind editing the code to suit our needs, so I was thinking of editing some frequently executed method to check the interrupted() flag on the current thread and throw an exception that causes the parser to terminate.

So my question is, does anybody have a suggestion of some method where we can put that? Or better yet, is there any built-in way to do this?

Thanks
-Hernâni

Re: Terminate a running parser

mathias
Administrator
Hernâni,

I'm glad that you like pegdown + parboiled.
However, the pegdown parser not terminating certainly sounds like a problem that needs to be addressed!
pegdown has had trouble with certain pathological input a few times in the past, so it may well be that one or more of these problems are still present. Are you able to get hold of the Markdown source that causes the unusually long parser running times?

Concerning your question re a good place for putting an interrupted flag check:
I'd probably try to stay at the pegdown level with this, at least initially. You could insert your check into a very basic parser rule (e.g. `Letter`) that is likely to take part in any pathological "loop".

Cheers,
Mathias

---
http://www.parboiled.org

(Sent in reply to Hernâni's message of 10.02.2012, 21:11.)


Re: Terminate a running parser

Hernâni
Hello Mathias, thanks for the answer.

I don't have a sample with me, and we usually tend to fix it, but I can tell you that most of the time it happens with embedded source code that the users don't indent. Usually a lot of it :) I've seen someone trying to post 300k worth of unindented XML code. But I see it going wild a lot with C# code as well. Also, most of the time I believe it eventually finishes; our problem is that eventually the server load is just crazy and we have to restart everything. But the next time I see a post just looping and not returning, I'll send it to you (it needs to be in a private message though).

In any case, I'm happy with terminating the rendering process after a certain time. I'll try your solution and post the result here.

Thanks for your help, cheers,
-Hernâni

Re: Terminate a running parser

mathias
Administrator
Hernâni,

yeah, I can see really large input being a problem all by itself.
I think ideally pegdown should be able to watch over itself and terminate after a certain maximum parsing period.
I've added an issue for this: https://github.com/sirthias/pegdown/issues/42
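
A minimal sketch of what such a self-watching check could look like, assuming a deadline is fixed when parsing starts and then polled from a frequently executed rule. `ParseDeadline` and `ParsingTimeoutException` are hypothetical names chosen for illustration; they are not pegdown or parboiled API, and the loop in `main` is only a stand-in for a long-running parse:

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration; not part of pegdown or parboiled. */
final class ParseDeadline {

    /** Unchecked, so it can escape from rule methods without changing their signatures. */
    static final class ParsingTimeoutException extends RuntimeException {
        ParsingTimeoutException(String message) {
            super(message);
        }
    }

    private final long deadlineNanos;

    /** Fixes the deadline relative to "now", i.e. when parsing is about to start. */
    ParseDeadline(long timeout, TimeUnit unit) {
        this.deadlineNanos = System.nanoTime() + unit.toNanos(timeout);
    }

    /** Call from a frequently executed rule; throws once the time budget is used up. */
    void check() {
        // Comparing via subtraction stays correct even if nanoTime() wraps around.
        if (System.nanoTime() - deadlineNanos > 0) {
            throw new ParsingTimeoutException("parsing exceeded its time budget");
        }
    }

    public static void main(String[] args) {
        ParseDeadline deadline = new ParseDeadline(50, TimeUnit.MILLISECONDS);
        long steps = 0;
        try {
            while (true) { // stand-in for a long-running parse
                deadline.check();
                steps++;
            }
        } catch (ParsingTimeoutException e) {
            System.out.println("aborted after " + steps + " steps: " + e.getMessage());
        }
    }
}
```

Compared with the executor approach, a check like this needs no extra thread, at the cost of one clock read per invocation of the instrumented rule.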

> Also, most of the time I believe it eventually finishes; our problem is that eventually the server load is just crazy and we have to restart everything. But the next time I see a post just looping and not returning, I'll send it to you (it needs to be in a private message though).

Alright.
pegdown will never _really_ loop, it will always make progress.
This distinction can be academic in practice though, since it might exhibit an exponential runtime behavior in case of grammar bugs...
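
To illustrate how a parser can always make progress and still take exponential time, here is a toy backtracking matcher. It is not pegdown or parboiled code; the grammar `P <- 'a' P 'b' / 'a' P 'c' / (empty)` and the names `BacktrackingDemo`, `parseP` and `countCalls` are made up for this sketch. On an input consisting only of `a` characters, both of the first two alternatives parse the whole rest of the input before failing, so each extra character roughly doubles the work:

```java
import java.util.Arrays;

/** Toy backtracking matcher for illustration; not parboiled or pegdown code. */
final class BacktrackingDemo {
    static long calls;

    /**
     * P <- 'a' P 'b' / 'a' P 'c' / (empty)
     * Returns the position after the match; the empty alternative means P never fails.
     */
    static int parseP(String in, int pos) {
        calls++;
        if (pos < in.length() && in.charAt(pos) == 'a') {
            // First alternative: 'a' P 'b'
            int afterInner = parseP(in, pos + 1);
            if (afterInner < in.length() && in.charAt(afterInner) == 'b') {
                return afterInner + 1;
            }
            // Second alternative: 'a' P 'c' (re-parses the same input from scratch)
            afterInner = parseP(in, pos + 1);
            if (afterInner < in.length() && in.charAt(afterInner) == 'c') {
                return afterInner + 1;
            }
        }
        // Third alternative: match nothing
        return pos;
    }

    /** Number of parseP invocations needed for an input of n 'a' characters. */
    static long countCalls(int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, 'a');
        calls = 0;
        parseP(new String(chars), 0);
        return calls;
    }

    public static void main(String[] args) {
        for (int n = 5; n <= 20; n += 5) {
            System.out.println(n + " chars -> " + countCalls(n) + " rule invocations");
        }
    }
}
```

Every call terminates and the recursion depth never exceeds the input length, yet the number of rule invocations is 2^(n+1) - 1 for n input characters.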

If you can distill a short example that runs really slowly, there is usually a way to modify the grammar and remove the problem.

Cheers,
Mathias

(Sent in reply to Hernâni's message of 11.02.2012, 00:18.)


Re: Terminate a running parser

Hernâni
Thanks Mathias, your solution worked perfectly. Having a built-in way to time out the parser would be awesome, since my solution requires some extra threads. It's not really a problem, since we already use the executor for other stuff. Here's a rough mockup of what I did; it may be useful for others.

In org.pegdown.parser I just added the following bit at the top of the Letter rule:

        if (Thread.interrupted()) {
            throw new RuntimeException("stop");
        }

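Then in my code I have some helper methods. Here is a simplified static version:

    // Needs: import java.util.concurrent.*;

    // Shared pool; idle threads are reused across calls.
    private static final ExecutorService THREAD_POOL
            = Executors.newCachedThreadPool();

    // Runs c on the pool and waits at most `timeout` for its result.
    private static <T> T timedCall(Callable<T> c, long timeout, TimeUnit timeUnit)
            throws InterruptedException, ExecutionException, RejectedExecutionException, TimeoutException
    {
        FutureTask<T> task = new FutureTask<T>(c);
        THREAD_POOL.execute(task);

        try {
            return task.get(timeout, timeUnit);
        } catch (TimeoutException e) {
            // cancel(true) interrupts the worker thread, which trips the
            // Thread.interrupted() check added to the Letter rule.
            task.cancel(true);
            throw e;
        }
    }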

Which can be called like this (exception handling code omitted):

        String html = timedCall(new Callable<String>() {
            public String call() throws Exception {
                //code to render the markdown
            }
        }, 10, TimeUnit.SECONDS);
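
For anyone who wants to try the pieces above together without pegdown, here is a self-contained sketch (Java 8 or later). The loop in `slowRender` is only a stand-in for a parser whose rules poll the interrupt flag; `TimedCallDemo`, `slowRender` and `demo()` are made-up names, and the daemon thread factory is an addition for this demo so the JVM can exit without an explicit pool shutdown:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.FutureTask;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimedCallDemo {
    // Daemon threads (an assumption for this demo) let the JVM exit without shutting the pool down.
    private static final ExecutorService THREAD_POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });

    /** Runs c on the pool, waits at most `timeout`, and interrupts the worker on timeout. */
    static <T> T timedCall(Callable<T> c, long timeout, TimeUnit timeUnit)
            throws InterruptedException, ExecutionException, TimeoutException {
        FutureTask<T> task = new FutureTask<>(c);
        THREAD_POOL.execute(task);
        try {
            return task.get(timeout, timeUnit);
        } catch (TimeoutException e) {
            task.cancel(true); // sets the interrupt flag on the worker thread
            throw e;
        }
    }

    /** Returns true if the call timed out and the worker then noticed the interrupt and stopped. */
    static boolean demo() throws InterruptedException, ExecutionException {
        CountDownLatch workerStopped = new CountDownLatch(1);
        Callable<String> slowRender = () -> {
            // Stand-in for the patched parser: keeps "parsing" until interrupted.
            while (!Thread.interrupted()) {
                Thread.yield();
            }
            workerStopped.countDown();
            throw new RuntimeException("stop");
        };
        try {
            timedCall(slowRender, 100, TimeUnit.MILLISECONDS);
            return false; // not expected: slowRender never produces a value
        } catch (TimeoutException expected) {
            // Give the worker a moment to observe the interrupt.
            return workerStopped.await(5, TimeUnit.SECONDS);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("worker stopped after timeout: " + demo());
    }
}
```

Note that the `RuntimeException` thrown by the worker after cancellation is not reported to the caller; the caller only sees the `TimeoutException`.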


By the way, since I had my hands in the mud anyway, I decided to update both parboiled and pegdown, and I was impressed by the speed. In my simplistic tests I was using a timeout of 750ms for this file so that I could actually see it time out; after the upgrade I had to reduce that to 250ms, otherwise I don't see the timeout. Nice work on this :)

Cheers,
-Hernâni

Re: Terminate a running parser

mathias
Administrator
Hernâni,

cool, glad it works for you.
And thanks for sharing your setup!

Cheers,
Mathias

(Sent in reply to Hernâni's message of 11.02.2012, 16:34.)
