2014-09-01

ANTLR4: Error Handling (Part 8)

ANTLR4's error-handling mechanism is straightforward. Let's walk through it.

Altering and Redirecting Error Messages

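By default, ANTLR sends all errors to standard error, but we can change both their content and their destination by providing an implementation of the ANTLRErrorListener interface.
The interface has a syntaxError method that applies to both lexers and parsers. The method receives information about the location of the error along with the error message. It also receives a reference to the recognizer, so we can query it about the state of recognition.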

// Nested in the driver class that also holds main(); needs import org.antlr.v4.runtime.*;
public static class UnderlineListener extends BaseErrorListener {
    @Override
    public void syntaxError(Recognizer<?, ?> recognizer,
                            Object offendingSymbol,
                            int line, int charPositionInLine,
                            String msg,
                            RecognitionException e)
    {
        // Print the usual message, then echo the offending line with the token underlined.
        System.err.println("line "+line+":"+charPositionInLine+" "+msg);
        underlineError(recognizer,(Token)offendingSymbol,
                       line, charPositionInLine);
    }

    protected void underlineError(Recognizer<?, ?> recognizer,
                                  Token offendingToken, int line,
                                  int charPositionInLine) {
        // Recover the full input text from the parser's token stream.
        CommonTokenStream tokens =
            (CommonTokenStream)recognizer.getInputStream();
        String input = tokens.getTokenSource().getInputStream().toString();
        String[] lines = input.split("\n");
        String errorLine = lines[line - 1];   // line numbers are 1-based
        System.err.println(errorLine);
        // Pad up to the token's column, then print one ^ per character of the token.
        for (int i=0; i<charPositionInLine; i++) System.err.print(" ");
        int start = offendingToken.getStartIndex();
        int stop = offendingToken.getStopIndex();
        if ( start>=0 && stop>=0 ) {
            for (int i=start; i<=stop; i++) System.err.print("^");
        }
        System.err.println();
    }
}

public static void main(String[] args) throws Exception {
    ANTLRInputStream input = new ANTLRInputStream(System.in);
    SimpleLexer lexer = new SimpleLexer(input);
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    SimpleParser parser = new SimpleParser(tokens);
    parser.removeErrorListeners(); // remove ConsoleErrorListener
    parser.addErrorListener(new UnderlineListener());
    parser.prog();
}
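
The underlining arithmetic in underlineError can be tried out without a grammar. Below is a minimal sketch that pulls just that part into a pure function. UnderlineSketch and marker are hypothetical names made up for this illustration and are not part of ANTLR; the three parameters stand for the values the listener gets from charPositionInLine, getStartIndex() and getStopIndex().

```java
// Hypothetical helper, not part of ANTLR: builds the marker line that
// underlineError prints beneath the offending source line.
final class UnderlineSketch {
    /**
     * @param charPositionInLine zero-based column where the offending token starts
     * @param start absolute index of the token's first character (inclusive)
     * @param stop  absolute index of the token's last character (inclusive)
     */
    static String marker(int charPositionInLine, int start, int stop) {
        StringBuilder sb = new StringBuilder();
        // Pad with spaces up to the column of the token.
        for (int i = 0; i < charPositionInLine; i++) sb.append(' ');
        // One caret per character of the token; negative indexes mean "no text".
        if (start >= 0 && stop >= 0) {
            for (int i = start; i <= stop; i++) sb.append('^');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "class T {\n  int ;\n}";
        // Suppose the error is reported at ';' on line 2, column 6 (absolute index 16).
        System.out.println(input.split("\n")[2 - 1]);
        System.out.println(marker(6, 16, 16));
    }
}
```

Note that the padding uses the column within the line while the caret count uses absolute indexes, so a multi-character token gets one caret per character regardless of which line it is on.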

Automatic Error Recovery

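Error recovery is what lets the parser keep going after it finds a syntax error. In principle, the best error recovery would come from a hand-written recursive-descent parser. On a mismatched-token error, the parser attempts single-token insertion and single-token deletion to repair it. If neither works, the parser consumes tokens until it finds one that can reasonably follow the current rule, then returns and carries on until the whole input has been parsed.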
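
To make the three recovery steps concrete, here is a toy model in plain Java. It is only a sketch under strong assumptions: the "rule" is a fixed sequence of token names, the set of tokens that can follow is reduced to the next expected token, and RecoverySketch, parse and the message wording are invented for this illustration. It is not how the ANTLR runtime is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model, not ANTLR code: matches a fixed token sequence and repairs
// mismatches by deletion, insertion, or skipping ahead.
final class RecoverySketch {
    private static final String EOF = "<EOF>";

    private static String at(List<String> tokens, int i) {
        return i < tokens.size() ? tokens.get(i) : EOF;
    }

    /** Returns the error messages produced while matching input against expected. */
    static List<String> parse(List<String> expected, List<String> input) {
        List<String> errors = new ArrayList<>();
        int pos = 0; // index of the current input token
        for (int k = 0; k < expected.size(); k++) {
            String want = expected.get(k);
            String follow = at(expected, k + 1); // what may come after `want`
            String cur = at(input, pos);
            if (cur.equals(want)) {
                pos++;
            } else if (at(input, pos + 1).equals(want)) {
                // Single-token deletion: drop the extra token, then match.
                errors.add("extraneous input '" + cur + "' expecting '" + want + "'");
                pos += 2;
            } else if (cur.equals(follow)) {
                // Single-token insertion: act as if `want` had been there.
                errors.add("missing '" + want + "' at '" + cur + "'");
            } else {
                // Neither repair works: gobble tokens until one that can follow.
                errors.add("mismatched input '" + cur + "' expecting '" + want + "'");
                while (pos < input.size() && !input.get(pos).equals(follow)) pos++;
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        List<String> rule = Arrays.asList("ID", "=", "INT", ";");
        System.out.println(parse(rule, Arrays.asList("ID", "=", "=", "INT", ";")));
        System.out.println(parse(rule, Arrays.asList("ID", "INT", ";")));
        System.out.println(parse(rule, Arrays.asList("ID", "=", "+", "*", ";")));
    }
}
```

In every case the loop keeps going after recording the error, which is the point of recovery: one mistake yields one message and the rest of the input is still checked.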

Catching Failed Semantic Predicates

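This one is easy. Semantic predicates are written {…}?, and when a predicate fails you can attach a fail option to it to supply the error message or run an action.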

vec4:   '[' ints[4] ']' ;

ints[int max]
locals [int i=1]
:   INT ( ',' {$i++;} {$i<=$max}?<fail={"exceeded max "+$max}> INT )*
;

INT :   [0-9]+ ;
WS  :   [ \t\r\n]+ -> skip ;
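
To see what the counter and the predicate in ints do, here is a hand-written analog in plain Java. It is a sketch only: IntsSketch and countInts are hypothetical names, the input is split on commas rather than tokenized, and a failed check throws IllegalStateException, whereas a generated parser would report the failure through its normal error handling.

```java
// Hand-written analog of rule ints[int max]; a model, not generated ANTLR code.
final class IntsSketch {
    /** Returns how many integers were read; fails once the count would pass max. */
    static int countInts(String list, int max) {
        String[] parts = list.split(",");
        int i = 1;                               // locals [int i=1]
        Integer.parseInt(parts[0].trim());       // first INT
        for (int k = 1; k < parts.length; k++) {
            i++;                                 // {$i++;}
            if (!(i <= max)) {                   // {$i<=$max}?
                // Plays the role of <fail={"exceeded max "+$max}>.
                throw new IllegalStateException("exceeded max " + max);
            }
            Integer.parseInt(parts[k].trim());   // next INT
        }
        return i;
    }

    public static void main(String[] args) {
        System.out.println(countInts("1,2,3,4", 4));
        try {
            countInts("1,2,3,4,5", 4);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because the counter is incremented before the check, the fifth integer of a vec4 is rejected before it is consumed.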

Error Alternatives

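This one needs little explanation either: add alternatives that match common mistakes and report them through notifyErrorListeners.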

fcall
:   ID '(' expr ')'
|   ID '(' expr ')' ')' {notifyErrorListeners("Too many parentheses");}
|   ID '(' expr         {notifyErrorListeners("Missing closing ')'");}
;

Alternative Error-Handling Strategies

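Sometimes we do not want the default error-handling strategy. First, we may want to disable some of the inline error handling at runtime. Second, we may want to bail out at the first syntax error.
To explore the error-handling strategy, look at DefaultErrorStrategy, the default concrete implementation of the ANTLRErrorStrategy interface. This class implements all of ANTLR4's default error-handling behavior.
We can also see it being called from the code that ANTLR generates: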

_errHandler.reportError(this, re);
_errHandler.recover(this, re);

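Another option is BailErrorStrategy, which throws as soon as an error is encountered instead of trying to recover. We only need to tell the parser to use it. We can also plug in our own error handling by extending DefaultErrorStrategy.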

parser.setErrorHandler( new BailErrorStrategy());

public class BailErrorStrategy extends DefaultErrorStrategy {
/** Instead of recovering from exception {@code e}, re-throw it wrapped
 *  in a {@link ParseCancellationException} so it is not caught by the
 *  rule function catches.  Use {@link Exception#getCause()} to get the
 *  original {@link RecognitionException}.
 */
@Override
public void recover(Parser recognizer, RecognitionException e) {
    for (ParserRuleContext context = recognizer.getContext(); context != null; context = context.getParent()) {
        context.exception = e;
    }

    throw new ParseCancellationException(e);
}

/** Make sure we don't attempt to recover inline; if the parser
 *  successfully recovers, it won't throw an exception.
 */
@Override
public Token recoverInline(Parser recognizer)
    throws RecognitionException
{
    InputMismatchException e = new InputMismatchException(recognizer);
    for (ParserRuleContext context = recognizer.getContext(); context != null; context = context.getParent()) {
        context.exception = e;
    }

    throw new ParseCancellationException(e);
}

/** Make sure we don't attempt to recover from problems in subrules. */
@Override
public void sync(Parser recognizer) { }
}
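
Why wrap the exception at all? The Javadoc above says it is so the rule function catches do not see it. The following toy model tries to show that effect in plain Java. Everything in it is a stand-in invented for this sketch: SyntaxProblem plays the role of RecognitionException, Cancelled plays the role of ParseCancellationException, and inner/outer imitate the try/catch shape of generated rule functions.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the two strategies; the class names are stand-ins, not ANTLR types.
final class BailSketch {
    /** Plays the role of RecognitionException: what rule functions catch. */
    static class SyntaxProblem extends RuntimeException {
        SyntaxProblem(String msg) { super(msg); }
    }

    /** Plays the role of ParseCancellationException: not a SyntaxProblem. */
    static class Cancelled extends RuntimeException {
        Cancelled(Throwable cause) { super(cause); }
    }

    interface Strategy { void recover(SyntaxProblem e); }

    static final Strategy RESUME = e -> { /* pretend we resynchronized */ };
    static final Strategy BAIL = e -> { throw new Cancelled(e); };

    static void inner(String token, Strategy handler, List<String> log) {
        try {
            if (!token.equals("INT")) {
                throw new SyntaxProblem("mismatched input '" + token + "'");
            }
            log.add("inner matched");
        } catch (SyntaxProblem re) {
            log.add("inner reports: " + re.getMessage());
            handler.recover(re);
        }
    }

    static void outer(String token, Strategy handler, List<String> log) {
        try {
            inner(token, handler, log);
            log.add("outer continues");
        } catch (SyntaxProblem re) {
            // Never reached for Cancelled, because it is not a SyntaxProblem.
            log.add("outer reports: " + re.getMessage());
            handler.recover(re);
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        outer("x", RESUME, log);
        System.out.println(log);
        try {
            outer("x", BAIL, new ArrayList<>());
        } catch (Cancelled c) {
            System.out.println("bailed, cause: " + c.getCause().getMessage());
        }
    }
}
```

With the real library the calling code has the same shape: wrap the call to the start rule (parser.prog() in the earlier example) in a try block, catch ParseCancellationException, and use getCause() to get at the original RecognitionException.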