12 Commits

Author SHA1 Message Date
Michael Roth
b011f61931 json-lexer: make lexer error-recovery more deterministic
Currently when we reach an error state we effectively flush everything
fed to the lexer, which can put us in a state where we keep feeding
tokens into the parser at arbitrary offsets in the stream. This makes it
difficult for the lexer/tokenizer/parser to get back in sync when bad
input is made by the client.

With these changes we emit an error state/token up to the tokenizer as
soon as we reach an error state, and continue processing any data passed
in rather than bailing out. The reset token will be used to reset the
tokenizer and parser, such that they'll recover state as soon as the
lexer begins generating valid token sequences again.

We also map chr(192,193,245-255) to an error state here, since they are
invalid UTF-8 characters. QMP guest proxy/agent will use chr(255) to
force a flush/reset of previous input for reliable delivery of certain
events, so also we document that thoroughly here.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-06-07 13:52:11 -05:00
Michael Roth
bd3924a33a json-lexer: fix flushing logic to not always go to error state
Currently we flush the lexer by passing in a NULL character. This
generally forces the lexer to go to the corresponding TERMINAL() state
for whatever token type it is currently parsing, emits the token to the
parser, then puts the lexer back into IN_START state. However, since a
NULL character causes char_consumed to be 0, we always do a second pass
after this, which puts us in the IN_ERROR state. Fix this behavior by
adding a "flush" flag that tells the lexer not to do a more than 1
iteration.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-06-07 13:52:11 -05:00
Anthony Liguori
529a0ef5f3 json-lexer: reset the lexer state on an invalid token
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-06-07 13:52:11 -05:00
Anthony Liguori
325601b47b json-lexer: limit the maximum size of a given token
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-06-07 13:52:11 -05:00
Blue Swirl
33d05394a6 json-lexer: fix conflict with mingw32 ERROR definition
The name ERROR is too generic, it conflicts with mingw32 ERROR definition.

Replace ERROR with IN_ERROR.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2011-04-15 18:25:38 +00:00
Paolo Bonzini
28e91a681a remove unnecessary lookaheads
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2010-06-11 15:25:14 -03:00
Paolo Bonzini
f7c052747e implement optional lookahead in json lexer
Not requiring one extra character when lookahead is not necessary
ensures that clients behave properly even if they, for example,
send QMP requests without a trailing newline.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2010-06-11 15:25:14 -03:00
Luiz Capitulino
ecb50f5fef json-lexer: Drop 'buf'
QString supports adding a single char, 'buf' is unneeded.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2010-06-11 15:25:14 -03:00
Luiz Capitulino
1041ba7a14 json-lexer: Handle missing escapes
The JSON escape sequence "\/" and "\\" are valid and should be
handled.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2010-06-11 15:25:14 -03:00
Luiz Capitulino
03308f6c27 json-lexer: Initialize 'x' and 'y'
The 'lexer' variable is passed by the caller, it can contain anything
(eg. garbage).

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2010-06-11 15:25:14 -03:00
Roy Tam
2c0d4b36e7 json: fix PRId64 on Win32
OK we are fooled by the json lexer and parser. As we use %I64d to
print 'long long' variables in Win32, but lexer and parser only deal
with %lld but not %I64d, this patch add support for %I64d and solve
'info pci', 'powser_reset' and 'power_powerdown' assert failure in
Win32.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-02-10 12:47:58 -06:00
Anthony Liguori
5ab8558d9b Add a lexer for JSON
Our JSON parser is a three stage parser.  The first stage tokenizes the stream
into a set of lexical tokens.  Since the lexical grammar is regular, we can
use a finite state machine to model it.  The state machine will emit tokens
as they are identified.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-11-17 08:49:39 -06:00