qemu-e2k

Commit Graph

Author	SHA1	Message	Date
Markus Armbruster	62815d85ae	json: Redesign the callback to consume JSON values The classical way to structure parser and lexer is to have the client call the parser to get an abstract syntax tree, the parser call the lexer to get the next token, and the lexer call some function to get input characters. Another way to structure them would be to have the client feed characters to the lexer, the lexer feed tokens to the parser, and the parser feed abstract syntax trees to some callback provided by the client. This way is more easily integrated into an event loop that dispatches input characters as they arrive. Our JSON parser is kind of between the two. The lexer feeds tokens to a "streamer" instead of a real parser. The streamer accumulates tokens until it got the sequence of tokens that comprise a single JSON value (it counts curly braces and square brackets to decide). It feeds those token sequences to a callback provided by the client. The callback passes each token sequence to the parser, and gets back an abstract syntax tree. I figure it was done that way to make a straightforward recursive descent parser possible. "Get next token" becomes "pop the first token off the token sequence". Drawback: we need to store a complete token sequence. Each token eats 13 + input characters + malloc overhead bytes. Observations: 1. This is not the only way to use recursive descent. If we replaced "get next token" by a coroutine yield, we could do without a streamer. 2. The lexer reports errors by passing a JSON_ERROR token to the streamer. This communicates the offending input characters and their location, but no more. 3. The streamer reports errors by passing a null token sequence to the callback. The (already poor) lexical error information is thrown away. 4. Having the callback receive a token sequence duplicates the code to convert token sequence to abstract syntax tree in every callback. 5. Known bug: the streamer silently drops incomplete token sequences. This commit rectifies 4. by lifting the call of the parser from the callbacks into the streamer. Later commits will address 3. and 5. The lifting removes a bug from qjson.c's parse_json(): it passed a pointer to a non-null Error * in certain cases, as demonstrated by check-qjson.c. json_parser_parse() is now unused. It's a stupid wrapper around json_parser_parse_err(). Drop it, and rename json_parser_parse_err() to json_parser_parse(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20180823164025.12553-35-armbru@redhat.com>	2018-08-24 20:26:37 +02:00
Markus Armbruster	037f244088	json: Have lexer call streamer directly json_lexer_init() takes the function to process a token as an argument. It's always json_message_process_token(). Makes the code harder to understand for no actual gain. Drop the indirection. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20180823164025.12553-34-armbru@redhat.com>	2018-08-24 20:26:37 +02:00
Marc-André Lureau	7c1e1d5481	json: remove useless return value from lexer/parser The lexer always returns 0 when char feeding. Furthermore, none of the caller care about the return value. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180326150916.9602-10-marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180823164025.12553-32-armbru@redhat.com>	2018-08-24 20:26:37 +02:00
Paolo Bonzini	a942d8fa01	json-streamer: fix double-free on exiting during a parse Now that json-streamer tries not to leak tokens on incomplete parse, the tokens can be freed twice if QEMU destroys the json-streamer object during the parser->emit call. To fix this, create the new empty GQueue earlier, so that it is already in place when the old one is passed to parser->emit. Reported-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1467636059-12557-1-git-send-email-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-07-12 18:31:27 +02:00
Eric Blake	ba4dba5434	json-streamer: Don't leak tokens on incomplete parse Valgrind complained about a number of leaks in tests/check-qobject-json: ==12657== definitely lost: 17,247 bytes in 1,234 blocks All of which had the same root cause: on an incomplete parse, we were abandoning the token queue without cleaning up the allocated data within each queue element. Introduced in commit `95385fe`, when we switched from QList (which recursively frees contents) to g_queue (which does not). We don't yet require glib 2.32 with its g_queue_free_full(), so open-code it instead. CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1463608012-12760-1-git-send-email-eblake@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2016-06-30 15:24:36 +02:00
Peter Maydell	f2ad72b30e	qobject: Clean up includes Clean up includes so that osdep.h is included first and headers which it implies are not included manually. This commit was created with scripts/clean-includes. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1454089805-5470-12-git-send-email-peter.maydell@linaro.org	2016-02-04 17:41:30 +00:00
Markus Armbruster	df649835fe	qjson: Limit number of tokens in addition to total size Commit `29c75dd` "json-streamer: limit the maximum recursion depth and maximum token count" attempts to guard against excessive heap usage by limiting total token size (it says "token count", but that's a lie). Total token size is a rather imprecise predictor of heap usage: many small tokens use more space than few large tokens with the same input size, because there's a constant per-token overhead: 37 bytes on my system. Tighten this up: limit the token count to 2Mi. Chosen to roughly match the 64MiB total token size limit. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1448486613-17634-13-git-send-email-armbru@redhat.com>	2015-11-26 10:07:07 +01:00
Paolo Bonzini	9bada89711	qjson: surprise, allocating 6 QObjects per token is expensive Replace the contents of the tokens GQueue with a simple struct. This cuts the amount of memory allocated by tests/check-qjson from ~500MB to ~20MB, and the execution time from 600ms to 80ms on my laptop. Still a lot (some could be saved by using an intrusive list, such as QSIMPLEQ, instead of the GQueue), but the savings are already massive and the right thing to do would probably be to get rid of json-streamer completely. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1448300659-23559-5-git-send-email-pbonzini@redhat.com> [Straightforwardly rebased on my patches] Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2015-11-26 10:07:07 +01:00
Paolo Bonzini	95385fe9ac	qjson: store tokens in a GQueue Even though we still have the "streamer" concept, the tokens can now be deleted as they are read. While doing so convert from QList to GQueue, since the next step will make tokens not a QObject and we will have to do the conversion anyway. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1448300659-23559-4-git-send-email-pbonzini@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2015-11-26 10:07:07 +01:00
Paolo Bonzini	d2ca7c0b0d	qjson: replace QString in JSONLexer with GString JSONLexer only needs a simple resizable buffer. json-streamer.c can allocate memory for each token instead of relying on reference counting of QStrings. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1448300659-23559-2-git-send-email-pbonzini@redhat.com> [Straightforwardly rebased on my patches, checkpatch made happy] Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2015-11-26 09:31:22 +01:00
Markus Armbruster	c54616608a	qjson: Give each of the six structural chars its own token type Simplifies things, because we always check for a specific one. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1448486613-17634-6-git-send-email-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2015-11-26 09:22:54 +01:00
Markus Armbruster	0753113a26	qjson: Don't crash when input exceeds nesting limit We limit nesting depth and input size to defend against input triggering excessive heap or stack memory use (commit `29c75dd` json-streamer: limit the maximum recursion depth and maximum token count). However, when the nesting limit is exceeded, parser_context_peek_token()'s assertion fails. Broken in commit `65c0f1e` "json-parser: don't replicate tokens at each level of recursion". To reproduce stuff 1025 open braces or brackets into QMP. Fix by taking the error exit instead of the normal one. Reported-by: Eric Blake <eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1448486613-17634-3-git-send-email-armbru@redhat.com>	2015-11-26 09:18:04 +01:00
Markus Armbruster	4f2d31fbc0	qjson: Apply nesting limit more sanely The nesting limit from commit `29c75dd` "json-streamer: limit the maximum recursion depth and maximum token count" applies separately to braces and brackets. This makes no sense. Apply it to their sum, because that's actually a measure of recursion depth. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1448486613-17634-2-git-send-email-armbru@redhat.com>	2015-11-26 09:17:57 +01:00
Paolo Bonzini	a372823a14	build: move qobject files to qobject/ and libqemuutil.a Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-01-12 18:42:50 +01:00

14 Commits