What's more meta than using qapi to define qapi? :)
Convert QType into a full-fledged[*] builtin qapi enum type, so
that a subsequent patch can then use it as the discriminator
type of qapi alternate types. Fortunately, the judicious use of
'prefix' in the qapi definition avoids churn to the spelling of
the enum constants.
To avoid circular definitions, we have to flip the order of
inclusion between "qobject.h" vs. "qapi-types.h". Back in commit
28770e0, we had the latter include the former, so that we could
use 'QObject *' for our implementation of 'any'. But that usage
also works with only a forward declaration, whereas the
definition of QObject requires QType to be a complete type.
[*] The type has to be builtin, rather than declared in
qapi/common.json, because we want to use it for alternates even
when common.json is not included. But since it is the first
builtin enum type, we have to add special cases to qapi-types
and qapi-visit to only emit definitions once, even when two
qapi files are being compiled into the same binary (the way we
already handled builtin list types like 'intList'). We may
need to revisit how multiple qapi files share common types,
but that's a project for another day.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1449033659-25497-4-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The name QType matches our CODING_STYLE conventions for type names
in CamelCase. It also matches the fact that we are already naming
all the enum members with a prefix of QTYPE, not QTYPE_CODE. And
doing the rename will also make it easier for the next patch to use
QAPI for providing the enum, which also wants CamelCase type names.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1449033659-25497-3-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The QObject hierarchy is small enough, and unlikely to grow further
(since we only use it to map to JSON and already cover all JSON
types), that we can simplify things by not tracking a separate
vtable, but just inline the code element of the vtable QType
directly into QObject (renamed to type), and track a separate array
of destroy functions. We can drop qnull_destroy_obj() in the
process.
The remaining QObject subclasses must export their destructor.
This also has the nice benefit of moving the typename 'QType'
out of the way, so that the next patch can repurpose it for a
nicer name for 'qtype_code'.
The various objects are still the same size (so no change in cache
line pressure), but now have less indirection (although I didn't
bother benchmarking to see if there is a noticeable speedup, as
we don't have hard evidence that this was in a performance hotspot
in the first place).
A future patch could drop the refcnt size to 32 bits for a smaller
struct on 64-bit architectures, if desired (we have limits on the
largest JSON that we are willing to parse, and will probably never
need to take full advantage of a 64-bit refcnt).
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1449033659-25497-2-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Commit 29c75dd "json-streamer: limit the maximum recursion depth and
maximum token count" attempts to guard against excessive heap usage by
limiting total token size (it says "token count", but that's a lie).
Total token size is a rather imprecise predictor of heap usage: many
small tokens use more space than few large tokens with the same input
size, because there's a constant per-token overhead: 37 bytes on my
system.
Tighten this up: limit the token count to 2Mi. Chosen to roughly
match the 64MiB total token size limit.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1448486613-17634-13-git-send-email-armbru@redhat.com>
Replace the contents of the tokens GQueue with a simple struct. This cuts
the amount of memory allocated by tests/check-qjson from ~500MB to ~20MB,
and the execution time from 600ms to 80ms on my laptop. Still a lot (some
could be saved by using an intrusive list, such as QSIMPLEQ, instead of
the GQueue), but the savings are already massive and the right thing to
do would probably be to get rid of json-streamer completely.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1448300659-23559-5-git-send-email-pbonzini@redhat.com>
[Straightforwardly rebased on my patches]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Even though we still have the "streamer" concept, the tokens can now
be deleted as they are read. While doing so convert from QList to
GQueue, since the next step will make tokens not a QObject and we
will have to do the conversion anyway.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1448300659-23559-4-git-send-email-pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
We backtrack in parse_value(), even though JSON is LL(1) and thus can
be parsed by straightforward recursive descent. Do exactly that.
Based on an almost-correct patch from Paolo Bonzini.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1448486613-17634-10-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
JSONLexer only needs a simple resizable buffer. json-streamer.c
can allocate memory for each token instead of relying on reference
counting of QStrings.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1448300659-23559-2-git-send-email-pbonzini@redhat.com>
[Straightforwardly rebased on my patches, checkpatch made happy]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1448486613-17634-8-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1448486613-17634-7-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Simplifies things, because we always check for a specific one.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1448486613-17634-6-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1448486613-17634-5-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
We limit nesting depth and input size to defend against input
triggering excessive heap or stack memory use (commit 29c75dd
json-streamer: limit the maximum recursion depth and maximum token
count). However, when the nesting limit is exceeded,
parser_context_peek_token()'s assertion fails.
Broken in commit 65c0f1e "json-parser: don't replicate tokens at each
level of recursion".
To reproduce stuff 1025 open braces or brackets into QMP.
Fix by taking the error exit instead of the normal one.
Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1448486613-17634-3-git-send-email-armbru@redhat.com>
The nesting limit from commit 29c75dd "json-streamer: limit the
maximum recursion depth and maximum token count" applies separately to
braces and brackets. This makes no sense. Apply it to their sum,
because that's actually a measure of recursion depth.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1448486613-17634-2-git-send-email-armbru@redhat.com>
qobject_to_qstring() crashes on null, which is a trap for the unwary.
Return null instead, and simplify a few callers.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1444918537-18107-7-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
qobject_to_qlist() crashes on null, which is a trap for the unwary.
Return null instead.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1444918537-18107-6-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
qobject_to_qfloat() and qobject_to_qint() crash on null, which is a
trap for the unwary. Return null instead, and simplify a few callers.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1444918537-18107-5-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
qobject_to_qdict() crashes on null, which is a trap for the unwary.
Return null instead, and simplify a few callers.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1444918537-18107-4-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
qobject_to_qbool() crashes on null, which is a trap for the unwary.
Return null instead, and simplify a few callers.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1444918537-18107-3-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
In particular, don't include it into headers.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Remove it except for two things in qerror.h:
* Two #include to be cleaned up separately to avoid cluttering this
patch.
* The QERR_ macros. Mark as obsolete.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Now that qbool is fixed, let's fix getting and setting a bool
value to a qdict member to also use C99 bool rather than int.
I audited all callers to ensure that the changed return type
will not cause any changed semantics.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Acked-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
We require a C99 compiler, so let's use 'bool' instead of 'int'
when dealing with boolean values. There are few enough clients
to fix them all in one pass.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Acked-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
In the block layer functions that determine options for a child block
device, it's a common pattern to either copy options from the parent's
options or to set a default string if the option isn't explicitly set
yet for the child. Provide convenience functions so that it becomes a
one-liner for each option.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
This counts the entries in a flattened array in a QDict without
actually splitting the QDict into a QList.
bdrv_open_image() doesn't take a QList, but rather a QDict and a key
prefix string, so this is more convenient for block drivers which have a
dynamically sized list of child nodes (e.g. Quorum) and are to be
converted to using bdrv_open_image() as the standard interface for
opening child nodes.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
We document that in QMP, the client may send any json-value
for the optional "id" key, and then return that same value
on reply (both success and failures, insofar as the failure
happened after parsing the id). [Note that the output may
not be identical to the input, as whitespace may change and
since we may reorder keys within a json-object, but that this
still constitutes the same json-value]. However, we were not
handling the JSON literal null, which counts as a json-value
per RFC 7159.
Also, down the road, given the QAPI schema of {'*foo':'str'} or
{'*foo':'ComplexType'}, we could decide to allow the QMP client
to pass { "foo":null } instead of the current representation of
{ } where omitting the key is the only way to get at the default
NULL value. Such a change might be useful for argument
introspection (if a type in older qemu lacks 'foo' altogether,
then an explicit "foo":null probe will force an easily
distinguished error message for whether the optional "foo" key
is even understood in newer qemu). And if we add default values
to optional arguments, allowing an explicit null would be
required for getting a NULL value associated with an optional
string that has a non-null default. But all that can come at a
later day.
The 'check-unit' testsuite is enhanced to test that parsing
produces the same object as explicitly requesting a reference
to the special qnull object. In addition, I tested with:
$ ./x86_64-softmmu/qemu-system-x86_64 -qmp stdio -nodefaults
{"QMP": {"version": {"qemu": {"micro": 91, "minor": 2, "major": 2}, "package": ""}, "capabilities": []}}
{"execute":"qmp_capabilities","id":null}
{"return": {}, "id": null}
{"id":{"a":null,"b":[1,null]},"execute":"quit"}
{"return": {}, "id": {"a": null, "b": [1, null]}}
{"timestamp": {"seconds": 1427742379, "microseconds": 423128}, "event": "SHUTDOWN"}
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
I'm going to fix the JSON parser to recognize null. The obvious
representation of JSON null as (QObject *)NULL doesn't work, because
the parser already uses it as an error value. Perhaps we should
change it to free NULL for null, but that's more than I can do right
now. Create a special null QObject instead.
The existing QDict, QList, and QString all represent something that
is a pointer in C and could therefore be associated with NULL. But
right now, all three of these sub-types are always non-null once
created, so the new null sentinel object is intentionally unrelated
to them.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
QTYPE_NONE is a sentinel value. No QObject has this type code.
Document it properly.
Fix dump_qobject() to abort() on QTYPE_NONE, just like for any other
invalid type code.
Fix to_json() to abort() on all invalid type codes, not just
QTYPE_MAX.
Clean up Property member qtype's type: it's a qtype_code.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
For the pretty formatting, the functions converting QDicts and QLists to
JSON should not print a space after the comma separating objects,
because a newline will emitted immediately afterwards, making the
whitespace superfluous.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This made the lexer wait for a closing *double* quote.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
This function joins two QDicts by absorbing one into the other.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Just hardcode them in the callers
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Currently, qdict_array_split() only splits off entries with a key prefix
of "%u.", packing them into a new QDict. This patch makes it support
entries with the plain key "%u" as well, directly putting them into the
new QList without creating a QDict.
If there is both an entry with a key of "%u" and other entries with keys
prefixed "%u." (for the same index), the function simply terminates.
To do this, this patch also adds a static function which tests whether a
given QDict contains any keys with the given prefix. This is used to test
whether entries with a key prefixed "%u." do exist in the source QDict
without modifying it.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reversing qdict_array_split(), qdict_flatten() should flatten QLists as
well by interpreting them as QDicts where every entry's key is its
index.
This allows bringing QDicts with QLists from QMP commands to the same
form as they would be given as command-line options, thereby allowing
them to be parsed the same way.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This function splits a QDict consisting of entries prefixed by
incrementally enumerated indices into a QList of QDicts.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This is no longer needed, and is obsoleted by error_abort. Remove.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.15 (GNU/Linux)
iQIcBAABAgAGBQJSmMQmAAoJEH8JsnLIjy/WRmkP/1t239Y2Tjk+/wyR//W2YyD1
hiCnQdDb6SLamP6EnnA+jMik1vv9dRtZZ+eKYIIuEfRK8lv+AG1l0B+loBg6VTm2
3elpeo+MThEj/+ryMXlBky2NWS6t6k6ovEzJw11qaAvkJvXmetZV0I+wBzUj4DbT
oDAoqmWlsFkN5dpHRlUseqscab/FvAb+KCQqKNrk9441Bi3Le4bq5iyfaioObqU1
CtnF5facaPcKIpTdJdhk6Y2EWnhy1JlSggKroordqX1/3TE9NYpYG/7ejNmfqkLn
OoKPasZKF2iVAOBfwsODI1PFisKT48ceHEddSSqjcf64HitWeCYJAc8qSO7hggD6
TOzF01d5NLN+Ox+12xFUyeVI42N7Up2JP86oCXPtBC8T/ij6H81QNuKtzUmRmHeF
cUJCU0pm45lwDDWjueUjfyb483yqFsV31ddZTXEahN8leAeimR4MNz5iFzINsGN1
c2v/Wht5+chuDlgExFcOK1W3AUEZXblH0nv5jc8caEmX7l6MTPGYUBNOjPs9MhwZ
r3cDvUjIe9j6LR49N5iAekw+roW3cBD4NlobyvGXZNMvNtdVcGezvLOwFzPiT7Fk
rGZteBb1Bo68Wa0P9X7tdw7JHrUAP8yi/fE3MFVkVnofq644/DYeRdlznGqbo2Xw
0LWz6ViBQeLW1vm4LSMp
=uEdP
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'kwolf/tags/for-anthony' into staging
Block patches for 2.0 (flushing block-next)
# gpg: Signature made Fri 29 Nov 2013 08:43:18 AM PST using RSA key ID C88F2FD6
# gpg: Can't check signature: public key not found
# By Peter Lieven (17) and others
# Via Kevin Wolf
* kwolf/tags/for-anthony: (41 commits)
qemu-iotests: Add sample image and test for VMDK version 3
vmdk: Allow read only open of VMDK version 3
qemu-iotests: Filter out 'qemu-io> ' prompt
qemu-iotests: Filter qemu-io output in 025
block: Use BDRV_O_NO_BACKING where appropriate
qemu-iotests: Test snapshot mode
block: Enable BDRV_O_SNAPSHOT with driver-specific options
qemu-iotests: Make test case 030, 040 and 055 deterministic
qemu-iotest: Add pause_drive and resume_drive methods
blkdebug: add "remove_break" command
qemu-iotests: Drop local version of cancel_and_wait from 040
sheepdog: support user-defined redundancy option
sheepdog: refactor do_sd_create()
qdict: Optimise qdict_do_flatten()
qdict: Fix memory leak in qdict_do_flatten()
MAINTAINERS: add sheepdog development mailing list
COW: Extend checking allocated bits to beyond one sector
COW: Speed up writes
qapi: Change BlockDirtyInfo to list
block: per caller dirty bitmap
...
Message-id: 1385743555-27888-1-git-send-email-kwolf@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
gcc 4.8.2 reports this warning when extra warnings are enabled (-Wextra):
CC qobject/qerror.o
qobject/qerror.c: In function ‘qerror_from_info’:
qobject/qerror.c:53:5: error:
function might be possible candidate for ‘gnu_printf’ format attribute [-Werror=suggest-attribute=format]
qerr->err_msg = g_strdup_vprintf(fmt, *va);
^
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Nested QDicts used to be both entered recursively in order to move their
entries to the target QDict and also be moved themselves to the target
QDict like all other objects. This is harmless because for the top
level, qdict_do_flatten() will encounter the (now empty) QDict for a
second time and then delete it, but at the same time it's obviously
unnecessary overhead. Just delete nested QDicts directly after moving
all of their entries.
Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
qdict_flatten(): For each nested QDict with key x, all fields with key y
are moved to this QDict and their key is renamed to "x.y". This operation
is applied recursively for nested QDicts.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The discriminator for anonymous unions is the data type. This allows to
have a union type that allows both of these:
{ 'file': 'my_existing_block_device_id' }
{ 'file': { 'filename': '/tmp/mydisk.qcow2', 'read-only': true } }
Unions like this are specified in the schema with an empty dict as
discriminator. For this example you could take:
{ 'union': 'BlockRef',
'discriminator': {},
'data': { 'definition': 'BlockOptions',
'reference': 'str' } }
{ 'type': 'ExampleObject',
'data: { 'file': 'BlockRef' } }
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Currently our JSON parser assumes that numbers lacking a fractional
value are integers and attempts to store them as QInt/int64 values. This
breaks in the case where the number overflows/underflows int64 values (which
is still valid JSON)
Fix this by detecting such cases and using a QFloat to store the value
instead.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Known bugs in to_json():
* A start byte for a three-byte sequence followed by less than two
continuation bytes is split into one-byte sequences.
* Start bytes for sequences longer than three bytes get misinterpreted
as start bytes for three-byte sequences. Continuation bytes beyond
byte three become one-byte sequences.
This means all characters outside the BMP are decoded incorrectly.
* One-byte sequences with the MSB are put into the JSON string
verbatim when char is unsigned, producing invalid UTF-8. When char
is signed, they're replaced by "\\uFFFF" instead.
This includes \xFE, \xFF, and stray continuation bytes.
* Overlong sequences are happily accepted, unless screwed up by the
bugs above.
* Likewise, sequences encoding surrogate code points or noncharacters.
* Unlike other control characters, ASCII DEL is not escaped. Except
in overlong encodings.
My rewrite fixes them as follows:
* Malformed UTF-8 sequences are replaced.
Except the overlong encoding \xC0\x80 of U+0000 is still accepted.
Permits embedding NUL characters in C strings. This trick is known
as "Modified UTF-8".
* Sequences encoding code points beyond Unicode range are replaced.
* Sequences encoding code points beyond the BMP produce a surrogate
pair.
* Sequences encoding surrogate code points are replaced.
* Sequences encoding noncharacters are replaced.
* ASCII DEL is now always escaped.
The replacement character is U+FFFD.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>