qemu-e2k

OpenE2K/qemu-e2k

Fork 0

Commit Graph

Author	SHA1	Message	Date
Markus Armbruster	e2ec3f9768	qjson: to_json() case QTYPE_QSTRING is buggy, rewrite Known bugs in to_json(): * A start byte for a three-byte sequence followed by less than two continuation bytes is split into one-byte sequences. * Start bytes for sequences longer than three bytes get misinterpreted as start bytes for three-byte sequences. Continuation bytes beyond byte three become one-byte sequences. This means all characters outside the BMP are decoded incorrectly. * One-byte sequences with the MSB are put into the JSON string verbatim when char is unsigned, producing invalid UTF-8. When char is signed, they're replaced by "\\uFFFF" instead. This includes \xFE, \xFF, and stray continuation bytes. * Overlong sequences are happily accepted, unless screwed up by the bugs above. * Likewise, sequences encoding surrogate code points or noncharacters. * Unlike other control characters, ASCII DEL is not escaped. Except in overlong encodings. My rewrite fixes them as follows: * Malformed UTF-8 sequences are replaced. Except the overlong encoding \xC0\x80 of U+0000 is still accepted. Permits embedding NUL characters in C strings. This trick is known as "Modified UTF-8". * Sequences encoding code points beyond Unicode range are replaced. * Sequences encoding code points beyond the BMP produce a surrogate pair. * Sequences encoding surrogate code points are replaced. * Sequences encoding noncharacters are replaced. * ASCII DEL is now always escaped. The replacement character is U+FFFD. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-04-13 19:40:25 +00:00
Paolo Bonzini	a372823a14	build: move qobject files to qobject/ and libqemuutil.a Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-01-12 18:42:50 +01:00

Author

SHA1

Message

Date

Markus Armbruster

e2ec3f9768

qjson: to_json() case QTYPE_QSTRING is buggy, rewrite

Known bugs in to_json():

* A start byte for a three-byte sequence followed by less than two
  continuation bytes is split into one-byte sequences.

* Start bytes for sequences longer than three bytes get misinterpreted
  as start bytes for three-byte sequences.  Continuation bytes beyond
  byte three become one-byte sequences.

  This means all characters outside the BMP are decoded incorrectly.

* One-byte sequences with the MSB are put into the JSON string
  verbatim when char is unsigned, producing invalid UTF-8.  When char
  is signed, they're replaced by "\\uFFFF" instead.

  This includes \xFE, \xFF, and stray continuation bytes.

* Overlong sequences are happily accepted, unless screwed up by the
  bugs above.

* Likewise, sequences encoding surrogate code points or noncharacters.

* Unlike other control characters, ASCII DEL is not escaped.  Except
  in overlong encodings.

My rewrite fixes them as follows:

* Malformed UTF-8 sequences are replaced.

  Except the overlong encoding \xC0\x80 of U+0000 is still accepted.
  Permits embedding NUL characters in C strings.  This trick is known
  as "Modified UTF-8".

* Sequences encoding code points beyond Unicode range are replaced.

* Sequences encoding code points beyond the BMP produce a surrogate
  pair.

* Sequences encoding surrogate code points are replaced.

* Sequences encoding noncharacters are replaced.

* ASCII DEL is now always escaped.

The replacement character is U+FFFD.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

2013-04-13 19:40:25 +00:00

Paolo Bonzini

a372823a14

build: move qobject files to qobject/ and libqemuutil.a

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

2013-01-12 18:42:50 +01:00

2 Commits