qemu-e2k/scripts/qapi/parser.py

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

816 lines
30 KiB
Python
Raw Normal View History

# -*- coding: utf-8 -*-
#
# QAPI schema parser
#
# Copyright IBM, Corp. 2011
# Copyright (c) 2013-2019 Red Hat Inc.
#
# Authors:
# Anthony Liguori <aliguori@us.ibm.com>
# Markus Armbruster <armbru@redhat.com>
# Marc-André Lureau <marcandre.lureau@redhat.com>
# Kevin Wolf <kwolf@redhat.com>
#
# This work is licensed under the terms of the GNU GPL, version 2.
# See the COPYING file in the top-level directory.
from collections import OrderedDict
import os
import re
from typing import (
qapi/parser: add import cycle workaround Adding static types causes a cycle in the QAPI generator: [schema -> expr -> parser -> schema]. It exists because the QAPIDoc class needs the names of types defined by the schema module, but the schema module needs to import both expr.py/parser.py to do its actual parsing. Ultimately, the layering violation is that parser.py should not have any knowledge of specifics of the Schema. QAPIDoc performs double-duty here both as a parser *and* as a finalized object that is part of the schema. In this patch, add the offending type hints alongside the workaround to avoid the cycle becoming a problem at runtime. See https://mypy.readthedocs.io/en/latest/runtime_troubles.html#import-cycles for more information on this workaround technique. I see three ultimate resolutions here: (1) Just keep this patch and use the TYPE_CHECKING trick to eliminate the cycle which is only present during static analysis. (2) Don't bother to annotate connect_member() et al, give them 'object' or 'Any'. I don't particularly like this, because it diminishes the usefulness of type hints for documentation purposes. Still, it's an extremely quick fix. (3) Reimplement doc <--> definition correlation directly in schema.py, integrating doc fields directly into QAPISchemaMember and relieving the QAPIDoc class of the responsibility. Users of the information would instead visit the members first and retrieve their documentation instead of the inverse operation -- visiting the documentation and retrieving their members. My preference is (3), but in the short-term (1) is the easiest way to have my cake (strong type hints) and eat it too (Not have import cycles). Do (1) for now, but plan for (3). Signed-off-by: John Snow <jsnow@redhat.com> Message-Id: <20210930205716.1148693-9-jsnow@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2021-09-30 22:57:11 +02:00
TYPE_CHECKING,
Dict,
List,
Mapping,
Optional,
Set,
Union,
)
from .common import must_match
from .error import QAPISemError, QAPISourceError
qapi: Prefer explicit relative imports All of the QAPI include statements are changed to be package-aware, as explicit relative imports. A quirk of Python packages is that the name of the package exists only *outside* of the package. This means that to a module inside of the qapi folder, there is inherently no such thing as the "qapi" package. The reason these imports work is because the "qapi" package exists in the context of the caller -- the execution shim, where sys.path includes a directory that has a 'qapi' folder in it. When we write "from qapi import sibling", we are NOT referencing the folder 'qapi', but rather "any package named qapi in sys.path". If you should so happen to have a 'qapi' package in your path, it will use *that* package. When we write "from .sibling import foo", we always reference explicitly our sibling module; guaranteeing consistency in *where* we are importing these modules from. This can be useful when working with virtual environments and packages in development mode. In development mode, a package is installed as a series of symlinks that forwards to your same source files. The problem arises because code quality checkers will follow "import qapi.x" to the "installed" version instead of the sibling file and -- even though they are the same file -- they have different module paths, and this causes cyclic import problems, false positive type mismatch errors, and more. It can also be useful when dealing with hierarchical packages, e.g. if we allow qemu.core.qmp, qemu.qapi.parser, etc. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Cleber Rosa <crosa@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20201009161558.107041-6-jsnow@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09 18:15:27 +02:00
from .source import QAPISourceInfo
qapi/parser: add import cycle workaround Adding static types causes a cycle in the QAPI generator: [schema -> expr -> parser -> schema]. It exists because the QAPIDoc class needs the names of types defined by the schema module, but the schema module needs to import both expr.py/parser.py to do its actual parsing. Ultimately, the layering violation is that parser.py should not have any knowledge of specifics of the Schema. QAPIDoc performs double-duty here both as a parser *and* as a finalized object that is part of the schema. In this patch, add the offending type hints alongside the workaround to avoid the cycle becoming a problem at runtime. See https://mypy.readthedocs.io/en/latest/runtime_troubles.html#import-cycles for more information on this workaround technique. I see three ultimate resolutions here: (1) Just keep this patch and use the TYPE_CHECKING trick to eliminate the cycle which is only present during static analysis. (2) Don't bother to annotate connect_member() et al, give them 'object' or 'Any'. I don't particularly like this, because it diminishes the usefulness of type hints for documentation purposes. Still, it's an extremely quick fix. (3) Reimplement doc <--> definition correlation directly in schema.py, integrating doc fields directly into QAPISchemaMember and relieving the QAPIDoc class of the responsibility. Users of the information would instead visit the members first and retrieve their documentation instead of the inverse operation -- visiting the documentation and retrieving their members. My preference is (3), but in the short-term (1) is the easiest way to have my cake (strong type hints) and eat it too (Not have import cycles). Do (1) for now, but plan for (3). Signed-off-by: John Snow <jsnow@redhat.com> Message-Id: <20210930205716.1148693-9-jsnow@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2021-09-30 22:57:11 +02:00
if TYPE_CHECKING:
# pylint: disable=cyclic-import
# TODO: Remove cycle. [schema -> expr -> parser -> schema]
from .schema import QAPISchemaFeature, QAPISchemaMember
# Return value alias for get_expr().
_ExprValue = Union[List[object], Dict[str, object], str, bool]
class QAPIExpression(Dict[str, object]):
# pylint: disable=too-few-public-methods
def __init__(self,
data: Mapping[str, object],
info: QAPISourceInfo,
doc: Optional['QAPIDoc'] = None):
super().__init__(data)
self.info = info
self.doc: Optional['QAPIDoc'] = doc
class QAPIParseError(QAPISourceError):
"""Error class for all QAPI schema parsing errors."""
def __init__(self, parser: 'QAPISchemaParser', msg: str):
col = 1
for ch in parser.src[parser.line_pos:parser.pos]:
if ch == '\t':
col = (col + 7) % 8 + 1
else:
col += 1
super().__init__(parser.info, msg, col)
class QAPISchemaParser:
"""
Parse QAPI schema source.
Parse a JSON-esque schema file and process directives. See
qapi-code-gen.txt section "Schema Syntax" for the exact syntax.
Grammatical validation is handled later by `expr.check_exprs()`.
:param fname: Source file name.
:param previously_included:
The absolute names of previously included source files,
if being invoked from another parser.
:param incl_info:
`QAPISourceInfo` belonging to the parent module.
``None`` implies this is the root module.
:ivar exprs: Resulting parsed expressions.
:ivar docs: Resulting parsed documentation blocks.
:raise OSError: For problems reading the root schema document.
:raise QAPIError: For errors in the schema source.
"""
def __init__(self,
fname: str,
previously_included: Optional[Set[str]] = None,
incl_info: Optional[QAPISourceInfo] = None):
self._fname = fname
self._included = previously_included or set()
self._included.add(os.path.abspath(self._fname))
self.src = ''
# Lexer state (see `accept` for details):
self.info = QAPISourceInfo(self._fname, incl_info)
self.tok: Union[None, str] = None
self.pos = 0
self.cursor = 0
self.val: Optional[Union[bool, str]] = None
self.line_pos = 0
# Parser output:
self.exprs: List[QAPIExpression] = []
self.docs: List[QAPIDoc] = []
# Showtime!
self._parse()
def _parse(self) -> None:
"""
Parse the QAPI schema document.
:return: None. Results are stored in ``.exprs`` and ``.docs``.
"""
cur_doc = None
# May raise OSError; allow the caller to handle it.
with open(self._fname, 'r', encoding='utf-8') as fp:
self.src = fp.read()
if self.src == '' or self.src[-1] != '\n':
self.src += '\n'
# Prime the lexer:
self.accept()
# Parse until done:
while self.tok is not None:
info = self.info
if self.tok == '#':
self.reject_expr_doc(cur_doc)
for cur_doc in self.get_doc(info):
self.docs.append(cur_doc)
continue
expr = self.get_expr()
if not isinstance(expr, dict):
raise QAPISemError(
info, "top-level expression must be an object")
if 'include' in expr:
self.reject_expr_doc(cur_doc)
if len(expr) != 1:
raise QAPISemError(info, "invalid 'include' directive")
include = expr['include']
if not isinstance(include, str):
raise QAPISemError(info,
"value of 'include' must be a string")
incl_fname = os.path.join(os.path.dirname(self._fname),
include)
self._add_expr(OrderedDict({'include': incl_fname}), info)
exprs_include = self._include(include, info, incl_fname,
self._included)
if exprs_include:
self.exprs.extend(exprs_include.exprs)
self.docs.extend(exprs_include.docs)
elif "pragma" in expr:
self.reject_expr_doc(cur_doc)
if len(expr) != 1:
raise QAPISemError(info, "invalid 'pragma' directive")
pragma = expr['pragma']
if not isinstance(pragma, dict):
raise QAPISemError(
info, "value of 'pragma' must be an object")
for name, value in pragma.items():
self._pragma(name, value, info)
else:
if cur_doc and not cur_doc.symbol:
raise QAPISemError(
cur_doc.info, "definition documentation required")
self._add_expr(expr, info, cur_doc)
cur_doc = None
self.reject_expr_doc(cur_doc)
def _add_expr(self, expr: Mapping[str, object],
info: QAPISourceInfo,
doc: Optional['QAPIDoc'] = None) -> None:
self.exprs.append(QAPIExpression(expr, info, doc))
@staticmethod
def reject_expr_doc(doc: Optional['QAPIDoc']) -> None:
if doc and doc.symbol:
raise QAPISemError(
doc.info,
"documentation for '%s' is not followed by the definition"
% doc.symbol)
@staticmethod
def _include(include: str,
info: QAPISourceInfo,
incl_fname: str,
previously_included: Set[str]
) -> Optional['QAPISchemaParser']:
incl_abs_fname = os.path.abspath(incl_fname)
# catch inclusion cycle
inf: Optional[QAPISourceInfo] = info
while inf:
if incl_abs_fname == os.path.abspath(inf.fname):
raise QAPISemError(info, "inclusion loop for %s" % include)
inf = inf.parent
# skip multiple include of the same file
if incl_abs_fname in previously_included:
return None
qapi/parser: Don't try to handle file errors Fixes: f5d4361cda Fixes: 52a474180a Fixes: 46f49468c6 Remove the try/except block that handles file-opening errors in QAPISchemaParser.__init__() and add one each to QAPISchemaParser._include() and QAPISchema.__init__() respectively. This simultaneously fixes the typing of info.fname (f5d4361cda), A static typing violation in test-qapi (46f49468c6), and a regression of an error message (52a474180a). The short-ish version of what motivates this patch is: - It's hard to write a good error message in the init method, because we need to determine the context of our caller to do so. It's easier to just let the caller write the message. - We don't want to allow QAPISourceInfo(None, None, None) to exist. The typing introduced by commit f5d4361cda types the 'fname' field as (non-optional) str, which was premature until the removal of this construct. - Errors made using such an object are currently incorrect (since 52a474180a) - It's not technically a semantic error if we cannot open the schema. - There are various typing constraints that make mixing these two cases undesirable for a single special case. - test-qapi's code handling an fname of 'None' is now dead, drop it. Additionally, Not all QAPIError objects have an 'info' field (since 46f49468), so deleting this stanza corrects a typing oversight in test-qapi introduced by that commit. Other considerations: - open() is moved to a 'with' block to ensure file pointers are cleaned up deterministically. - Python 3.3 deprecated IOError and made it a synonym for OSError. Avoid the misleading perception these exception handlers are narrower than they really are. The long version: The error message here is incorrect (since commit 52a474180a): > python3 qapi-gen.py 'fake.json' qapi-gen.py: qapi-gen.py: can't read schema file 'fake.json': No such file or directory In pursuing it, we find that QAPISourceInfo has a special accommodation for when there's no filename. Meanwhile, the intent when QAPISourceInfo was typed (f5d4361cda) was non-optional 'str'. This usage was overlooked. To remove this, I'd want to avoid having a "fake" QAPISourceInfo object. I also don't want to explicitly begin accommodating QAPISourceInfo itself being None, because we actually want to eventually prove that this can never happen -- We don't want to confuse "The file isn't open yet" with "This error stems from a definition that wasn't defined in any file". (An earlier series tried to create a dummy info object, but it was tough to prove in review that it worked correctly without creating new regressions. This patch avoids that distraction. We would like to first prove that we never raise QAPISemError for any built-in object before we add "special" info objects. We aren't ready to do that yet.) So, which way out of the labyrinth? Here's one way: Don't try to handle errors at a level with "mixed" semantic contexts; i.e. don't mix inclusion errors (should report a source line where the include was triggered) and command line errors (where we specified a file we couldn't read). Remove the error handling from the initializer of the parser. Pythonic! Now it's the caller's job to figure out what to do about it. Handle the error in QAPISchemaParser._include() instead, where we can write a targeted error message where we are guaranteed to have an 'info' context to report with. The root level error can similarly move to QAPISchema.__init__(), where we know we'll never have an info context to report with, so we use a more abstract error type. Now the error looks sensible again: > python3 qapi-gen.py 'fake.json' qapi-gen.py: can't read schema file 'fake.json': No such file or directory With these error cases separated, QAPISourceInfo can be solidified as never having placeholder arguments that violate our desired types. Clean up test-qapi along similar lines. Signed-off-by: John Snow <jsnow@redhat.com> Message-Id: <20210519183951.3946870-2-jsnow@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2021-05-19 20:39:37 +02:00
try:
return QAPISchemaParser(incl_fname, previously_included, info)
except OSError as err:
raise QAPISemError(
info,
f"can't read include file '{incl_fname}': {err.strerror}"
) from err
@staticmethod
def _pragma(name: str, value: object, info: QAPISourceInfo) -> None:
def check_list_str(name: str, value: object) -> List[str]:
if (not isinstance(value, list) or
any(not isinstance(elt, str) for elt in value)):
raise QAPISemError(
info,
"pragma %s must be a list of strings" % name)
return value
pragma = info.pragma
if name == 'doc-required':
if not isinstance(value, bool):
raise QAPISemError(info,
"pragma 'doc-required' must be boolean")
pragma.doc_required = value
elif name == 'command-name-exceptions':
pragma.command_name_exceptions = check_list_str(name, value)
elif name == 'command-returns-exceptions':
pragma.command_returns_exceptions = check_list_str(name, value)
elif name == 'member-name-exceptions':
pragma.member_name_exceptions = check_list_str(name, value)
else:
raise QAPISemError(info, "unknown pragma '%s'" % name)
def accept(self, skip_comment: bool = True) -> None:
"""
Read and store the next token.
:param skip_comment:
When false, return COMMENT tokens ("#").
This is used when reading documentation blocks.
:return:
None. Several instance attributes are updated instead:
- ``.tok`` represents the token type. See below for values.
- ``.info`` describes the token's source location.
- ``.val`` is the token's value, if any. See below.
- ``.pos`` is the buffer index of the first character of
the token.
* Single-character tokens:
These are "{", "}", ":", ",", "[", and "]".
``.tok`` holds the single character and ``.val`` is None.
* Multi-character tokens:
* COMMENT:
This token is not normally returned by the lexer, but it can
be when ``skip_comment`` is False. ``.tok`` is "#", and
``.val`` is a string including all chars until end-of-line,
including the "#" itself.
* STRING:
``.tok`` is "'", the single quote. ``.val`` contains the
string, excluding the surrounding quotes.
* TRUE and FALSE:
``.tok`` is either "t" or "f", ``.val`` will be the
corresponding bool value.
* EOF:
``.tok`` and ``.val`` will both be None at EOF.
"""
while True:
self.tok = self.src[self.cursor]
self.pos = self.cursor
self.cursor += 1
self.val = None
if self.tok == '#':
if self.src[self.cursor] == '#':
# Start of doc comment
skip_comment = False
self.cursor = self.src.find('\n', self.cursor)
if not skip_comment:
self.val = self.src[self.pos:self.cursor]
return
elif self.tok in '{}:,[]':
return
elif self.tok == "'":
# Note: we accept only printable ASCII
string = ''
esc = False
while True:
ch = self.src[self.cursor]
self.cursor += 1
if ch == '\n':
raise QAPIParseError(self, "missing terminating \"'\"")
if esc:
# Note: we recognize only \\ because we have
# no use for funny characters in strings
if ch != '\\':
raise QAPIParseError(self,
"unknown escape \\%s" % ch)
esc = False
elif ch == '\\':
esc = True
continue
elif ch == "'":
self.val = string
return
if ord(ch) < 32 or ord(ch) >= 127:
raise QAPIParseError(
self, "funny character in string")
string += ch
elif self.src.startswith('true', self.pos):
self.val = True
self.cursor += 3
return
elif self.src.startswith('false', self.pos):
self.val = False
self.cursor += 4
return
elif self.tok == '\n':
if self.cursor == len(self.src):
self.tok = None
return
self.info = self.info.next_line()
self.line_pos = self.cursor
elif not self.tok.isspace():
# Show up to next structural, whitespace or quote
# character
match = must_match('[^[\\]{}:,\\s\'"]+',
self.src[self.cursor-1:])
raise QAPIParseError(self, "stray '%s'" % match.group(0))
def get_members(self) -> Dict[str, object]:
expr: Dict[str, object] = OrderedDict()
if self.tok == '}':
self.accept()
return expr
if self.tok != "'":
raise QAPIParseError(self, "expected string or '}'")
while True:
key = self.val
assert isinstance(key, str) # Guaranteed by tok == "'"
self.accept()
if self.tok != ':':
raise QAPIParseError(self, "expected ':'")
self.accept()
if key in expr:
raise QAPIParseError(self, "duplicate key '%s'" % key)
expr[key] = self.get_expr()
if self.tok == '}':
self.accept()
return expr
if self.tok != ',':
raise QAPIParseError(self, "expected ',' or '}'")
self.accept()
if self.tok != "'":
raise QAPIParseError(self, "expected string")
def get_values(self) -> List[object]:
expr: List[object] = []
if self.tok == ']':
self.accept()
return expr
if self.tok not in tuple("{['tf"):
raise QAPIParseError(
self, "expected '{', '[', ']', string, or boolean")
while True:
expr.append(self.get_expr())
if self.tok == ']':
self.accept()
return expr
if self.tok != ',':
raise QAPIParseError(self, "expected ',' or ']'")
self.accept()
def get_expr(self) -> _ExprValue:
expr: _ExprValue
if self.tok == '{':
self.accept()
expr = self.get_members()
elif self.tok == '[':
self.accept()
expr = self.get_values()
elif self.tok in tuple("'tf"):
assert isinstance(self.val, (str, bool))
expr = self.val
self.accept()
else:
raise QAPIParseError(
self, "expected '{', '[', string, or boolean")
return expr
def get_doc(self, info: QAPISourceInfo) -> List['QAPIDoc']:
if self.val != '##':
raise QAPIParseError(
self, "junk after '##' at start of documentation comment")
docs = []
cur_doc = QAPIDoc(self, info)
self.accept(False)
while self.tok == '#':
assert isinstance(self.val, str)
if self.val.startswith('##'):
# End of doc comment
if self.val != '##':
raise QAPIParseError(
self,
"junk after '##' at end of documentation comment")
cur_doc.end_comment()
docs.append(cur_doc)
self.accept()
return docs
if self.val.startswith('# ='):
if cur_doc.symbol:
raise QAPIParseError(
self,
"unexpected '=' markup in definition documentation")
if cur_doc.body.text:
cur_doc.end_comment()
docs.append(cur_doc)
cur_doc = QAPIDoc(self, info)
cur_doc.append(self.val)
self.accept(False)
raise QAPIParseError(self, "documentation comment must end with '##'")
class QAPIDoc:
"""
A documentation comment block, either definition or free-form
Definition documentation blocks consist of
* a body section: one line naming the definition, followed by an
overview (any number of lines)
* argument sections: a description of each argument (for commands
and events) or member (for structs, unions and alternates)
* features sections: a description of each feature flag
* additional (non-argument) sections, possibly tagged
Free-form documentation blocks consist only of a body section.
"""
class Section:
# pylint: disable=too-few-public-methods
def __init__(self, parser: QAPISchemaParser,
name: Optional[str] = None, indent: int = 0):
# parser, for error messages about indentation
self._parser = parser
# optional section name (argument/member or section name)
self.name = name
self.text = ''
# the expected indent level of the text of this section
self._indent = indent
def append(self, line: str) -> None:
# Strip leading spaces corresponding to the expected indent level
# Blank lines are always OK.
if line:
indent = must_match(r'\s*', line).end()
if indent < self._indent:
raise QAPIParseError(
self._parser,
"unexpected de-indent (expected at least %d spaces)" %
self._indent)
line = line[self._indent:]
self.text += line.rstrip() + '\n'
class ArgSection(Section):
def __init__(self, parser: QAPISchemaParser,
name: str, indent: int = 0):
super().__init__(parser, name, indent)
qapi/parser: add import cycle workaround Adding static types causes a cycle in the QAPI generator: [schema -> expr -> parser -> schema]. It exists because the QAPIDoc class needs the names of types defined by the schema module, but the schema module needs to import both expr.py/parser.py to do its actual parsing. Ultimately, the layering violation is that parser.py should not have any knowledge of specifics of the Schema. QAPIDoc performs double-duty here both as a parser *and* as a finalized object that is part of the schema. In this patch, add the offending type hints alongside the workaround to avoid the cycle becoming a problem at runtime. See https://mypy.readthedocs.io/en/latest/runtime_troubles.html#import-cycles for more information on this workaround technique. I see three ultimate resolutions here: (1) Just keep this patch and use the TYPE_CHECKING trick to eliminate the cycle which is only present during static analysis. (2) Don't bother to annotate connect_member() et al, give them 'object' or 'Any'. I don't particularly like this, because it diminishes the usefulness of type hints for documentation purposes. Still, it's an extremely quick fix. (3) Reimplement doc <--> definition correlation directly in schema.py, integrating doc fields directly into QAPISchemaMember and relieving the QAPIDoc class of the responsibility. Users of the information would instead visit the members first and retrieve their documentation instead of the inverse operation -- visiting the documentation and retrieving their members. My preference is (3), but in the short-term (1) is the easiest way to have my cake (strong type hints) and eat it too (Not have import cycles). Do (1) for now, but plan for (3). Signed-off-by: John Snow <jsnow@redhat.com> Message-Id: <20210930205716.1148693-9-jsnow@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2021-09-30 22:57:11 +02:00
self.member: Optional['QAPISchemaMember'] = None
qapi/parser: add import cycle workaround Adding static types causes a cycle in the QAPI generator: [schema -> expr -> parser -> schema]. It exists because the QAPIDoc class needs the names of types defined by the schema module, but the schema module needs to import both expr.py/parser.py to do its actual parsing. Ultimately, the layering violation is that parser.py should not have any knowledge of specifics of the Schema. QAPIDoc performs double-duty here both as a parser *and* as a finalized object that is part of the schema. In this patch, add the offending type hints alongside the workaround to avoid the cycle becoming a problem at runtime. See https://mypy.readthedocs.io/en/latest/runtime_troubles.html#import-cycles for more information on this workaround technique. I see three ultimate resolutions here: (1) Just keep this patch and use the TYPE_CHECKING trick to eliminate the cycle which is only present during static analysis. (2) Don't bother to annotate connect_member() et al, give them 'object' or 'Any'. I don't particularly like this, because it diminishes the usefulness of type hints for documentation purposes. Still, it's an extremely quick fix. (3) Reimplement doc <--> definition correlation directly in schema.py, integrating doc fields directly into QAPISchemaMember and relieving the QAPIDoc class of the responsibility. Users of the information would instead visit the members first and retrieve their documentation instead of the inverse operation -- visiting the documentation and retrieving their members. My preference is (3), but in the short-term (1) is the easiest way to have my cake (strong type hints) and eat it too (Not have import cycles). Do (1) for now, but plan for (3). Signed-off-by: John Snow <jsnow@redhat.com> Message-Id: <20210930205716.1148693-9-jsnow@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2021-09-30 22:57:11 +02:00
def connect(self, member: 'QAPISchemaMember') -> None:
self.member = member
class NullSection(Section):
"""
Immutable dummy section for use at the end of a doc block.
"""
# pylint: disable=too-few-public-methods
def append(self, line: str) -> None:
assert False, "Text appended after end_comment() called."
def __init__(self, parser: QAPISchemaParser, info: QAPISourceInfo):
# self._parser is used to report errors with QAPIParseError. The
# resulting error position depends on the state of the parser.
# It happens to be the beginning of the comment. More or less
# servicable, but action at a distance.
self._parser = parser
self.info = info
self.symbol: Optional[str] = None
self.body = QAPIDoc.Section(parser)
# dicts mapping parameter/feature names to their ArgSection
self.args: Dict[str, QAPIDoc.ArgSection] = OrderedDict()
self.features: Dict[str, QAPIDoc.ArgSection] = OrderedDict()
self.sections: List[QAPIDoc.Section] = []
# the current section
self._section = self.body
self._append_line = self._append_body_line
def has_section(self, name: str) -> bool:
"""Return True if we have a section with this name."""
for i in self.sections:
if i.name == name:
return True
return False
def append(self, line: str) -> None:
"""
Parse a comment line and add it to the documentation.
The way that the line is dealt with depends on which part of
the documentation we're parsing right now:
* The body section: ._append_line is ._append_body_line
* An argument section: ._append_line is ._append_args_line
* A features section: ._append_line is ._append_features_line
* An additional section: ._append_line is ._append_various_line
"""
line = line[1:]
if not line:
self._append_freeform(line)
return
if line[0] != ' ':
raise QAPIParseError(self._parser, "missing space after #")
line = line[1:]
self._append_line(line)
def end_comment(self) -> None:
self._switch_section(QAPIDoc.NullSection(self._parser))
@staticmethod
def _is_section_tag(name: str) -> bool:
return name in ('Returns:', 'Since:',
# those are often singular or plural
'Note:', 'Notes:',
'Example:', 'Examples:',
'TODO:')
def _append_body_line(self, line: str) -> None:
"""
Process a line of documentation text in the body section.
If this a symbol line and it is the section's first line, this
is a definition documentation block for that symbol.
If it's a definition documentation block, another symbol line
begins the argument section for the argument named by it, and
a section tag begins an additional section. Start that
section and append the line to it.
Else, append the line to the current section.
"""
name = line.split(' ', 1)[0]
# FIXME not nice: things like '# @foo:' and '# @foo: ' aren't
# recognized, and get silently treated as ordinary text
if not self.symbol and not self.body.text and line.startswith('@'):
if not line.endswith(':'):
raise QAPIParseError(self._parser, "line should end with ':'")
self.symbol = line[1:-1]
# Invalid names are not checked here, but the name provided MUST
# match the following definition, which *is* validated in expr.py.
if not self.symbol:
raise QAPIParseError(
self._parser, "name required after '@'")
elif self.symbol:
# This is a definition documentation block
if name.startswith('@') and name.endswith(':'):
self._append_line = self._append_args_line
self._append_args_line(line)
elif line == 'Features:':
self._append_line = self._append_features_line
elif self._is_section_tag(name):
self._append_line = self._append_various_line
self._append_various_line(line)
else:
scripts/qapi: Move doc-comment whitespace stripping to doc.py As we accumulate lines from doc comments when parsing the JSON, the QAPIDoc class generally strips leading and trailing whitespace using line.strip() when it calls _append_freeform(). This is fine for Texinfo, but for rST leading whitespace is significant. We'd like to move to having the text in doc comments be rST format rather than a custom syntax, so move the removal of leading whitespace from the QAPIDoc class to the texinfo-specific processing code in texi_format() in qapi/doc.py. (Trailing whitespace will always be stripped by the rstrip() in Section::append regardless.) In a followup commit we will make the whitespace in the lines of doc comment sections more consistently follow the input source. There is no change to the generated .texi files before and after this commit. Because the qapi-schema test checks the exact values of the documentation comments against a reference, we need to update that reference to match the new whitespace. In the first four places this is now correctly checking that we did put in the amount of whitespace to pass a rST-formatted list to the backend; in the last two places the extra whitespace is 'wrong' and will go away again in the following commit. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20200925162316.21205-5-peter.maydell@linaro.org> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-09-25 18:22:59 +02:00
self._append_freeform(line)
else:
# This is a free-form documentation block
scripts/qapi: Move doc-comment whitespace stripping to doc.py As we accumulate lines from doc comments when parsing the JSON, the QAPIDoc class generally strips leading and trailing whitespace using line.strip() when it calls _append_freeform(). This is fine for Texinfo, but for rST leading whitespace is significant. We'd like to move to having the text in doc comments be rST format rather than a custom syntax, so move the removal of leading whitespace from the QAPIDoc class to the texinfo-specific processing code in texi_format() in qapi/doc.py. (Trailing whitespace will always be stripped by the rstrip() in Section::append regardless.) In a followup commit we will make the whitespace in the lines of doc comment sections more consistently follow the input source. There is no change to the generated .texi files before and after this commit. Because the qapi-schema test checks the exact values of the documentation comments against a reference, we need to update that reference to match the new whitespace. In the first four places this is now correctly checking that we did put in the amount of whitespace to pass a rST-formatted list to the backend; in the last two places the extra whitespace is 'wrong' and will go away again in the following commit. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20200925162316.21205-5-peter.maydell@linaro.org> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-09-25 18:22:59 +02:00
self._append_freeform(line)
def _append_args_line(self, line: str) -> None:
"""
Process a line of documentation text in an argument section.
A symbol line begins the next argument section, a section tag
section or a non-indented line after a blank line begins an
additional section. Start that section and append the line to
it.
Else, append the line to the current section.
"""
name = line.split(' ', 1)[0]
if name.startswith('@') and name.endswith(':'):
# If line is "@arg: first line of description", find
# the index of 'f', which is the indent we expect for any
# following lines. We then remove the leading "@arg:"
# from line and replace it with spaces so that 'f' has the
# same index as it did in the original line and can be
# handled the same way we will handle following lines.
indent = must_match(r'@\S*:\s*', line).end()
line = line[indent:]
if not line:
# Line was just the "@arg:" header; following lines
# are not indented
indent = 0
else:
line = ' ' * indent + line
self._start_args_section(name[1:-1], indent)
elif self._is_section_tag(name):
self._append_line = self._append_various_line
self._append_various_line(line)
return
elif (self._section.text.endswith('\n\n')
and line and not line[0].isspace()):
if line == 'Features:':
self._append_line = self._append_features_line
else:
self._start_section()
self._append_line = self._append_various_line
self._append_various_line(line)
return
scripts/qapi: Move doc-comment whitespace stripping to doc.py As we accumulate lines from doc comments when parsing the JSON, the QAPIDoc class generally strips leading and trailing whitespace using line.strip() when it calls _append_freeform(). This is fine for Texinfo, but for rST leading whitespace is significant. We'd like to move to having the text in doc comments be rST format rather than a custom syntax, so move the removal of leading whitespace from the QAPIDoc class to the texinfo-specific processing code in texi_format() in qapi/doc.py. (Trailing whitespace will always be stripped by the rstrip() in Section::append regardless.) In a followup commit we will make the whitespace in the lines of doc comment sections more consistently follow the input source. There is no change to the generated .texi files before and after this commit. Because the qapi-schema test checks the exact values of the documentation comments against a reference, we need to update that reference to match the new whitespace. In the first four places this is now correctly checking that we did put in the amount of whitespace to pass a rST-formatted list to the backend; in the last two places the extra whitespace is 'wrong' and will go away again in the following commit. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20200925162316.21205-5-peter.maydell@linaro.org> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-09-25 18:22:59 +02:00
self._append_freeform(line)
def _append_features_line(self, line: str) -> None:
name = line.split(' ', 1)[0]
if name.startswith('@') and name.endswith(':'):
# If line is "@arg: first line of description", find
# the index of 'f', which is the indent we expect for any
# following lines. We then remove the leading "@arg:"
# from line and replace it with spaces so that 'f' has the
# same index as it did in the original line and can be
# handled the same way we will handle following lines.
indent = must_match(r'@\S*:\s*', line).end()
line = line[indent:]
if not line:
# Line was just the "@arg:" header; following lines
# are not indented
indent = 0
else:
line = ' ' * indent + line
self._start_features_section(name[1:-1], indent)
elif self._is_section_tag(name):
self._append_line = self._append_various_line
self._append_various_line(line)
return
elif (self._section.text.endswith('\n\n')
and line and not line[0].isspace()):
self._start_section()
self._append_line = self._append_various_line
self._append_various_line(line)
return
scripts/qapi: Move doc-comment whitespace stripping to doc.py As we accumulate lines from doc comments when parsing the JSON, the QAPIDoc class generally strips leading and trailing whitespace using line.strip() when it calls _append_freeform(). This is fine for Texinfo, but for rST leading whitespace is significant. We'd like to move to having the text in doc comments be rST format rather than a custom syntax, so move the removal of leading whitespace from the QAPIDoc class to the texinfo-specific processing code in texi_format() in qapi/doc.py. (Trailing whitespace will always be stripped by the rstrip() in Section::append regardless.) In a followup commit we will make the whitespace in the lines of doc comment sections more consistently follow the input source. There is no change to the generated .texi files before and after this commit. Because the qapi-schema test checks the exact values of the documentation comments against a reference, we need to update that reference to match the new whitespace. In the first four places this is now correctly checking that we did put in the amount of whitespace to pass a rST-formatted list to the backend; in the last two places the extra whitespace is 'wrong' and will go away again in the following commit. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20200925162316.21205-5-peter.maydell@linaro.org> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-09-25 18:22:59 +02:00
self._append_freeform(line)
def _append_various_line(self, line: str) -> None:
"""
Process a line of documentation text in an additional section.
A symbol line is an error.
A section tag begins an additional section. Start that
section and append the line to it.
Else, append the line to the current section.
"""
name = line.split(' ', 1)[0]
if name.startswith('@') and name.endswith(':'):
raise QAPIParseError(self._parser,
"'%s' can't follow '%s' section"
% (name, self.sections[0].name))
if self._is_section_tag(name):
# If line is "Section: first line of description", find
# the index of 'f', which is the indent we expect for any
# following lines. We then remove the leading "Section:"
# from line and replace it with spaces so that 'f' has the
# same index as it did in the original line and can be
# handled the same way we will handle following lines.
indent = must_match(r'\S*:\s*', line).end()
line = line[indent:]
if not line:
# Line was just the "Section:" header; following lines
# are not indented
indent = 0
else:
line = ' ' * indent + line
self._start_section(name[:-1], indent)
self._append_freeform(line)
def _start_symbol_section(
self,
symbols_dict: Dict[str, 'QAPIDoc.ArgSection'],
name: str,
indent: int) -> None:
# FIXME invalid names other than the empty string aren't flagged
if not name:
raise QAPIParseError(self._parser, "invalid parameter name")
if name in symbols_dict:
raise QAPIParseError(self._parser,
"'%s' parameter name duplicated" % name)
assert not self.sections
new_section = QAPIDoc.ArgSection(self._parser, name, indent)
self._switch_section(new_section)
symbols_dict[name] = new_section
def _start_args_section(self, name: str, indent: int) -> None:
self._start_symbol_section(self.args, name, indent)
def _start_features_section(self, name: str, indent: int) -> None:
self._start_symbol_section(self.features, name, indent)
def _start_section(self, name: Optional[str] = None,
indent: int = 0) -> None:
if name in ('Returns', 'Since') and self.has_section(name):
raise QAPIParseError(self._parser,
"duplicated '%s' section" % name)
new_section = QAPIDoc.Section(self._parser, name, indent)
self._switch_section(new_section)
self.sections.append(new_section)
def _switch_section(self, new_section: 'QAPIDoc.Section') -> None:
text = self._section.text = self._section.text.strip()
# Only the 'body' section is allowed to have an empty body.
# All other sections, including anonymous ones, must have text.
if self._section != self.body and not text:
# We do not create anonymous sections unless there is
# something to put in them; this is a parser bug.
assert self._section.name
raise QAPIParseError(
self._parser,
"empty doc section '%s'" % self._section.name)
self._section = new_section
def _append_freeform(self, line: str) -> None:
match = re.match(r'(@\S+:)', line)
if match:
raise QAPIParseError(self._parser,
"'%s' not allowed in free-form documentation"
% match.group(1))
self._section.append(line)
qapi/parser: add import cycle workaround Adding static types causes a cycle in the QAPI generator: [schema -> expr -> parser -> schema]. It exists because the QAPIDoc class needs the names of types defined by the schema module, but the schema module needs to import both expr.py/parser.py to do its actual parsing. Ultimately, the layering violation is that parser.py should not have any knowledge of specifics of the Schema. QAPIDoc performs double-duty here both as a parser *and* as a finalized object that is part of the schema. In this patch, add the offending type hints alongside the workaround to avoid the cycle becoming a problem at runtime. See https://mypy.readthedocs.io/en/latest/runtime_troubles.html#import-cycles for more information on this workaround technique. I see three ultimate resolutions here: (1) Just keep this patch and use the TYPE_CHECKING trick to eliminate the cycle which is only present during static analysis. (2) Don't bother to annotate connect_member() et al, give them 'object' or 'Any'. I don't particularly like this, because it diminishes the usefulness of type hints for documentation purposes. Still, it's an extremely quick fix. (3) Reimplement doc <--> definition correlation directly in schema.py, integrating doc fields directly into QAPISchemaMember and relieving the QAPIDoc class of the responsibility. Users of the information would instead visit the members first and retrieve their documentation instead of the inverse operation -- visiting the documentation and retrieving their members. My preference is (3), but in the short-term (1) is the easiest way to have my cake (strong type hints) and eat it too (Not have import cycles). Do (1) for now, but plan for (3). Signed-off-by: John Snow <jsnow@redhat.com> Message-Id: <20210930205716.1148693-9-jsnow@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2021-09-30 22:57:11 +02:00
def connect_member(self, member: 'QAPISchemaMember') -> None:
if member.name not in self.args:
# Undocumented TODO outlaw
self.args[member.name] = QAPIDoc.ArgSection(self._parser,
member.name)
self.args[member.name].connect(member)
qapi/parser: add import cycle workaround Adding static types causes a cycle in the QAPI generator: [schema -> expr -> parser -> schema]. It exists because the QAPIDoc class needs the names of types defined by the schema module, but the schema module needs to import both expr.py/parser.py to do its actual parsing. Ultimately, the layering violation is that parser.py should not have any knowledge of specifics of the Schema. QAPIDoc performs double-duty here both as a parser *and* as a finalized object that is part of the schema. In this patch, add the offending type hints alongside the workaround to avoid the cycle becoming a problem at runtime. See https://mypy.readthedocs.io/en/latest/runtime_troubles.html#import-cycles for more information on this workaround technique. I see three ultimate resolutions here: (1) Just keep this patch and use the TYPE_CHECKING trick to eliminate the cycle which is only present during static analysis. (2) Don't bother to annotate connect_member() et al, give them 'object' or 'Any'. I don't particularly like this, because it diminishes the usefulness of type hints for documentation purposes. Still, it's an extremely quick fix. (3) Reimplement doc <--> definition correlation directly in schema.py, integrating doc fields directly into QAPISchemaMember and relieving the QAPIDoc class of the responsibility. Users of the information would instead visit the members first and retrieve their documentation instead of the inverse operation -- visiting the documentation and retrieving their members. My preference is (3), but in the short-term (1) is the easiest way to have my cake (strong type hints) and eat it too (Not have import cycles). Do (1) for now, but plan for (3). Signed-off-by: John Snow <jsnow@redhat.com> Message-Id: <20210930205716.1148693-9-jsnow@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2021-09-30 22:57:11 +02:00
def connect_feature(self, feature: 'QAPISchemaFeature') -> None:
if feature.name not in self.features:
raise QAPISemError(feature.info,
"feature '%s' lacks documentation"
% feature.name)
self.features[feature.name].connect(feature)
def check_expr(self, expr: QAPIExpression) -> None:
if self.has_section('Returns') and 'command' not in expr:
raise QAPISemError(self.info,
"'Returns:' is only valid for commands")
def check(self) -> None:
def check_args_section(
args: Dict[str, QAPIDoc.ArgSection], what: str
) -> None:
bogus = [name for name, section in args.items()
if not section.member]
if bogus:
raise QAPISemError(
self.info,
"documented %s%s '%s' %s not exist" % (
what,
"s" if len(bogus) > 1 else "",
"', '".join(bogus),
"do" if len(bogus) > 1 else "does"
))
check_args_section(self.args, 'member')
check_args_section(self.features, 'feature')