Go to file
Marek Polacek 51c500269b libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026]
From a link below:
"An issue was discovered in the Bidirectional Algorithm in the Unicode
Specification through 14.0. It permits the visual reordering of
characters via control sequences, which can be used to craft source code
that renders different logic than the logical ordering of tokens
ingested by compilers and interpreters. Adversaries can leverage this to
encode source code for compilers accepting Unicode such that targeted
vulnerabilities are introduced invisibly to human reviewers."

More info:
https://nvd.nist.gov/vuln/detail/CVE-2021-42574
https://trojansource.codes/

This is not a compiler bug.  However, to mitigate the problem, this patch
implements -Wbidi-chars=[none|unpaired|any] to warn about possibly
misleading Unicode bidirectional control characters the preprocessor may
encounter.

The default is =unpaired, which warns about improperly terminated
bidirectional control characters; e.g. a LRE without its corresponding PDF.
The level =any warns about any use of bidirectional control characters.

This patch handles both UCNs and UTF-8 characters.  UCNs designating
bidi characters in identifiers are accepted since r204886.  Then r217144
enabled -fextended-identifiers by default.  Extended characters in C/C++
identifiers have been accepted since r275979.  However, this patch still
warns about mixing UTF-8 and UCN bidi characters; there seems to be no
good reason to allow mixing them.

We warn in different contexts: comments (both C and C++-style), string
literals, character constants, and identifiers.  Expectedly, UCNs are ignored
in comments and raw string literals.  The bidirectional control characters
can nest so this patch handles that as well.

I have not included nor tested this at all with Fortran (which also has
string literals and line comments).

Dave M. posted patches improving diagnostic involving Unicode characters.
This patch does not make use of this new infrastructure yet.

	PR preprocessor/103026

gcc/c-family/ChangeLog:

	* c.opt (Wbidi-chars, Wbidi-chars=): New option.

gcc/ChangeLog:

	* doc/invoke.texi: Document -Wbidi-chars.

libcpp/ChangeLog:

	* include/cpplib.h (enum cpp_bidirectional_level): New.
	(struct cpp_options): Add cpp_warn_bidirectional.
	(enum cpp_warning_reason): Add CPP_W_BIDIRECTIONAL.
	* internal.h (struct cpp_reader): Add warn_bidi_p member
	function.
	* init.c (cpp_create_reader): Set cpp_warn_bidirectional.
	* lex.c (bidi): New namespace.
	(get_bidi_utf8): New function.
	(get_bidi_ucn): Likewise.
	(maybe_warn_bidi_on_close): Likewise.
	(maybe_warn_bidi_on_char): Likewise.
	(_cpp_skip_block_comment): Implement warning about bidirectional
	control characters.
	(skip_line_comment): Likewise.
	(forms_identifier_p): Likewise.
	(lex_identifier): Likewise.
	(lex_string): Likewise.
	(lex_raw_string): Likewise.

gcc/testsuite/ChangeLog:

	* c-c++-common/Wbidi-chars-1.c: New test.
	* c-c++-common/Wbidi-chars-2.c: New test.
	* c-c++-common/Wbidi-chars-3.c: New test.
	* c-c++-common/Wbidi-chars-4.c: New test.
	* c-c++-common/Wbidi-chars-5.c: New test.
	* c-c++-common/Wbidi-chars-6.c: New test.
	* c-c++-common/Wbidi-chars-7.c: New test.
	* c-c++-common/Wbidi-chars-8.c: New test.
	* c-c++-common/Wbidi-chars-9.c: New test.
	* c-c++-common/Wbidi-chars-10.c: New test.
	* c-c++-common/Wbidi-chars-11.c: New test.
	* c-c++-common/Wbidi-chars-12.c: New test.
	* c-c++-common/Wbidi-chars-13.c: New test.
	* c-c++-common/Wbidi-chars-14.c: New test.
	* c-c++-common/Wbidi-chars-15.c: New test.
	* c-c++-common/Wbidi-chars-16.c: New test.
	* c-c++-common/Wbidi-chars-17.c: New test.
2021-11-16 21:56:16 -05:00
c++tools Daily bump. 2021-10-27 00:16:33 +00:00
config
contrib Daily bump. 2021-11-09 00:16:21 +00:00
fixincludes Daily bump. 2021-11-14 00:16:23 +00:00
gcc libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026] 2021-11-16 21:56:16 -05:00
gnattools Daily bump. 2021-10-23 00:16:26 +00:00
gotools
include Daily bump. 2021-11-13 00:16:39 +00:00
INSTALL
intl
libada Daily bump. 2021-10-23 00:16:26 +00:00
libatomic
libbacktrace Daily bump. 2021-11-13 00:16:39 +00:00
libcc1
libcody Daily bump. 2021-11-02 00:16:32 +00:00
libcpp libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026] 2021-11-16 21:56:16 -05:00
libdecnumber Daily bump. 2021-10-23 00:16:26 +00:00
libffi Daily bump. 2021-11-16 00:16:31 +00:00
libgcc Daily bump. 2021-11-12 00:16:32 +00:00
libgfortran Daily bump. 2021-10-19 00:16:23 +00:00
libgo
libgomp Daily bump. 2021-11-17 00:16:29 +00:00
libiberty Daily bump. 2021-10-23 00:16:26 +00:00
libitm
libobjc
liboffloadmic Daily bump. 2021-10-20 00:16:43 +00:00
libphobos Daily bump. 2021-11-01 00:16:20 +00:00
libquadmath
libsanitizer Daily bump. 2021-11-14 00:16:23 +00:00
libssp
libstdc++-v3 Daily bump. 2021-11-17 00:16:29 +00:00
libvtv
lto-plugin
maintainer-scripts
zlib
.dir-locals.el
.gitattributes
.gitignore
ABOUT-NLS
ar-lib
ChangeLog Daily bump. 2021-11-17 00:16:29 +00:00
ChangeLog.jit
ChangeLog.tree-ssa
compile
config-ml.in
config.guess
config.rpath
config.sub
configure configure, Darwin: Set appropriate defaults for host-shared. 2021-11-16 19:44:51 +00:00
configure.ac configure, Darwin: Set appropriate defaults for host-shared. 2021-11-16 19:44:51 +00:00
COPYING
COPYING3
COPYING3.LIB
COPYING.LIB
COPYING.RUNTIME
depcomp
install-sh
libtool-ldflags
libtool.m4
lt~obsolete.m4
ltgcc.m4
ltmain.sh
ltoptions.m4
ltsugar.m4
ltversion.m4
MAINTAINERS MAINTAINERS: Add myself to DCO section and update email address 2021-11-16 22:49:55 +01:00
Makefile.def Make opcodes configure depend on bfd configure 2021-11-12 18:34:12 +10:30
Makefile.in Make opcodes configure depend on bfd configure 2021-11-12 18:34:12 +10:30
Makefile.tpl [PATCH 1/5] Makefile.in: Ensure build CPP/CPPFLAGS is used for build targets 2021-10-28 10:42:49 -04:00
missing
mkdep
mkinstalldirs
move-if-change
multilib.am
README
symlink-tree
test-driver
ylwrap

This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software.  See the files whose
names start with COPYING for copying permission.  The manuals, and
some of the runtime libraries, are under different terms; see the
individual source files for details.

The directory INSTALL contains copies of the installation information
as HTML and plain text.  The source of this information is
gcc/doc/install.texi.  The installation information includes details
of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it
includes) for usage and porting information.  An online readable
version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range
notation, e.g., 1987-2012, indicating that every year in the range,
inclusive, is a copyrightable year that could otherwise be listed
individually.