2bfb0fcb51510f22723c8cdfefe [Sanitizer][MIPS] Fix stat struct size for the O32 ABI.
Signed-off-by: Dimitrije Milosevic <dimitrije.milosevic@syrmia.com>.
This patch adds rot[lr]64ti2_doubleword patterns to the x86_64 backend,
to move splitting of 128-bit TImode rotates by 64 bits after reload,
matching what we now do for 64-bit DImode rotations by 32 bits with -m32.
In theory moving when this rotation is split should have little
influence on code generation, but in practice "reload" sometimes
decides to make use of the increased flexibility to reduce the number
of registers used, and the code size, by using xchg.
For example:
__int128 x;
__int128 y;
__int128 a;
__int128 b;
void foo()
{
unsigned __int128 t = x;
t ^= a;
t = (t<<64) | (t>>64);
t ^= b;
y = t;
}
Before:
movq x(%rip), %rsi
movq x+8(%rip), %rdi
xorq a(%rip), %rsi
xorq a+8(%rip), %rdi
movq %rdi, %rax
movq %rsi, %rdx
xorq b(%rip), %rax
xorq b+8(%rip), %rdx
movq %rax, y(%rip)
movq %rdx, y+8(%rip)
ret
After:
movq x(%rip), %rax
movq x+8(%rip), %rdx
xorq a(%rip), %rax
xorq a+8(%rip), %rdx
xchgq %rdx, %rax
xorq b(%rip), %rax
xorq b+8(%rip), %rdx
movq %rax, y(%rip)
movq %rdx, y+8(%rip)
ret
One some modern architectures this is a small win, on some older
architectures this is a small loss. The decision which code to
generate is made in "reload", and could probably be tweaked by
register preferencing. The much bigger win is that (eventually) all
TImode mode shifts and rotates by constants will become potential
candidates for TImode STV.
2022-07-31 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386.md (define_expand <any_rotate>ti3): For
rotations by 64 bits use new rot[lr]64ti2_doubleword pattern.
(rot[lr]64ti2_doubleword): New post-reload splitter.
This patch resolves PR target/106450, some more fall-out from more
aggressive TImode scalar-to-vector (STV) optimizations. I continue
to be caught out by how far TImode STV has diverged from DImode/SImode
STV, and therefore requires additional (unexpected) tweaking. Many
thanks to H.J. Lu for pointing out timode_remove_non_convertible_regs
needs to be extended to handle XOR (and other new operations).
Unhelpfully the comment above this function states that it's the TImode
version of "remove_non_convertible_regs", which doesn't exist anymore,
so I've resurrected an explanatory comment from the git history.
By refactoring the checks for hard regs and already "marked" regs
into timode_check_non_convertible_regs itself, all of its callers are
simplified. This patch then FOR_EACH_INSN_USE and FOR_EACH_INSN_DEF
to generically handle arbitrary (non-move) instructions (including
unary and binary operations), calling timode_check_non_convertible_regs
on each TImode register USE and DEF.
2022-07-31 Roger Sayle <roger@nextmovesoftware.com>
H.J. Lu <hjl.tools@gmail.com>
gcc/ChangeLog
PR target/106450
* config/i386/i386-features.cc (timode_check_non_convertible_regs):
Do nothing if REGNO is set in the REGS bitmap, or is a hard reg.
(timode_remove_non_convertible_regs): Update comment.
Call timode_check_non_convertible_reg on all TImode register
DEFs and USEs in each instruction.
gcc/testsuite/ChangeLog
PR target/106450
* gcc.target/i386/pr106450.c: New test case.
gcc/fortran/ChangeLog:
PR fortran/92805
* match.cc (gfc_match_small_literal_int): Make gobbling of leading
whitespace optional.
(gfc_match_name): Likewise.
(gfc_match_char): Likewise.
* match.h (gfc_match_small_literal_int): Adjust prototype.
(gfc_match_name): Likewise.
(gfc_match_char): Likewise.
* primary.cc (match_kind_param): Match small literal int or name
without gobbling whitespace.
(get_kind): Do not skip over blanks.
(match_string_constant): Likewise.
gcc/testsuite/ChangeLog:
PR fortran/92805
* gfortran.dg/literal_constants.f: New test.
* gfortran.dg/literal_constants.f90: New test.
Co-authored-by: Steven G. Kargl <kargl@gcc.gnu.org>
gcc/fortran/ChangeLog:
PR fortran/77652
* check.cc (gfc_check_associated): Make the rank check of POINTER
vs. TARGET match the allowed forms of pointer assignment for the
selected Fortran standard.
gcc/testsuite/ChangeLog:
PR fortran/77652
* gfortran.dg/associated_target_9a.f90: New test.
* gfortran.dg/associated_target_9b.f90: New test.
In C++, since all tokens are lexed from libcpp up front, diagnostics generated
by libcpp after lexing has completed do not get a valid location from libcpp
(rather, libcpp thinks they all pertain to the end of the file.) This has long
been addressed using the global variable "done_lexing", which the C++ frontend
sets at the appropriate time; when done_lexing is true, then c_cpp_diagnostic(),
which outputs libcpp's diagnostics, uses input_location instead of the wrong
libcpp location. The C++ frontend arranges that input_location will point to the
token it is currently processing, so this generally works fine. However, there
is one exception currently, which is -Wunused-macros. This gets generated at the
end of processing in cpp_finish (), since we need to wait until then to
determine whether a macro was eventually used or not. But the locations it
passes to c_cpp_diagnostic () were remembered from the original lexing and hence
they should not be overridden with input_location, which is now the one
incorrectly pointing to the end of the file.
Fixed by setting done_lexing=false again just prior to calling cpp_finish (). I
also renamed the variable from done_lexing to "override_libcpp_locations", since
it's now not strictly about lexing anymore.
There is no new testcase with this patch, since we already had an xfailed
testcase which is now fixed.
gcc/c-family/ChangeLog:
PR c++/66290
* c-common.h: Rename global done_lexing to
override_libcpp_locations.
* c-common.cc (c_cpp_diagnostic): Likewise.
* c-opts.cc (c_common_finish): Set override_libcpp_locations
(formerly done_lexing) immediately prior to calling cpp_finish ().
gcc/cp/ChangeLog:
PR c++/66290
* parser.cc (cp_lexer_new_main): Rename global done_lexing to
override_libcpp_locations.
gcc/testsuite/ChangeLog:
PR c++/66290
* c-c++-common/pragma-diag-15.c: Remove xfail for C++.
This patch fixes PR bootstrap/106472 by adding a missing dependency
to Makefile.def to allow make bootstrap when configured using
"--enable-languages=go" (and not using make with multiple threads).
2022-07-31 Roger Sayle <roger@nextmovesoftware.com>
ChangeLog
PR bootstrap/106472
* Makefile.def (dependencies): Make configure-target-libgo depend
upon all-target-libbacktrace.
Here the CONSTRUCTOR we were providing for D{} had an entry for the B base
subobject at offset 0 following the entry for the C base, causing
output_constructor_regular_field to ICE due to going backwards. It might be
nice for that function to be more tolerant of empty fields, but it also
seems reasonable for the front end to prune the useless entry.
PR c++/106369
gcc/cp/ChangeLog:
* constexpr.cc (reduced_constant_expression_p): Return false
if a CONSTRUCTOR initializes an empty field.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/constexpr-lambda27.C: New test.
The hard register A10 was already allocated for EH_RETURN_STACKADJ_RTX.
(although exception handling and sibling call may not apply at the same time,
but for safety)
gcc/ChangeLog:
* config/xtensa/xtensa.md: Change hard register number used in
the split patterns for indirect sibling call fixups from 10 to 11,
the last free one for the CALL0 ABI.
It takes one machine instruction for both condtional branch and move.
gcc/ChangeLog:
* config/xtensa/xtensa.cc (xtensa_rtx_costs):
Add new case for IF_THEN_ELSE.
This run-time test test pointer-based iteration with collapse,
similar to the '(parallel) simd' test for PR106449 but for 'for'.
libgomp/ChangeLog:
* testsuite/libgomp.c-c++-common/pr106449-2.c: New test.
This adjusts the return type to match the resolution of LWG 3672. There
is no functional difference, because decltype(auto) always deduced a
value anyway, but this makes it simpler and consistent with the working
draft.
libstdc++-v3/ChangeLog:
PR libstdc++/104443
* include/bits/stl_iterator.h (common_iterator::operator->):
Change return type to just auto.
The forward threader failed to check whether it can actually duplicate
blocks. The following adds this in a similar place the backwards threader
performs this check.
PR tree-optimization/106422
* tree-ssa-threadupdate.cc (fwd_jt_path_registry::update_cfg):
Check whether we can copy thread blocks and cancel the thread if not.
* gcc.dg/torture/pr106422.c: New testcase.
The allowed syntaxes of atomic compare don't allow ()s around the condition
of ?:, but we were accepting it in one case for C++.
Fixed thusly.
2022-07-29 Jakub Jelinek <jakub@redhat.com>
PR c++/106448
* parser.cc (cp_parser_omp_atomic): For simple cast followed by
CPP_QUERY token, don't try cp_parser_binary_operation if compare
is true.
* c-c++-common/gomp/atomic-32.c: New test.
There were 2 issues visible on this new testcase, one that we didn't have
special POINTER_TYPE_P handling in a few spots of expand_omp_simd - for
pointers we need to use POINTER_PLUS_EXPR and need to have the non-pointer
part in sizetype, for non-rectangular loop on the other side we can rely on
multiplication factor 1, pointers can't be multiplied, without those changes
we'd ICE. The other issue was that we put n2 expression directly into a
comparison in a condition and regimplified that, for the &a[512] case that
and with gimplification being destructed that unfortunately meant modification
of original fd->loops[?].n2. Fixed by unsharing the expression. This was
causing a runtime failure on the testcase.
2022-07-29 Jakub Jelinek <jakub@redhat.com>
PR middle-end/106449
* omp-expand.cc (expand_omp_simd): Fix up handling of pointer
iterators in non-rectangular simd loops. Unshare fd->loops[i].n2
or n2 before regimplifying it inside of a condition.
* testsuite/libgomp.c-c++-common/pr106449.c: New test.
Tobias mentioned in PR106449 that fold_build_pointer_plus already
fold_converts the second argument to sizetype if it doesn't already
have an integral type gimple compatible with sizetype.
So, this patch simplifies the callers of fold_build_pointer_plus in
omp-expand so that they don't do those conversions manually.
2022-07-29 Jakub Jelinek <jakub@redhat.com>
* omp-expand.cc (expand_omp_for_init_counts, expand_omp_for_init_vars,
extract_omp_for_update_vars, expand_omp_for_ordered_loops,
expand_omp_simd): Don't fold_convert second argument to
fold_build_pointer_plus to sizetype.
.eh_frame DW_EH_PE_pcrel encoding format is not supported by gas <= 2.39.
Check if the assembler support DW_EH_PE_PCREL encoding and define .eh_frame
encoding type.
gcc/ChangeLog:
* config.in: Regenerate.
* config/loongarch/loongarch.h (ASM_PREFERRED_EH_DATA_FORMAT):
Select the value of the macro definition according to whether
HAVE_AS_EH_FRAME_PCREL_ENCODING_SUPPORT is defined.
* configure: Regenerate.
* configure.ac: Reinstate HAVE_AS_EH_FRAME_PCREL_ENCODING_SUPPORT test.
This replaces vect_get_vector_types_for_stmt with get_vectype_for_scalar_type
in vect_recog_bool_pattern.
* tree-vect-patterns.cc (vect_recog_bool_pattern): Use
get_vectype_for_scalar_type instead of
vect_get_vector_types_for_stmt.
This patch implements a new -fanalyzer warning:
-Wanalyzer-putenv-of-auto-var
which complains about stack pointers passed to putenv(3) calls, as
per SEI CERT C Coding Standard rule POS34-C ("Do not call putenv() with
a pointer to an automatic variable as the argument").
For example, given:
#include <stdio.h>
#include <stdlib.h>
void test_arr (void)
{
char arr[] = "NAME=VALUE";
putenv (arr);
}
it emits:
demo.c: In function ‘test_arr’:
demo.c:7:3: warning: ‘putenv’ on a pointer to automatic variable ‘arr’ [POS34-C] [-Wanalyzer-putenv-of-auto-var]
7 | putenv (arr);
| ^~~~~~~~~~~~
‘test_arr’: event 1
|
| 7 | putenv (arr);
| | ^~~~~~~~~~~~
| | |
| | (1) ‘putenv’ on a pointer to automatic variable ‘arr’
|
demo.c:6:8: note: ‘arr’ declared on stack here
6 | char arr[] = "NAME=VALUE";
| ^~~
demo.c:7:3: note: perhaps use ‘setenv’ rather than ‘putenv’
7 | putenv (arr);
| ^~~~~~~~~~~~
gcc/analyzer/ChangeLog:
PR analyzer/105893
* analyzer.opt (Wanalyzer-putenv-of-auto-var): New.
* region-model-impl-calls.cc (class putenv_of_auto_var): New.
(region_model::impl_call_putenv): New.
* region-model.cc (region_model::on_call_pre): Handle putenv.
* region-model.h (region_model::impl_call_putenv): New decl.
gcc/ChangeLog:
PR analyzer/105893
* doc/invoke.texi: Add -Wanalyzer-putenv-of-auto-var.
gcc/testsuite/ChangeLog:
PR analyzer/105893
* gcc.dg/analyzer/putenv-1.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
gcc/analyzer/ChangeLog:
* sm-malloc.cc (free_of_non_heap::emit): Add comment about CWE.
* sm-taint.cc (tainted_size::emit): Likewise.
gcc/ChangeLog:
* doc/invoke.texi (-fdiagnostics-show-cwe): Use uref rather than
url.
(Static Analyzer Options): Likewise. Add urefs for all of the
warnings that have associated CWE identifiers.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Our documentation indicates that it is the `-frounding-math' invocation
option that controls whether we respect what the FENV_ACCESS pragma
would imply, should we implement it, regarding the floating point
environment. It is only a part of the picture however, because the
`-ftrapping-math' invocation option also affects how we handle said
environment. Clarify that in the description of both options then, as
well as the FENV_ACCESS pragma itself.
gcc/
* doc/implement-c.texi (Floating point implementation): Mention
`-fno-trapping-math' in the context of FENV_ACCESS pragma.
* doc/invoke.texi (Optimize Options): Clarify FENV_ACCESS pragma
implication in the descriptions of `-fno-trapping-math' and
`-frounding-math'.
We have unordered FP comparisons implemented as RTL insns that produce
multiple machine instructions. Such RTL insns are hard to match with a
processor pipeline description and additionally there is a redundant
SNEZ instruction produced on the result of these comparisons even though
the FLT.fmt and FLE.fmt machine instructions already produce either 0 or
1, e.g.:
long
flt (double x, double y)
{
return __builtin_isless (x, y);
}
with `-O2 -fno-finite-math-only -ftrapping-math -fno-signaling-nans'
gets compiled to:
.globl flt
.type flt, @function
flt:
frflags a5
flt.d a0,fa0,fa1
fsflags a5
snez a0,a0
ret
.size flt, .-flt
because the middle end can't see through the UNSPEC operation unordered
FP comparisons have been defined in terms of.
These instructions are only produced via an expander already, so change
the expander to emit individual RTL insns for each machine instruction
in the ultimate ultimate sequence produced rather than deferring to a
single RTL insn producing the whole sequence at once.
gcc/
* config/riscv/riscv.md (UNSPECV_FSNVSNAN): New constant.
(QUIET_PATTERN): New int attribute.
(f<quiet_pattern>_quiet<ANYF:mode><X:mode>4): Emit the intended
RTL insns entirely within the preparation statements.
(*f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_default)
(*f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_snan): Remove
insns.
(*riscv_fsnvsnan<mode>2): New insn.
gcc/testsuite/
* gcc.target/riscv/fle-ieee.c: New test.
* gcc.target/riscv/fle-snan.c: New test.
* gcc.target/riscv/fle.c: New test.
* gcc.target/riscv/flef-ieee.c: New test.
* gcc.target/riscv/flef-snan.c: New test.
* gcc.target/riscv/flef.c: New test.
* gcc.target/riscv/flt-ieee.c: New test.
* gcc.target/riscv/flt-snan.c: New test.
* gcc.target/riscv/flt.c: New test.
* gcc.target/riscv/fltf-ieee.c: New test.
* gcc.target/riscv/fltf-snan.c: New test.
* gcc.target/riscv/fltf.c: New test.
__builtin_unreachable and __ubsan_handle_builtin_unreachable don't
use vops, they are marked const/leaf/noreturn/nothrow/cold.
But __builtin_trap uses vops, isn't const, just leaf/noreturn/nothrow/cold.
This is I believe so that when users explicitly use __builtin_trap in their
sources they get stores visible at the trap side.
-fsanitize=unreachable -fsanitize-undefined-trap-on-error used to transform
__builtin_unreachable to __builtin_trap even in the past, but the sanopt pass
has TODO_update_ssa, so it worked fine.
Now that gimple_build_builtin_unreachable can build a __builtin_trap call
right away, we can run into problems that whenever we need it we would need
to either manually or through TODO_update* ensure the vops being updated.
Though, as it is originally __builtin_unreachable which is just implemented
as trap, I think for this case it is fine to avoid vops. For this the
patch introduces IFN_TRAP, which has ECF_* flags like __builtin_unreachable
and is expanded as __builtin_trap.
2022-07-28 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/106099
* internal-fn.def (TRAP): New internal fn.
* internal-fn.h (expand_TRAP): Declare.
* internal-fn.cc (expand_TRAP): Define.
* gimple.cc (gimple_build_builtin_unreachable): For BUILT_IN_TRAP,
use internal fn rather than builtin.
* gcc.dg/ubsan/pr106099.c: New test.
Shorten the assembly example so that there is not slider.
Ready for master?
Thanks,
Martin
gcc/jit/ChangeLog:
* docs/cp/intro/tutorial02.rst:
Shorten the assembly example so that there is not slider.
* docs/cp/intro/tutorial04.rst: Likewise.
* docs/intro/tutorial02.rst: Likewise.
* docs/intro/tutorial04.rst: Likewise.
* docs/topics/contexts.rst: Likewise.
Use rather list-table that is easible to maintainer and one
does not have to wrap lines. Moreover, it provides great
attribute :widths: that correctly works (tested for HTML and PDF).
gcc/jit/ChangeLog:
* docs/cp/intro/tutorial04.rst: Use list-table.
* docs/intro/tutorial04.rst: Likewise.
* docs/intro/tutorial05.rst: Likewise.
* docs/topics/compilation.rst: Likewise.
* docs/topics/expressions.rst: Likewise.
* docs/topics/types.rst: Likewise.
gcc/jit/ChangeLog:
* docs/cp/intro/tutorial02.rst: Use proper reference.
* docs/cp/topics/contexts.rst: Likewise.
* docs/cp/topics/functions.rst: Put `class` directive before a
function as it is not allowed declaring a class in a fn.
* docs/cp/topics/types.rst: Add template keyword.
* docs/examples/tut04-toyvm/toyvm.c (toyvm_function_compile):
Add removed comment used for code snippet ending detection.
* docs/intro/tutorial04.rst: Fix to match the real comment.
Use expression that work fine for basic type.
gcc/jit/ChangeLog:
* docs/cp/topics/expressions.rst: Use :expr: for basic types.
* docs/topics/compilation.rst: Likewise.
* docs/topics/expressions.rst: Likewise.
* docs/topics/function-pointers.rst: Likewise.
gcc/jit/ChangeLog:
* docs/conf.py: Add needs_sphinx = '3.0' where c:type was added.
* docs/index.rst: Remove note about it.
* docs/topics/compilation.rst: Use enum directive and reference.
* docs/topics/contexts.rst: Likewise.
* docs/topics/expressions.rst: Likewise.
* docs/topics/functions.rst: Likewise.
When preprocessing with -E and -save-temps, input_location points always to the
first character of the current file. This was previously irrelevant because
nothing was called during the token streaming process that would inspect
input_location. But since r13-1544, "#pragma GCC diagnostic" is supported in
preprocess-only mode, and that pragma relies on input_location to decide if a
given source code location is subject to a diagnostic or not. Most diagnostics
work fine anyway, because they are handled as soon as they are seen and so
everything is still seen in the expected order even though all the diagnostic
pragmas are treated as if they applied at the start of the file. One example
that doesn't work correctly is the new testcase, since here the warning is not
triggered until the end of the file and so it is necessary to track the location
properly.
Fixed by setting input_location to point to each token as it is being
streamed, similar to how C++ mode sets it.
gcc/c-family/ChangeLog:
* c-ppoutput.cc (token_streamer::stream): Update input_location
prior to streaming each token.
gcc/testsuite/ChangeLog:
* c-c++-common/pragma-diag-14.c: New test.
* c-c++-common/pragma-diag-15.c: New test.
This patch adds get_meaning_for_state_change vfunc to
fd_diagnostic in sm-fd.cc which could be used by SARIF output.
Lightly tested on x86_64 Linux.
gcc/analyzer/ChangeLog:
PR analyzer/106286
* sm-fd.cc:
(fd_diagnostic::get_meaning_for_state_change): New.
gcc/testsuite/ChangeLog:
PR analyzer/106286
* gcc.dg/analyzer/fd-meaning.c: New test.
Signed-off-by: Immad Mir <mirimmad@outlook.com>