On some targets, there are no < 8191; and >= 8191; strings,
but < 8191) and >= 8191), so just remove the ; from the regexps.
2021-01-01 Jakub Jelinek <jakub@redhat.com>
PR testsuite/98489
PR tree-optimization/56719
* gcc.dg/tree-ssa/pr56719.c: Remove semicolon from
scan-tree-dump-times regexps.
In this testcase we end up with:
unsigned long long x = ...;
char y = (char) (x << 37);
The overwidening pattern realised that only the low 8 bits
of x << 37 are needed, but then tried to turn that into:
unsigned long long x = ...;
char y = (char) x << 37;
which gives an out-of-range shift. In this case y can simply
be replaced by zero, but as the comment in the patch says,
it's kind-of awkward to do that in the middle of vectorisation.
Most of the overwidening stuff is about keeping operations
as narrow as possible, which is important for vectorisation
but could be counter-productive for scalars (especially on
RISC targets). In contrast, optimising y to zero in the above
feels like an independent optimisation that would benefit scalar
code and that should happen before vectorisation.
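As a rough illustration of the problem above (a sketch only, not the code in the patch), the following C fragment shows that the low 8 bits of x << 37 are always zero, and models the new constraint with a hypothetical helper, shift_fits_precision, that is not a GCC function. It uses unsigned char rather than char so that the truncation is fully defined.

```c
#include <assert.h>
#include <limits.h>

/* The original computation: shift in the wide type, then truncate.
   unsigned char is used so the conversion is well defined.  */
static unsigned char
narrow_after_shift (unsigned long long x)
{
  return (unsigned char) (x << 37);
}

/* Hypothetical helper (not a GCC function): a shift by COUNT may only
   be performed in a type of PRECISION bits if COUNT < PRECISION.  */
static int
shift_fits_precision (unsigned count, unsigned precision)
{
  return count < precision;
}

/* Returns 1 if all checks pass; aborts via assert otherwise.  */
static int
check_shift_narrowing (void)
{
  /* The low 8 bits of x << 37 are zero for every x.  */
  assert (narrow_after_shift (1ULL) == 0);
  assert (narrow_after_shift (~0ULL) == 0);
  assert (narrow_after_shift (0x123456789abcdefULL) == 0);

  /* A shift by 37 is out of range for an 8-bit type but fine in 64 bits.  */
  assert (!shift_fits_precision (37, CHAR_BIT));
  assert (shift_fits_precision (37, 64));
  return 1;
}
```

Calling check_shift_narrowing () from a test driver exercises both properties.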
gcc/
PR tree-optimization/98302
* tree-vect-patterns.c (vect_determine_precisions_from_users): Make
sure that the precision remains greater than the shift count.
gcc/testsuite/
PR tree-optimization/98302
* gcc.dg/vect/pr98302.c: New test.
This PR is about a case in which the vectoriser was feeding
incorrect alignment information to tree-data-ref.c, leading
to incorrect runtime alias checks. The alignment was taken
from the TREE_TYPE of the DR_REF, which in this case was a
COMPONENT_REF with a normally-aligned type. However, the
underlying MEM_REF was only byte-aligned.
This patch uses dr_alignment to calculate the (byte) alignment
instead, just like we do when creating vector MEM_REFs.
gcc/
PR tree-optimization/94994
* tree-vect-data-refs.c (vect_vfa_align): Use dr_alignment.
gcc/testsuite/
PR tree-optimization/94994
* gcc.dg/vect/pr94994.c: New test.
The static GET_MODE_MASKs for SVE vectors are based on the
static precisions, which in turn are based on 128-bit SVE.
The precisions are later updated based on -msve-vector-bits
(usually to become variable length), but the GET_MODE_MASK
stayed the same. This caused combine to fold:
(*_extract:DI (subreg:DI (reg:VNxMM R) 0) ...)
to zero because the extracted bits appeared to be insignificant.
gcc/
PR rtl-optimization/98214
* genmodes.c (emit_insn_modes_h): Emit a definition of CONST_MODE_MASK.
(emit_mode_mask): Treat mode_mask_array as non-constant if adj_nunits.
(emit_mode_adjustments): Update GET_MODE_MASK when updating
GET_MODE_NUNITS.
* machmode.h (mode_mask_array): Use CONST_MODE_MASK.
The following patch adds some clz simplifications. If
clz is 0, then the MSB of the argument is set, and if clz is prec-1, then
the argument is 1.
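The two equivalences can be checked at the source level with a small sketch like the one below. It assumes GNU C (__builtin_clz, and GCC's modular unsigned-to-int conversion) and a 32-bit unsigned int; the function names are made up for illustration. Zero is excluded because __builtin_clz (0) is undefined.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* clz (x) == 0: no leading zeros, so the most significant bit is set.  */
static int
clz_is_zero (unsigned x)
{
  return __builtin_clz (x) == 0;
}

/* clz (x) == prec - 1: only the lowest bit can be set, so x == 1.  */
static int
clz_is_prec_minus_1 (unsigned x)
{
  const int prec = (int) (sizeof (unsigned) * CHAR_BIT);
  return __builtin_clz (x) == prec - 1;
}

/* Returns 1 if both identities hold on the sample values.  */
static int
check_clz_identities (void)
{
  /* Nonzero samples only; assumes a 32-bit unsigned int.  */
  const unsigned samples[] = { 1u, 2u, 3u, 0x7fffffffu, 0x80000000u, UINT_MAX };

  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    {
      unsigned x = samples[i];
      /* GCC defines the unsigned-to-int conversion as modular.  */
      assert (clz_is_zero (x) == ((int) x < 0));
      assert (clz_is_prec_minus_1 (x) == (x == 1));
    }
  return 1;
}
```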
2020-12-31 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/94802
* match.pd (clz(X) == 0 -> (int)X < 0): New simplification.
(clz(X) == (prec-1) -> X == 1): Likewise.
* gcc.dg/tree-ssa/pr94802-1.c: New test.
The following patch adds two simplifications to recognize idioms
for ABS_EXPR resp. ABSU_EXPR.
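For reference, the two idioms look as follows when written out in C. This is only a sketch of the source-level patterns, with function names chosen for illustration; the signed form overflows for INT_MIN just as abs does, while the unsigned form is defined for every input.

```c
#include <assert.h>
#include <limits.h>

/* -(x < 0) is 0 or -1, so (-(x < 0) | 1) is 1 or -1.  */
static int
abs_idiom (int x)
{
  return (-(x < 0) | 1) * x;
}

/* The same idiom with 1U: the multiplication is done in unsigned
   arithmetic, so the result is well defined even for INT_MIN.  */
static unsigned
absu_idiom (int x)
{
  return (-(x < 0) | 1U) * x;
}

/* Returns 1 if the idioms agree with a plain conditional negation.  */
static int
check_abs_idioms (void)
{
  const int samples[] = { -5, -1, 0, 1, 7, INT_MAX, -INT_MAX };

  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    {
      int x = samples[i];
      int expected = x < 0 ? -x : x;
      assert (abs_idiom (x) == expected);
      assert (absu_idiom (x) == (unsigned) expected);
    }

  /* Only the unsigned variant can represent |INT_MIN|.  */
  assert (absu_idiom (INT_MIN) == (unsigned) INT_MAX + 1u);
  return 1;
}
```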
2020-12-31 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/94785
* match.pd ((-(X < 0) | 1) * X -> abs (X)): New simplification.
((-(X < 0) | 1U) * X -> absu (X)): Likewise.
* gcc.dg/tree-ssa/pr94785.c: New test.
The following testcase is miscompiled, because niter analysis miscomputes
the number of iterations to 0.
The problem is that niter analysis uses mpz_t (wonder why, wouldn't
widest_int do the same job?) and when wi::to_mpz is called e.g. on the
TYPE_MAX_VALUE of __uint128_t, it initializes the mpz_t result with wrong
value.
wi::to_mpz has code to handle negative wide_ints in signed types by
inverting all bits, importing to mpz and complementing it, which is fine,
but doesn't handle correctly the case when the wide_int's len (times
HOST_BITS_PER_WIDE_INT) is smaller than precision when wi::neg_p.
E.g. the 0xffffffffffffffffffffffffffffffff TYPE_MAX_VALUE is represented
in wide_int as 0xffffffffffffffff len 1, and wi::to_mpz would create
0xffffffffffffffff mpz_t value from that.
This patch handles it by adding the needed -1 host wide int words (and also
has code to deal with precisions that aren't a multiple of
HOST_BITS_PER_WIDE_INT).
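The idea can be sketched with a simplified model of a compressed multi-word integer. This is not GCC's wide_int or the actual wi::to_mpz code; expand_unsigned, WORD_BITS and the array layout are assumptions made for illustration. Words above len are implied to repeat the sign bit of the top explicit word, so an unsigned value with that bit set must be padded with all-ones words up to the precision.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORD_BITS 64

/* Hypothetical model, not GCC's wide_int: VAL holds LEN explicit words,
   least significant first (1 <= LEN <= number of words in PRECISION).
   Writes the full PRECISION-bit unsigned value to OUT and returns the
   number of words written.  */
static size_t
expand_unsigned (const uint64_t *val, size_t len, unsigned precision,
                 uint64_t *out)
{
  size_t words = (precision + WORD_BITS - 1) / WORD_BITS;
  uint64_t fill = (val[len - 1] >> (WORD_BITS - 1)) ? UINT64_MAX : 0;

  /* Copy the explicit words, then append the implied ones.  */
  for (size_t i = 0; i < words; i++)
    out[i] = i < len ? val[i] : fill;

  /* Clear bits above PRECISION when it is not a multiple of WORD_BITS.  */
  unsigned excess = precision % WORD_BITS;
  if (excess != 0)
    out[words - 1] &= (UINT64_C (1) << excess) - 1;
  return words;
}

/* Returns 1 if the expansion behaves as described.  */
static int
check_expand_unsigned (void)
{
  /* The 128-bit maximum stored as a single all-ones word.  */
  const uint64_t max128[] = { UINT64_MAX };
  uint64_t out[2] = { 0, 0 };

  assert (expand_unsigned (max128, 1, 128, out) == 2);
  assert (out[0] == UINT64_MAX && out[1] == UINT64_MAX);

  /* With a 100-bit precision only 36 bits survive in the top word.  */
  assert (expand_unsigned (max128, 1, 100, out) == 2);
  assert (out[0] == UINT64_MAX);
  assert (out[1] == (UINT64_C (1) << 36) - 1);
  return 1;
}
```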
2020-12-31 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/98474
* wide-int.cc (wi::to_mpz): If wide_int has MSB set, but type
is unsigned and excess negative, append set bits after len until
precision.
* gcc.c-torture/execute/pr98474.c: New test.
The following testcase is diagnosed by UBSan as invalid, even when it is
valid.
We have a derived type Base2 at offset 1 with alignment 1 and do:
(const Derived &) ((const Base2 *) this + -1)
but before ubsan in the FE gets a chance to instrument it, the folder
optimizes that into:
(const Derived &) this + -1
and so we require that this has 8-byte alignment which Derived class needs.
Fixed by avoiding such an optimization when -fsanitize=alignment is in
effect if it would affect the alignments (and guarded with !in_gimple_form
because we don't really care during GIMPLE, though pointer conversions are
useless then and so such folding isn't needed very much during GIMPLE).
2020-12-31 Jakub Jelinek <jakub@redhat.com>
PR c++/98206
* fold-const.c: Include asan.h.
(fold_unary_loc): Don't optimize (ptr_type) (((ptr_type2) x) p+ y)
into ((ptr_type) x) p+ y if sanitizing alignment in GENERIC and
ptr_type points to type with higher alignment than ptr_type2.
* g++.dg/ubsan/align-4.C: New test.
The following patch adds an optimization mentioned in PR56719 #c8.
We already have the x != 0 && y != 0 && z != 0 into (x | y | z) != 0
and x != -1 && y != -1 && z != -1 into (x & y & z) != -1
optimizations, this patch just extends that to
x < C && y < C && z < C for power of two constants C into
(x | y | z) < C (for unsigned comparisons).
I didn't want to create too many buckets (there can be TYPE_PRECISION such
constants), so the patch instead just uses one bucket for all such
constants and loops over that bucket up to TYPE_PRECISION times.
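The equivalence behind the new transformation can be checked with a short sketch like the following; the function names are illustrative and not taken from tree-ssa-reassoc.c. It also shows why C has to be a power of two: for C == 3 the combined form gives a different answer.

```c
#include <assert.h>
#include <limits.h>

/* The three separate unsigned range tests.  */
static int
all_below_separate (unsigned x, unsigned y, unsigned z, unsigned c)
{
  return x < c && y < c && z < c;
}

/* The combined test; only equivalent when C is a power of two.  */
static int
all_below_combined (unsigned x, unsigned y, unsigned z, unsigned c)
{
  return (x | y | z) < c;
}

/* Returns 1 if both forms agree for several power-of-two constants.  */
static int
check_range_test_merge (void)
{
  const unsigned consts[] = { 1u, 2u, 8u, 8192u };

  for (unsigned k = 0; k < 4; k++)
    {
      unsigned c = consts[k];
      const unsigned vals[] = { 0u, 1u, c - 1, c, c + 1, 8191u, 8192u, UINT_MAX };
      for (unsigned i = 0; i < 8; i++)
        for (unsigned j = 0; j < 8; j++)
          for (unsigned l = 0; l < 8; l++)
            assert (all_below_separate (vals[i], vals[j], vals[l], c)
                    == all_below_combined (vals[i], vals[j], vals[l], c));
    }

  /* Counterexample for a non-power-of-two constant: 1 | 2 == 3.  */
  assert (all_below_separate (1u, 2u, 0u, 3u) == 1);
  assert (all_below_combined (1u, 2u, 0u, 3u) == 0);
  return 1;
}
```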
2020-12-31 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/56719
* tree-ssa-reassoc.c (optimize_range_tests_cmp_bitwise): Also optimize
x < C && y < C && z < C when C is a power of two constant into
(x | y | z) < C.
* gcc.dg/tree-ssa/pr56719.c: New test.
Symbols with extern(D) linkage are now mangled using back references to
types and identifiers if these occur more than once in the mangled name
as emitted before. This reduces symbol length, especially with chained
expressions of templated functions with Voldemort return types.
For example, the average symbol length of the 127000+ symbols created by
a libphobos unittest build is reduced by a factor of about 3, while the
longest symbol shrinks from 416133 to 1142 characters.
Reviewed-on: https://github.com/dlang/dmd/pull/12079
gcc/d/ChangeLog:
* dmd/MERGE: Merge upstream dmd 2bd4fc3fe.
This does not yet include support for the //go:embed directive added
in this release.
* Makefile.am (check-runtime): Don't create check-runtime-dir.
(mostlyclean-local): Don't remove check-runtime-dir.
(check-go-tool, check-vet): Copy in go.mod and modules.txt.
(check-cgo-test, check-carchive-test): Add go.mod file.
* Makefile.in: Regenerate.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/280172
There is no need for combine splitters to emit insn patterns with clobbers,
the pass is smart enough to add clobbers to patterns as necessary.
2020-12-30 Uroš Bizjak <ubizjak@gmail.com>
gcc/
* config/i386/i386.md: Remove unnecessary clobbers
from combine splitters.
The implementation in d-lang.cc was based on what was present in libcpp.
This synchronizes the escaping logic to match the current version.
gcc/d/ChangeLog:
* d-lang.cc (deps_add_target): Handle quoting ':' character.
Reimplement backslash tracking.
CST trees that were converted back to a D front-end AST node lost all
location information of the original expression. Now this is propagated
on to the literal expression.
gcc/d/ChangeLog:
* d-tree.h (d_eval_constant_expression): Add location argument.
* d-builtins.cc (d_eval_constant_expression): Give generated constants
a proper file location.
* d-compiler.cc (Compiler::paintAsType): Pass expression location to
d_eval_constant_expression.
* d-frontend.cc (eval_builtin): Likewise.
The following patch adds combine splitters to optimize:
- vpcmpeqd %ymm1, %ymm1, %ymm1
- vpandn %ymm1, %ymm0, %ymm0
vpmovmskb %ymm0, %eax
+ notl %eax
etc. (for vectors with less than 32 elements with xorl instead of notl).
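The identity the splitters rely on can be modelled in portable C. The sketch below is a scalar stand-in for pmovmskb on a 16-byte vector, not the intrinsic or the machine description; movemask16 and not16 are illustrative names. With 16 elements only the low 16 bits of the mask are meaningful, which is why the inversion is an XOR with 0xffff rather than a full NOT.

```c
#include <assert.h>
#include <stdint.h>

enum { NELTS = 16 };

/* Scalar model of pmovmskb: bit I of the result is the most
   significant bit of byte I.  */
static unsigned
movemask16 (const uint8_t v[NELTS])
{
  unsigned mask = 0;
  for (int i = 0; i < NELTS; i++)
    mask |= (unsigned) (v[i] >> 7) << i;
  return mask;
}

/* Bitwise NOT of every byte, as the vpcmpeqd/vpandn pair computes.  */
static void
not16 (const uint8_t in[NELTS], uint8_t out[NELTS])
{
  for (int i = 0; i < NELTS; i++)
    out[i] = (uint8_t) ~in[i];
}

/* Returns 1 if movemask (~v) == movemask (v) ^ 0xffff.  */
static int
check_movemask_not (void)
{
  uint8_t v[NELTS], nv[NELTS];

  /* Arbitrary byte pattern with a mix of set and clear top bits.  */
  for (int i = 0; i < NELTS; i++)
    v[i] = (uint8_t) (i * 37 + 0x55);

  not16 (v, nv);
  assert (movemask16 (nv) == (movemask16 (v) ^ 0xffffu));
  return 1;
}
```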
2020-12-30 Jakub Jelinek <jakub@redhat.com>
PR target/98461
* config/i386/sse.md (<sse2_avx2>_pmovmskb): Add splitters
for pmovmskb of NOT vector.
* gcc.target/i386/sse2-pr98461.c: New test.
* gcc.target/i386/avx2-pr98461.c: New test.
2020-12-29 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/97612
* primary.c (build_actual_constructor): Missing allocatable
components are set unallocated using EXPR_NULL. Then missing
components are tested for a default initializer.
gcc/testsuite/
PR fortran/97612
* gfortran.dg/structure_constructor_17.f90: New test.
2020-12-29 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/93833
* trans-array.c (get_array_ctor_var_strlen): If the character
length backend_decl cannot be found, convert the expression and
use the string length. Clear up some minor white space issues
in the rest of the file.
gcc/testsuite/
PR fortran/93833
* gfortran.dg/deferred_character_36.f90: New test.
The ARC code contains code which should only work with the old reload
pass. Such code is found in arc_secondary_reload hook, however it was
not properly guarded. Reverse the if-condition predicate such that
req_equiv_mem is called when lra is not in progress.
gcc/
2020-12-29 Claudiu Zissulescu <claziss@synopsys.com>
* config/arc/arc.c (arc_secondary_reload): Flip if-condition
predicates.
Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>
The REGNO_OK_FOR_BASE_P macro uses the reg_renumber array, which is
not always defined. Use it only when it is defined.
gcc/
2020-12-29 Claudiu Zissulescu <claziss@synopsys.com>
* config/arc/arc.h (REGNO_OK_FOR_BASE_P): Check if reg_renumber
is defined.
Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>
We need a temporary register when moving data from a cached memory to
an uncached memory. Fix this issue and add a test for it.
gcc/
2020-12-29 Claudiu Zissulescu <claziss@synopsys.com>
* config/arc/arc.c (prepare_move_operands): Use a temporary
register when we have cached mem-to-uncached mem moves.
gcc/testsuite/
2020-12-29 Vladimir Isaev <isaev@synopsys.com>
* gcc.target/arc/uncached-9.c: New test.
Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>
Update movdi, movdf and mov vectors not to use predicated vadd2
instructions. vadd2 is used as a "fast" move in these patterns. This
fixes a number of failures in dejagnu.
gcc/
2020-12-29 Claudiu Zissulescu <claziss@synopsys.com>
* config/arc/arc.md (movdi_insn): Update pattern, no predicated
vadd2 usage.
(movdf_insn): Likewise.
* config/arc/simdext.md (movVEC_insn): Likewise.
Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>
Use copy_to_reg where appropriate, use int_mode_for_mode
and fix comment indentation.
2020-12-29 Uroš Bizjak <ubizjak@gmail.com>
gcc/
* config/i386/i386-expand.c (ix86_gen_TWO52): Use REAL_MODE_FORMAT
to determine number of mantissa bits. Use real_2expN instead
of real_ldexp.
(ix86_expand_rint): Use copy_to_reg.
(ix86_expand_floorceildf_32): Ditto.
(ix86_expand_truncdf_32): Ditto.
(ix86_expand_rounddf_32): Ditto.
(ix86_expand_floorceil): Use copy_to_reg and int_mode_for_mode.
(ix86_expand_trunc): Ditto.
(ix86_expand_round): Ditto.
The libgomp texinfo docs lead to an invalid "up" link on the Top node,
which we can avoid similarly to the Top link in the main GCC manual.
2020-12-28 Sandra Loosemore <sandra@codesourcery.com>
libgomp/
* libgomp.texi (Top): Avoid bad "up" link.
Support for HSAIL has been deprecated with GCC 10 and their web server
has been down for weeks.
gcc/
2020-12-28 Gerald Pfeifer <gerald@pfeifer.com>
* doc/standards.texi (HSAIL): Remove section.
The ix86_expand_rint expander uses ix86_sse_copysign_to_positive, which
is unable to change the sign from - to +. When the FE_DOWNWARD rounding
direction is in effect, the expanded sequence that involves subtraction
can trigger the x - x = -0.0 special rule. ix86_sse_copysign_to_positive
then fails to change the sign of the intermediate value, assumed to always
be positive, back to positive.
The patch adds one extra fabs that strips the sign from the intermediate
value when flag_rounding_math is in effect.
2020-12-28 Uroš Bizjak <ubizjak@gmail.com>
gcc/
PR target/96793
* config/i386/i386-expand.c (ix86_expand_rint):
Remove the sign of the intermediate value for flag_rounding_math.
gcc/testsuite/
PR target/96793
* gcc.target/i386/pr96793-2.c: New test.
It is possible to avoid the call to force_reg and use the existing
temporary register in the ix86_expand_trunc, ix86_expand_round and
ix86_expand_rounddf_32 expanders.
2020-12-28 Uroš Bizjak <ubizjak@gmail.com>
gcc/
* config/i386/i386-expand.c (ix86_expand_trunc): Use
existing temporary register to avoid a call to force_reg.
gcc:
2020-12-27 Gerald Pfeifer <gerald@pfeifer.com>
* doc/analyzer.texi (Analyzer Internals): Find a new source for
the "A Memory Model for Static Analysis of C Programs" paper.
2020-12-27 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/97694
PR fortran/97723
* check.c (allocatable_check): Select rank temporaries are
permitted even though they are treated as associate variables.
* resolve.c (gfc_resolve_code): Break on select rank as well as
select type so that the block is resolved.
* trans-stmt.c (trans_associate_var): Class associate variables
that are optional dummies must use the backend_decl.
gcc/testsuite/
PR fortran/97694
PR fortran/97723
* gfortran.dg/select_rank_5.f90: New test.
2020-12-26 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/98022
* data.c (gfc_assign_data_value): Throw an error for inquiry
references. Follow with corrected code that would provide the
expected result and provides clean error recovery.
gcc/testsuite/
PR fortran/98022
* gfortran.dg/data_inquiry_ref.f90: Change to dg-compile and
add errors for inquiry references.
2020-12-23 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/83118
* trans-array.c (gfc_alloc_allocatable_for_assignment): Make
sure that class expressions are captured for dummy arguments by
use of gfc_get_class_from_gfc_expr otherwise the wrong vptr is
used.
* trans-expr.c (gfc_get_class_from_gfc_expr): New function.
(gfc_get_class_from_expr): If a constant expression is
encountered, return NULL_TREE.
(gfc_trans_assignment_1): Deallocate rhs allocatable components
after passing derived type function results to class lhs.
* trans.h: Add prototype for gfc_get_class_from_gfc_expr.
libstdc++-v3:
2020-12-26 Gerald Pfeifer <gerald@pfeifer.com>
* doc/xml/manual/abi.xml: Update link to Intel's compatibility
with GNU compilers document.
* doc/html/manual/abi.html: Regenerate.
github.com requests (via 301 Moved Permanently) to use github.com,
not www.github.com.
gcc/ChangeLog:
2020-12-25 Gerald Pfeifer <gerald@pfeifer.com>
* doc/invoke.texi (C++ Module Mapper): Fix reference to libcody.
Fix handling of F2018 enhancements to DATA statements that allows
initialization of pointer components to derived types, and adjust error
handling for the CHARACTER case.
gcc/fortran/ChangeLog:
* data.c (gfc_assign_data_value): Restrict use of
create_character_initializer to constant initializers.
* trans-expr.c (gfc_conv_initializer): Ensure that character
initializer is constant, otherwise fall through to get the same
error handling as for non-character cases.
gcc/testsuite/ChangeLog:
* gfortran.dg/pr93685_1.f90: New test.
* gfortran.dg/pr93685_2.f90: New test.
This option allows the user to specify alternate C++ runtime libraries,
for example when a platform uses libc++ as the installed C++ runtime.
We introduce the command line option: -stdlib= which is the user-facing
mechanism to select the C++ runtime to be used when compiling and linking
code. This is the same option spelling as that used by clang to allow the
use of libstdc++.
The availability (and thus function) of the option is a configure-time
choice using the configuration control:
--with-gxx-libcxx-include-dir=
Specifying the path for the libc++ headers enables the -stdlib=
option (using the path as given); default values are set when the path
is unconfigured.
If --with-gxx-libcxx-include-dir is given together with --with-sysroot=,
then we test to see if the include path starts with the sysroot and, if so,
record the sysroot-relative component as the local path. At runtime, we
prepend the sysroot that is actually active.
At link time, we use the C++ runtime in force and (if that is libc++) also
append the libc++abi ABI library. As for other cases, if a target sets the
name pointer for the ABI library to NULL the G++ driver will omit it from
the link line.
gcc/ChangeLog:
* configure.ac: Add gxx-libcxx-include-dir handled
in the same way as the regular cxx header directory.
* Makefile.in: Regenerated.
* config.in: Likewise.
* configure: Likewise.
* cppdefault.c: Pick up libc++ headers if the option
is enabled.
* cppdefault.h (struct default_include): Amend comments
to reflect the extended use of the cplusplus field.
* incpath.c (add_standard_paths): Allow for multiple
c++ header include path variants.
* doc/invoke.texi: Document the -stdlib= option.
gcc/c-family/ChangeLog:
* c.opt: Add -stdlib= option and enumerations for
libstdc++ and libc++.
gcc/cp/ChangeLog:
* g++spec.c (LIBCXX, LIBCXX_PROFILE, LIBCXX_STATIC): New.
(LIBCXXABI, LIBCXXABI_PROFILE, LIBCXXABI_STATIC): New.
(enum stdcxxlib_kind): New.
(lang_specific_driver): Allow selection amongst multiple
c++ runtime libraries.