Commit Graph

179928 Commits

Author SHA1 Message Date
Jonathan Wakely 2137aa9241 libstdc++: Replace use of reserved name that clashes [PR 97362]
The name __deref is defined as a macro by Windows headers.

This renames the __deref() helper function to __ref. It doesn't actually
dereference an iterator, it just has the same type as the iterator's
reference type.

libstdc++-v3/ChangeLog:

	PR libstdc++/97362
	* doc/html/manual/source_code_style.html: Regenerate.
	* doc/xml/manual/appendix_contributing.xml: Add __deref to
	BADNAMES.
	* include/debug/functions.h (_Irreflexive_checker::__deref):
	Rename to __ref.
	* testsuite/17_intro/badnames.cc: Check __deref.
2020-10-10 21:22:12 +01:00
Jan Hubicka 988f0466e8 Fix ICE in remap_arguments with removed parameters.
* ipa-modref.c (remap_arguments): Check range in map access.
2020-10-10 22:16:59 +02:00
Jan Hubicka 5d2cedaaa3 Fix modref_transform ICE with more than 32 parameters.
* ipa-modref.c (modref_transform): Use reserve instead of safe_grow.
2020-10-10 22:01:17 +02:00
Jan Hubicka 6a6c85f4e1 Fix ipa-modref ICE with not allocated summaries.
* ipa-modref.c (modref_transform): Check that summaries are allocated.
2020-10-10 21:22:52 +02:00
Jan Hubicka c8fd2be174 Fix modref handling of parameter adjustments and jump functions.
* ipa-modref-tree.h (struct modref_tree): Revert previous change.
	* ipa-modref.c (analyze_function): Dump original summary.
	(modref_read): Only set IPA if streaming summary (not optimization
	summary).
	(remap_arguments): New function.
	(modref_transform): New function.
	(compute_parm_map): Fix offset calculation.
	(ipa_merge_modref_summary_after_inlining): Do not merge stores when
	they can be ignored.
2020-10-10 20:55:37 +02:00
Jan Hubicka f1f1008c7c Improve tree-ssa-alias dump files.
* tree-ssa-alias.c (ref_maybe_used_by_call_p_1): Improve debug dumps.
	(call_may_clobber_ref_p_1): Improve debug dumps.
2020-10-10 19:36:03 +02:00
Iain Sandoe c28d91bf23 Objective-C, Darwin : Pick up super refs directly.
The current code assumed that super refs could be computed
indirectly, i.e. that the metadata generated by the compiler
was immutable by the runtime. This does not always hold
(it depends on the NeXT runtime version).  So, compute super
refs directly.

gcc/objc/ChangeLog:

	* objc-next-runtime-abi-02.c
	(objc_get_superclass_ref_decl): Split this code out.
	(next_runtime_abi_02_get_class_super_ref): Compute
	super refs using the objc_get_superclass_ref_decl().
	(next_runtime_abi_02_get_category_super_ref): Likewise.
2020-10-10 17:28:04 +01:00
Iain Sandoe bb675539ba Darwin : Only emit Objective-C section switches for older linkers.
At one time, the system linkers needed to have at least a dummy
entry for every Objective-C section in use.  This removes the extra
emitted code when it is not needed by the linker.

gcc/ChangeLog:

	* config/darwin.c (output_objc_section_asm_op): Avoid extra
	objective-c section switches unless the linker needs them.
2020-10-10 17:23:10 +01:00
Iain Sandoe ecd616f680 Objective-C, Darwin : Update metadata section uses.
Newer versions of ld64 are more picky about adherence to placement
rules for objective c metadata.  This adds protocol refs and uses
the ivar refs for all targets.

gcc/ChangeLog:

	* config/darwin-sections.def (objc2_data_section): New.
	(objc2_ivar_section): New.
	* config/darwin.c (darwin_objc2_section): Act on Protocol and
	ivar refs.

gcc/objc/ChangeLog:

	* objc-next-runtime-abi-02.c
	(next_runtime_abi_02_init_metadata_attributes): Make protocol
	refs a distinct section.
2020-10-10 17:13:21 +01:00
Iain Sandoe a788c4555c Objective-C, Darwin : Use special string sections for V2 NeXT runtime.
Newer versions of the runtime expect to find strings for class, method
and method types in set-aside sections rather than the general c_strings
one.

gcc/ChangeLog:

	* config/darwin-sections.def (objc2_class_names_section,
	objc2_method_names_section, objc2_method_types_section): New
	* config/darwin.c (output_objc_section_asm_op): Output new
	sections.  (darwin_objc2_section): Select new sections where
	used.

gcc/objc/ChangeLog:

	* objc-next-runtime-abi-02.c
	(next_runtime_abi_02_init_metadata_attributes): Attach metadata
	for the special string sections to class, method and method type
	string sections.
2020-10-10 17:04:45 +01:00
Iain Sandoe 900c0ca226 Objective-C: Address a FIXME (NFC).
This removes references to the next runtime from the gnu runtime
implementation.

gcc/objc/ChangeLog:

	* objc-gnu-runtime-abi-01.c
	(build_shared_structure_initializer): Remove references to
	the NeXT runtime.
	(generate_static_references): Likewise.
2020-10-10 16:55:57 +01:00
Iain Sandoe dcf59c5c01 Darwin : Begin rework of zero-fill sections.
Much of the existing work in the Darwin BSS and common sections
was to accommodate the PowerPC section anchors.  We want to segregate
this, since it might become desirable to support section anchors for
arm64.

First revision (here) is to use the same section conventions as the Xcode
toolchains for BSS and COMMON.

We also drop the constraint about putting small items into data/static data
that was a work-around for Java issues (irrelevant for several editions).

gcc/ChangeLog:

	* config/darwin.c (darwin_emit_local_bss): Amend section names to
	match system tools. (darwin_output_aligned_bss): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.dg/darwin-sections.c: Adjust test for renamed BSS and common
	sections.  Cater for 64 and 128 bit long doubles.
2020-10-10 16:45:32 +01:00
H.J. Lu 16664e6e4f x86-64: Check CMPXCHG16B for x86-64-v[234]
x86-64-v2 includes CMPXCHG16B.  Since -mcx16 enables CMPXCHG16B and
defines __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16, check it in x86-64-v[234]
tests.

	PR target/97250
	* gcc.target/i386/x86-64-v2.c: Verify that
	__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 is defined.
	* gcc.target/i386/x86-64-v3.c: Likewise.
	* gcc.target/i386/x86-64-v4.c: Likewise.
2020-10-10 05:28:19 -07:00
Ville Voutilainen 02cbd79e47 libstdc++: Fix variant build on 32-bit targets [PR95904]
libstdc++-v3/ChangeLog:

	* include/std/variant (__check_visitor_result):
	Use size_t for indexes.
	(__check_visitor_results): Likewise.
2020-10-10 14:03:00 +03:00
Aldy Hernandez 14db1dfcd1 PR97359: Do not cache relops in GORI cache.
logical_stmt_cache::cacheable_p() returns true for relops, but
logical_combine (which does the caching) doesn't handle them and ICEs.
This patch fixes the inconsistency by returning false for relops.

This was working before because even though logical_combine doesn't
handle relops, statements with only one SSA are handled in cache_stmt,
which seems like the only statement we've ever encountered (even through
a full Fedora build).

	lhs = s_5 > 999;

However, with two SSA operands we ICE because logical_combine doesn't
handle them:

	lhs = s_5 > y_8;

We can either return false for relops in cacheable_p, or fix
logical_combine to handle them.  The original idea was to only cache
ANDs and ORs, so I've done the former to unbreak trunk.

We can decide later if there was ever any benefit in caching relops.

gcc/ChangeLog:

	PR tree-optimization/97359
	* gimple-range-gori.cc (logical_stmt_cache::cacheable_p): Only
	handle ANDs and ORs.
	(gori_compute_cache::cache_stmt): Adjust comment.

gcc/testsuite/ChangeLog:

	* gcc.dg/pr97359.c: New test.
2020-10-10 10:26:30 +02:00
GCC Administrator c74a0e82fa Daily bump. 2020-10-10 00:16:26 +00:00
Ville Voutilainen 3427e31331 libstdc++: Diagnose visitors with different return types [PR95904]
libstdc++-v3/ChangeLog:

	PR libstdc++/95904
	* include/std/variant (__deduce_visit_result): Add a nested ::type.
	(__gen_vtable_impl</*base case*/>::_S_apply):
	Check the visitor return type.
	(__same_types): New.
	(__check_visitor_result): Likewise.
	(__check_visitor_results): Likewise.
	(visit(_Visitor&&, _Variants&&...)): Use __check_visitor_results
	in case we're visiting just one variant.
	* testsuite/20_util/variant/visit_neg.cc: Adjust.
2020-10-09 20:48:08 +03:00
Jonathan Wakely 3ee44d4c51 libstdc++: Fix incorrect results in std::seed_seq::generate [PR 97311]
This ensures that intermediate results are done in uint32_t values,
meeting the requirement for operations to be done modulo 2^32.

If the target doesn't define __UINT32_TYPE__ then substitute uint32_t
with a class type that uses uint_least32_t and masks the value to
UINT32_MAX.

I've also split the first loop that goes from k=0 to k<m into three
loops, for k=0, [1,s] and [s+1,m). This avoids branching for those three
cases in the body of the loop, and also avoids the concerns in PR 94823
regarding the k-1 index when k==0.

libstdc++-v3/ChangeLog:

	PR libstdc++/97311
	* include/bits/random.tcc (seed_seq::generate): Use uint32_t for
	calculations. Also split the first loop into three loops to
	avoid branching on k on every iteration, resolving PR 94823.
	* testsuite/26_numerics/random/seed_seq/97311.cc: New test.
	* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error
	line number.
2020-10-09 16:58:32 +01:00
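The modulo 2^32 requirement behind the seed_seq::generate fix above (3ee44d4c51) can be illustrated with a minimal sketch. The helper names below are hypothetical and are not taken from libstdc++. The sketch assumes a platform where unsigned int is 32 bits wide, and only shows why an unmasked wider type gives a different intermediate value than a 32-bit type or a masked wider type.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: multiply-and-add carried out in a 32-bit unsigned
// type, so the result is implicitly reduced modulo 2^32.
inline std::uint32_t mul_add_u32(std::uint32_t a, std::uint32_t b,
                                 std::uint32_t c)
{
  return a * b + c;
}

// Hypothetical helper: the same computation in a wider type without any
// masking. The high bits survive, so the result is not reduced modulo 2^32.
inline std::uint64_t mul_add_wide(std::uint64_t a, std::uint64_t b,
                                  std::uint64_t c)
{
  return a * b + c;
}

// Hypothetical fallback in the spirit of the commit message: use a wider
// type but mask the value back to 32 bits after the operation.
inline std::uint64_t mul_add_masked(std::uint64_t a, std::uint64_t b,
                                    std::uint64_t c)
{
  const std::uint64_t mask = 0xFFFFFFFFu;
  const std::uint64_t result = ((a & mask) * (b & mask) + (c & mask)) & mask;
  assert(result <= mask);
  return result;
}
```

For a = 0xFFFFFFFF, b = 2 and c = 5, the 32-bit and masked versions agree, while the unmasked wide version keeps the bits above 2^32.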
Vladimir N. Makarov bb37ad8cc0 Don't keep strict_low_part in reloads for non-registers. [PR97313]
gcc/ChangeLog:

2020-10-09  Vladimir Makarov  <vmakarov@redhat.com>

	PR rtl-optimization/97313
	* lra-constraints.c (match_reload): Don't keep strict_low_part in
	reloads for non-registers.

gcc/testsuite/ChangeLog:

2020-10-09  Vladimir Makarov  <vmakarov@redhat.com>

	PR rtl-optimization/97313
	* gcc.target/i386/pr97313.c: New.
2020-10-09 10:12:42 -04:00
Daniel Lemire 98c37d3bac libstdc++: Optimize uniform_int_distribution using Lemire's algorithm
Co-authored-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* include/bits/uniform_int_dist.h (uniform_int_distribution::_S_nd):
	New member function implementing Lemire's "nearly divisionless"
	algorithm.
	(uniform_int_distribution::operator()): Use _S_nd when the range
	of the URBG is the full width of the result type.
2020-10-09 14:09:36 +01:00
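As a rough illustration of the "nearly divisionless" technique named in the commit above (98c37d3bac), here is a minimal standalone sketch for 32-bit values. It is not the libstdc++ _S_nd implementation. The function name bounded_u32 is hypothetical, and the sketch assumes the generator produces uniformly distributed values over the full 32-bit range (as std::mt19937 does).

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Sketch of Lemire's method: returns a uniformly distributed value in
// [0, range) using a 64-bit multiply, and a modulo only on the rare
// rejection path.
template<typename Urbg>
std::uint32_t bounded_u32(Urbg& g, std::uint32_t range)
{
  assert(range != 0);
  std::uint64_t product = std::uint64_t(std::uint32_t(g())) * range;
  std::uint32_t low = std::uint32_t(product);
  if (low < range)
    {
      // 2^32 mod range, computed with unsigned wraparound.
      const std::uint32_t threshold = (0u - range) % range;
      while (low < threshold)
        {
          // Reject and redraw to remove the bias.
          product = std::uint64_t(std::uint32_t(g())) * range;
          low = std::uint32_t(product);
        }
    }
  // The high 32 bits of the product are the result.
  return std::uint32_t(product >> 32);
}
```

A call such as bounded_u32(gen, 6u) with a std::mt19937 gen yields a value from 0 to 5. The modulo is only evaluated when the low half of the product falls below range, which is uncommon for small ranges.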
Jonathan Wakely 6ce2cb116a libstdc++: Adjust variable export in makefile
We usually export variables in recipes this way. I'm not sure it's
necessary, but it's consistent.

libstdc++-v3/ChangeLog:

	* testsuite/Makefile.am: Set and export variable separately.
	* testsuite/Makefile.in: Regenerate.
2020-10-09 14:08:42 +01:00
Jonathan Wakely 7e7eef2a1b libstdc++: Pass CXXFLAGS to check_performance script
It looks like our check-performance target runs completely unoptimized,
which is a bit silly. This exports the CXXFLAGS from the parent make
process to the check_performance script.

libstdc++-v3/ChangeLog:

	* scripts/check_performance: Use gnu++11 instead of gnu++0x.
	* testsuite/Makefile.am (check-performance): Export CXXFLAGS to
	child process.
	* testsuite/Makefile.in: Regenerate.
2020-10-09 14:01:55 +01:00
Jonathan Wakely f9919ba717 libstdc++: Add performance test for <random>
This tests std::uniform_int_distribution with various parameters and
engines.

libstdc++-v3/ChangeLog:

	* testsuite/performance/26_numerics/random_dist.cc: New test.
2020-10-09 14:01:54 +01:00
H.J. Lu 59a95143dd x86: Add <x86gprintrin.h>
For sources which can't use any vector instructions, <x86intrin.h> and
<immintrin.h> cannot be included for compiler intrinsics:

$ echo "#include <x86intrin.h>" | gcc -S -O2 -mno-sse -mno-mmx -x c -
In file included from /usr/include/stdlib.h:1013,
                 from /usr/lib/gcc/x86_64-redhat-linux/10/include/mm_malloc.h:27,
                 from /usr/lib/gcc/x86_64-redhat-linux/10/include/xmmintrin.h:34,
                 from /usr/lib/gcc/x86_64-redhat-linux/10/include/immintrin.h:29,
                 from /usr/lib/gcc/x86_64-redhat-linux/10/include/x86intrin.h:32,
                 from <stdin>:1:
/usr/include/bits/stdlib-float.h: In function ‘atof’:
/usr/include/bits/stdlib-float.h:26:1: error: SSE register return with SSE disabled
   26 | {
      | ^
$

libgcc/config/i386/shadow-stack-unwind.h has a workaround:

/* NB: We need _get_ssp and _inc_ssp from <cetintrin.h>.  But we can't
   include <x86intrin.h> which ends up including <mm_malloc.h>, which
   includes <stdlib.h> and <errno.h> unconditionally.  But we can't
   include any libc system headers unconditionally from libgcc.  Avoid
   including <mm_malloc.h> here by defining _IMMINTRIN_H_INCLUDED.  */
 #define _IMMINTRIN_H_INCLUDED
 #include <cetintrin.h>
 #undef _IMMINTRIN_H_INCLUDED

Add a standalone intrinsic header file, <x86gprintrin.h>, to provide
integer only intrinsics.  All integer only intrinsics are placed in
<x86gprintrin.h>.  <x86intrin.h> and <immintrin.h> simply include
<x86gprintrin.h>.

gcc/

	PR target/97148
	* config.gcc (extra_headers): Add x86gprintrin.h.
	* config/i386/adxintrin.h: Check _X86GPRINTRIN_H_INCLUDED for
	<x86gprintrin.h>.
	* config/i386/bmi2intrin.h: Likewise.
	* config/i386/bmiintrin.h: Likewise.
	* config/i386/cetintrin.h: Likewise.
	* config/i386/cldemoteintrin.h: Likewise.
	* config/i386/clflushoptintrin.h: Likewise.
	* config/i386/clwbintrin.h: Likewise.
	* config/i386/enqcmdintrin.h: Likewise.
	* config/i386/fxsrintrin.h: Likewise.
	* config/i386/ia32intrin.h: Likewise.
	* config/i386/lwpintrin.h: Likewise.
	* config/i386/lzcntintrin.h: Likewise.
	* config/i386/movdirintrin.h: Likewise.
	* config/i386/pconfigintrin.h: Likewise.
	* config/i386/pkuintrin.h: Likewise.
	* config/i386/rdseedintrin.h: Likewise.
	* config/i386/rtmintrin.h: Likewise.
	* config/i386/serializeintrin.h: Likewise.
	* config/i386/tbmintrin.h: Likewise.
	* config/i386/tsxldtrkintrin.h: Likewise.
	* config/i386/waitpkgintrin.h: Likewise.
	* config/i386/wbnoinvdintrin.h: Likewise.
	* config/i386/xsavecintrin.h: Likewise.
	* config/i386/xsaveintrin.h: Likewise.
	* config/i386/xsaveoptintrin.h: Likewise.
	* config/i386/xsavesintrin.h: Likewise.
	* config/i386/xtestintrin.h: Likewise.
	* config/i386/immintrin.h: Include <x86gprintrin.h> instead of
	<fxsrintrin.h>, <xsaveintrin.h>, <xsaveoptintrin.h>,
	<xsavesintrin.h>, <xsavecintrin.h>, <lzcntintrin.h>,
	<bmiintrin.h>, <bmi2intrin.h>, <xtestintrin.h>, <cetintrin.h>,
	<movdirintrin.h>, <sgxintrin.h>, <pconfigintrin.h>,
	<waitpkgintrin.h>, <cldemoteintrin.h>, <enqcmdintrin.h>,
	<serializeintrin.h>, <tsxldtrkintrin.h>, <adxintrin.h>,
	<clwbintrin.h>, <clflushoptintrin.h>, <wbnoinvdintrin.h> and
	<pkuintrin.h>.
	(_wbinvd): Moved to config/i386/x86gprintrin.h.
	(_rdrand16_step): Likewise.
	(_rdrand32_step): Likewise.
	(_rdpid_u32): Likewise.
	(_readfsbase_u32): Likewise.
	(_readfsbase_u64): Likewise.
	(_readgsbase_u32): Likewise.
	(_readgsbase_u64): Likewise.
	(_writefsbase_u32): Likewise.
	(_writefsbase_u64): Likewise.
	(_writegsbase_u32): Likewise.
	(_writegsbase_u64): Likewise.
	(_rdrand64_step): Likewise.
	(_ptwrite64): Likewise.
	(_ptwrite32): Likewise.
	* config/i386/x86gprintrin.h: New file.
	* config/i386/x86intrin.h: Include <x86gprintrin.h>.  Don't
	include <ia32intrin.h>, <lwpintrin.h>, <tbmintrin.h>,
	<popcntintrin.h>, <mwaitxintrin.h> and <clzerointrin.h>.

gcc/testsuite/

	* gcc.target/i386/avx-1.c (__builtin_ia32_lwpval32): New to
	support <lwpintrin.h> included in <x86gprintrin.h>.
	(__builtin_ia32_lwpval64): Likewise.
	(__builtin_ia32_lwpins32): Likewise.
	(__builtin_ia32_lwpins64): Likewise.
	(__builtin_ia32_bextri_u32): New to support <tbmintrin.h>
	included in <x86gprintrin.h>.
	(__builtin_ia32_bextri_u64): Likewise.
	* gcc.target/i386/x86gprintrin-1.c: New test.
	* gcc.target/i386/x86gprintrin-2.c: Likewise.
	* gcc.target/i386/x86gprintrin-3.c: Likewise.
	* gcc.target/i386/x86gprintrin-4.c: Likewise.
	* gcc.target/i386/x86gprintrin-4a.c: Likewise.
	* gcc.target/i386/x86gprintrin-5.c: Likewise.
	* gcc.target/i386/x86gprintrin-5a.c: Likewise.
	* gcc.target/i386/x86gprintrin-5b.c: Likewise.
	* gcc.target/i386/x86gprintrin-6.c: Likewise.

libgcc/

	PR target/97148
	* config/i386/shadow-stack-unwind.h: Include <x86gprintrin.h>
	instead of <cetintrin.h>.
2020-10-09 05:08:41 -07:00
Tom de Vries 383400a607 [nvptx] Set -misa=sm_35 by default
The nvptx-as assembler verifies the ptx code using ptxas, if there's any
in the PATH.

The default in the nvptx port for -misa=sm_xx is sm_30, but the ptxas of the
latest cuda release (11.1) no longer supports sm_30.

Consequently we cannot build gcc against that release (although we should
still be able to build without any cuda release).

Fix this by setting -misa=sm_35 by default.

Tested check-gcc on nvptx.

Tested libgomp on x86_64-linux with nvptx accelerator.

Both build against cuda 9.1.

gcc/ChangeLog:

2020-10-09  Tom de Vries  <tdevries@suse.de>

	PR target/97348
	* config/nvptx/nvptx.h (ASM_SPEC): Also pass -m to nvptx-as if
	default is used.
	* config/nvptx/nvptx.opt (misa): Init with PTX_ISA_SM35.
2020-10-09 13:55:08 +02:00
Richard Biener 8c26cfc6af Fixup gcc.dg/vect/pr65947-3.c when masked loads are available
The following adds an effective target to properly allow
the gcc.dg/vect/pr65947-3.c expected vectorization to be adjusted
when run with, say, -march=cascadelake.

2020-10-09  Richard Biener  <rguenther@suse.de>

gcc/
	* doc/sourcebuild.texi (vect_masked_load): Document.

gcc/testsuite
	* lib/target-supports.exp (check_effective_target_vect_masked_load):
	New effective target.
	* gcc.dg/vect/pr65947-3.c: Update.
2020-10-09 13:18:47 +02:00
Richard Biener 16760e5bf7 tree-optimization/97334 - improve BB SLP discovery
We're running into a multiplication with one unvectorizable
operand we expect to build from scalars but SLP discovery
fatally fails the build of both since one stmt is commutated:

  _60 = _58 * _59;
  _63 = _59 * _62;
  _66 = _59 * _65;
...

where _59 is the "bad" operand.  The following patch makes the
case work where the first stmt has a good operand by not fatally
failing the SLP build for the operand but communicating upwards
how to commutate.

2020-10-09  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/97334
	* tree-vect-slp.c (vect_build_slp_tree_1): Do not fatally
	fail lanes other than zero when BB vectorizing.

	* gcc.dg/vect/bb-slp-pr65935.c: Amend.
2020-10-09 13:15:10 +02:00
Jonathan Wakely afcbeb35e0 libstdc++: Fix unused variable warning
libstdc++-v3/ChangeLog:

	* testsuite/util/testsuite_performance.h (report_header): Remove
	unused variable.
2020-10-09 11:53:08 +01:00
Jan Hubicka ffe8baa996 IPA modref: fix miscompilation in clone when IPA modref is used
gcc/ChangeLog:

	PR ipa/97292
	PR ipa/97335
	* ipa-modref-tree.h (copy_from): Drop summary in a
	clone.
2020-10-09 12:27:23 +02:00
Richard Biener 5d708c6315 tree-optimization/97347 - fix another SLP constant insertion issue
Just use edge insertion which will appropriately handle the situation
from botan.

2020-10-09  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/97347
	* tree-vect-slp.c (vect_create_constant_vectors): Use
	edge insertion when inserting on the fallthru edge,
	appropriately insert at the start of BBs when inserting
	after PHIs.

	* g++.dg/vect/pr97347.cc: New testcase.
2020-10-09 11:28:15 +02:00
Andrew MacLeod 1cde5d85be Fix for PR97317.
gcc/ChangeLog:

	PR tree-optimization/97317
	* range-op.cc (operator_cast::op1_range): Handle casts where the precision
	of the RHS is only 1 greater than the precision of the LHS.

gcc/testsuite/ChangeLog:
	* gcc.dg/pr97317.c: New test.
2020-10-09 10:57:51 +02:00
Richard Biener a0e6e49dde random memory leak fixes
This fixes leaks discovered checking whether I introduced new ones
with the last vectorizer changes.

2020-10-09  Richard Biener  <rguenther@suse.de>

	* cgraphunit.c (expand_all_functions): Free tp_first_run_order.
	* ipa-modref.c (pass_ipa_modref::execute): Free order.
	* tree-ssa-loop-niter.c (estimate_numbers_of_iterations): Free
	loop body.
	* tree-vect-data-refs.c (vect_find_stmt_data_reference): Free
	data references upon failure.
	* tree-vect-loop.c (update_epilogue_loop_vinfo): Free BBs
	array of the original loop.
	* tree-vect-slp.c (vect_slp_bbs): Use an auto_vec for
	dataref_groups to release its memory.
2020-10-09 10:40:44 +02:00
Jakub Jelinek 781634daea vrp: Fix up gcc.target/aarch64/pr90838.c [PR97312, PR94801]
> Perhaps another way out of this would be document and enforce that
> __builtin_c[lt]z{,l,ll} etc calls are undefined at zero, but C[TL]Z ifn
> calls are defined there based on *_DEFINED_VALUE_AT_ZERO (*) == 2

The following patch implements that, i.e. __builtin_c?z* now take full
advantage of them being UB at zero, while the ifns are well defined at zero
if *_DEFINED_VALUE_AT_ZERO (*) == 2.  That is what fixes PR94801.

Furthermore, to fix PR97312, if it is well defined at zero and the value at
zero is prec, we don't lower the maximum unless the argument is known to be
non-zero.
For gimple-range.cc I guess we could improve it if needed e.g. by returning
a [0,7][32,32] range for .CTZ of e.g. [0,137], but for now it (roughly)
matches what vr-values.c does.

2020-10-09  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/94801
	PR target/97312
	* vr-values.c (vr_values::extract_range_basic) <CASE_CFN_CLZ,
	CASE_CFN_CTZ>: When stmt is not an internal-fn call or
	C?Z_DEFINED_VALUE_AT_ZERO is not 2, assume argument is not zero
	and thus use [0, prec-1] range unless it can be further improved.
	For CTZ, don't update maxi from upper bound if it was previously prec.
	* gimple-range.cc (gimple_ranger::range_of_builtin_call) <CASE_CFN_CLZ,
	CASE_CFN_CTZ>: Likewise.

	* gcc.dg/tree-ssa/pr94801.c: New test.
2020-10-09 10:19:16 +02:00
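Since the commit above (781634daea) lets the compiler assume that __builtin_c?z* arguments are non-zero, user code that may pass zero needs an explicit check. A minimal sketch, assuming GCC or a compatible compiler that provides __builtin_ctz; the wrapper name ctz_or_width is hypothetical, and returning the bit width for zero is just one possible convention.

```cpp
#include <cassert>
#include <climits>

// Hypothetical wrapper: count trailing zeros with a defined result at zero.
// __builtin_ctz itself is undefined for a zero argument, so zero is
// handled before the builtin is reached.
inline int ctz_or_width(unsigned x)
{
  const int width = int(sizeof(unsigned) * CHAR_BIT);
  if (x == 0)
    return width;
  const int n = __builtin_ctz(x);
  assert(n >= 0 && n < width);
  return n;
}
```

With this wrapper, ctz_or_width(8u) is 3 and ctz_or_width(0u) is the width of unsigned in bits.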
Jakub Jelinek 600cf1128e match.pd: Fix up FFS -> CTZ + 1 optimization [PR97325]
And no testcase was included, I'm including one below.

Anyway, this PR and the other CTZ related discussions led me to discover a
bug I've made earlier, CLZ/CTZ builtins have unsigned arguments and e.g.
both the vr-values.c and now gimple-range.cc code heavily rely on that,
but __builtin_ffs has a signed operand and this optimization was incorrectly
making the operand signed too, so I guess it would greatly confuse VRP in
some cases.

2020-10-09  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/97325
	* match.pd (FFS(nonzero) -> CTZ(nonzero) + 1): Cast argument to
	corresponding unsigned type.

	* gcc.c-torture/execute/pr97325.c: New test.
2020-10-09 10:18:41 +02:00
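The identity behind the match.pd fix above (600cf1128e) can be checked in user code. This is a minimal sketch, assuming GCC-style __builtin_ffs and __builtin_ctz builtins; the function name ffs_via_ctz is hypothetical. The cast mirrors the point of the fix: the CTZ operand is unsigned even though the FFS operand is signed.

```cpp
#include <cassert>

// Hypothetical helper: for a non-zero argument, FFS(x) equals
// CTZ((unsigned) x) + 1. The cast keeps the CTZ operand unsigned.
inline int ffs_via_ctz(int x)
{
  assert(x != 0);  // the identity only holds for non-zero x
  return __builtin_ctz(static_cast<unsigned>(x)) + 1;
}
```

For example, both ffs_via_ctz(-8) and __builtin_ffs(-8) give 4, because the lowest set bit of -8 is bit 3.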
Aldy Hernandez 6887244269 Move pr97315-1.c test to g++.dg/opt/.
gcc/testsuite/ChangeLog:

	PR testsuite/97337
	* gcc.dg/pr97315-1.c: Moved to...
	* g++.dg/opt/pr97315-1.C: ...here.
2020-10-09 10:15:54 +02:00
Richard Biener 36500ed18a fix ICE with BB vectorization of PHIs
This fixes a vector CTOR insertion issue when we try to insert after
a PHI node.

2020-10-09  Richard Biener  <rguenther@suse.de>

	* tree-vect-slp.c (vect_create_constant_vectors): Properly insert
	after PHIs.

	* gcc.dg/vect/bb-slp-phis-1.c: New testcase.
2020-10-09 08:59:15 +02:00
GCC Administrator da9df69975 Daily bump. 2020-10-09 00:16:27 +00:00
Patrick Palka 9158a4d2a6 libstdc++: Make ranges::construct_at constexpr-friendly [PR95788]
This rewrites ranges::construct_at in terms of std::construct_at so
that we can piggyback on the compiler's existing support for
intercepting placement new within std::construct_at during constexpr
evaluation, instead of having to additionally teach the compiler about
ranges::construct_at.

While we're making changes to ranges::construct_at, this patch also
declares it conditionally noexcept and qualifies the calls to declval in
its requires-clause.

libstdc++-v3/ChangeLog:

	PR libstdc++/95788
	* include/bits/ranges_uninitialized.h:
	(__construct_at_fn::operator()): Rewrite in terms of
	std::construct_at.  Declare it conditionally noexcept.  Qualify
	calls to declval in its requires-clause.
	* testsuite/20_util/specialized_algorithms/construct_at/95788.cc:
	New test.
2020-10-08 18:10:05 -04:00
Jason Merrill 1c56c143b2 c++: Fix member alias template in C++17 and up. [PR96805]
Here we're trying to push into a<T>::c<N> in order to instantiate t<N>, but
were building a TYPENAME_TYPE for it because a<T> isn't open yet.  Don't
do that when we know we're trying to enter the scope.

gcc/cp/ChangeLog:

	PR c++/96805
	PR c++/96199
	* pt.c (tsubst_aggr_type): Don't build a TYPENAME_TYPE when
	entering_scope.
	(tsubst_template_decl): Use tsubst_aggr_type.

gcc/testsuite/ChangeLog:

	PR c++/96805
	* g++.dg/cpp0x/alias-decl-pr96805.C: New test.
2020-10-08 16:53:36 -04:00
Alexandre Oliva a500588aa5 take type from intrinsic in sincos pass
This is a first step towards enabling the sincos optimization in Ada.

The issue this patch solves is that sincos takes the type to be looked
up with mathfn_built_in from variables or temporaries passed as
arguments to SIN and COS intrinsics.  In Ada, different float types
may be used but, despite their representation equivalence, their
distinctness causes the optimization to be skipped, because they are
not the types that mathfn_built_in expects.

This patch introduces a function that maps intrinsics to the type
they're associated with, and uses that type, obtained from the
intrinsics used in calls to be optimized, to look up the corresponding
CEXPI intrinsic.

For the sake of defensive programming, when using the type obtained
from the intrinsic, it now checks that the types are interchangeable
if different types are found for the used argument, or for other
calls that use it.


for  gcc/ChangeLog

	* builtins.c (mathfn_built_in_type): New.
	* builtins.h (mathfn_built_in_type): Declare.
	* tree-ssa-math-opts.c (execute_cse_sincos_1): Use it to
	obtain the type expected by the intrinsic.
2020-10-08 17:12:18 -03:00
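For context on the commit above (a500588aa5), the sincos pass looks for sin and cos calls that share the same argument. A minimal sketch of that source pattern follows; the struct and function names are hypothetical. Whether the two calls are actually combined depends on the target, the optimization level and the math library, so the sketch makes no claim about the generated code.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical result type holding both values.
struct SinCos
{
  double s;
  double c;
};

// Both calls take the same argument, which is the pattern the sincos
// optimization can recognize.
inline SinCos sin_and_cos(double x)
{
  return { std::sin(x), std::cos(x) };
}
```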
Nathan Sidwell d1c566d72d libcpp: Directly peek for initial line marker
Using the tokenizer to sniff for an initial line marker for
preprocessed input is a little brittle, particularly with
-fdirectives-only.  If there is no marker we'll happily munch initial
comments.  This patch directly sniffs the buffer.  This is safe
because the initial line marker was machine generated and must be
right at the beginning of the file.  Anything else is not such a line
marker.  The same is true for the initial directory marker.  For that
tokenizing the string is simplest, but at that point it's either a
regular line marker or a directory marker.  If it's a regular marker,
unwinding tokens is fine.

	libcpp/
	* internal.h (enum include_type): Rename IT_MAIN_INJECT to
	IT_PRE_MAIN.
	* init.c (cpp_read_main_file): If there is no line marker, adjust
	the initial line marker.
	(read_original_filename): Return bool, peek the buffer directly
	before trying to tokenize.
	(read_original_directory): Likewise.  Directly prod the string
	literal.
	* files.c (_cpp_stack_file): Adjust for IT_PRE_MAIN change.
2020-10-08 12:16:21 -07:00
Will Schmidt cd23ed8af2 [PATCH, rs6000] Rename BU_P10_MISC_2 define to BU_P10_POWERPC64_MISC_2
Rename our BU_P10_MISC_2 built-in define macro to be
BU_P10_POWERPC64_MISC_2.   This more accurately reflects
that the macro includes the RS6000_BTM_POWERPC64 entry,
and matches the style we used for the P7 equivalent.

gcc/ChangeLog:
	* config/rs6000/rs6000-builtin.def (BU_P10_MISC_2): Rename
	to BU_P10_POWERPC64_MISC_2.
	(CFUGED, CNTLZDM, CNTTZDM, PDEPD, PEXTD): Call renamed macro.
2020-10-08 10:39:10 -05:00
Jan Hubicka 3e1123e52f Disable TBAA in some uses of call_may_clobber_ref_p
* tree-nrv.c (dest_safe_for_nrv_p): Disable tbaa in
	call_may_clobber_ref_p and ref_maybe_used_by_stmt_p.
	* tree-tailcall.c (find_tail_calls): Likewise.
	* tree-ssa-alias.c (call_may_clobber_ref_p): Add tbaa_p parameter.
	* tree-ssa-alias.h (call_may_clobber_ref_p): Update prototype.
	* tree-ssa-sccvn.c (vn_reference_lookup_3): Pass data->tbaa_p
	to call_may_clobber_ref_p_1.
2020-10-08 17:23:16 +02:00
Mark Wielaard 3a9e6ee42a debug: Make sure to output .file 0 when generating DWARF5.
When gas outputs DWARF5 .debug_line[_str] then we have to tell it the
comp_dir and main file name for the zero entry line table. Otherwise
gas has to guess at the CU compilation directory and file.

Before a gcc -gdwarf-5 ../src/hello.c line table looked like:

Directory table:
 0     ../src (24)
 1     ../src (24)
 2     /usr/include (31)

File name table:
 0     hello.c (16),  0
 1     hello.c (16),  1
 2     stdio.h (44),  2

With this patch it looks like:

Directory table:
 0     /tmp/obj (0)
 1     ../src (24)
 2     /usr/include (31)

File name table:
 0     ../src/hello.c (9),  0
 1     hello.c (16),  1
 2     stdio.h (44),  2

gcc/ChangeLog:

	* dwarf2out.c (dwarf2out_finish): Emit .file 0 entry when
	generating DWARF5 .debug_line table through gas.
2020-10-08 17:17:00 +02:00
qing zhao baf4750fea Improve documentation of -fallow-store-data-races
2020-10-08  John Henning  <john.henning@oracle.com>

gcc/

	PR other/97309
	* doc/invoke.texi: Improve documentation of
	-fallow-store-data-races.
2020-10-08 17:01:07 +02:00
Jonathan Wakely b2a96bf9dc libstdc++: Add assertions for preconditions in sampling distributions [PR 82584]
These three distributions all require 0 < S where S is the sum of the
weights. When the sum is zero there's an undefined FP division by zero.
Add assertions to help users diagnose the problem.

libstdc++-v3/ChangeLog:

	PR libstdc++/82584
	* include/bits/random.tcc
	(discrete_distribution::param_type::_M_initialize)
	(piecewise_constant_distribution::param_type::_M_initialize)
	(piecewise_linear_distribution::param_type::_M_initialize):
	Add assertions for positive sums.
	* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error
	line.
2020-10-08 15:24:21 +01:00
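The precondition from the commit above (b2a96bf9dc), 0 < S where S is the sum of the weights, can also be checked by callers before constructing a distribution. A minimal sketch for std::discrete_distribution; the helper name make_discrete and the choice to throw std::invalid_argument are assumptions for illustration, not part of libstdc++.

```cpp
#include <cassert>
#include <numeric>
#include <random>
#include <stdexcept>
#include <vector>

// Hypothetical helper: reject a non-empty weight list whose sum is not
// positive, instead of relying on the library to diagnose it.
inline std::discrete_distribution<int>
make_discrete(const std::vector<double>& weights)
{
  if (!weights.empty())
    {
      const double sum
        = std::accumulate(weights.begin(), weights.end(), 0.0);
      if (!(sum > 0.0))
        throw std::invalid_argument("weights must have a positive sum");
    }
  return std::discrete_distribution<int>(weights.begin(), weights.end());
}
```

Calling make_discrete({0.0, 0.0}) throws, while make_discrete({1.0, 3.0}) gives a distribution with probabilities 0.25 and 0.75.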
Christophe Lyon 5a448362da arm: [MVE] Add missing __arm_vcvtnq_u32_f32 intrinsic (PR 96914)
__arm_vcvtnq_u32_f32 was missing from arm_mve.h, although the s32_f32 and
[su]16_f16 versions were present.

This patch adds the missing version and testcase, which are
cut-and-paste from the other versions.

2020-10-08  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	PR target/96914
	* config/arm/arm_mve.h (__arm_vcvtnq_u32_f32): New.

	gcc/testsuite/
	PR target/96914
	* gcc.target/arm/mve/intrinsics/vcvtnq_u32_f32.c: New test.
2020-10-08 14:18:45 +00:00
Richard Biener 181702ef8a SLP vectorize multiple BBs at once
This work from Martin Liska was motivated by gcc.dg/vect/bb-slp-22.c
which shows how poorly we currently BB vectorize code like

  a0 = in[0] + 23;
  a1 = in[1] + 142;
  a2 = in[2] + 2;
  a3 = in[3] + 31;

  if (x > y)
    {
      b[0] = a0;
      b[1] = a1;
      b[2] = a2;
      b[3] = a3;
    }
  else
    {
      out[0] = a0 * (x + 1);
      out[1] = a1 * (y + 1);
      out[2] = a2 * (x + 1);
      out[3] = a3 * (y + 1);
    }

namely by vectorizing the stores but not the common load (and add)
they are fed with.

Thus with the following patch we change the BB vectorizer from
operating on a single basic-block at a time to consider somewhat
larger regions (but not the whole function yet because of issues
with vector size iteration).

I took the opportunity to remove the fancy region iterations again
now that we operate on BB granularity and in the end need to visit
PHI nodes as well.

2020-10-08  Martin Liska  <mliska@suse.cz>
	    Richard Biener  <rguenther@suse.de>

	* tree-vectorizer.h (_bb_vec_info::const_iterator): Remove.
	(_bb_vec_info::const_reverse_iterator): Likewise.
	(_bb_vec_info::region_stmts): Likewise.
	(_bb_vec_info::reverse_region_stmts): Likewise.
	(_bb_vec_info::_bb_vec_info): Adjust.
	(_bb_vec_info::bb): Remove.
	(_bb_vec_info::region_begin): Remove.
	(_bb_vec_info::region_end): Remove.
	(_bb_vec_info::bbs): New vector of BBs.
	(vect_slp_function): Declare.
	* tree-vect-patterns.c (vect_determine_precisions): Use
	regular stmt iteration.
	(vect_pattern_recog): Likewise.
	* tree-vect-slp.c: Include cfganal.h, tree-eh.h and tree-cfg.h.
	(vect_build_slp_tree_1): Properly refuse to vectorize
	volatile and throwing stmts.
	(vect_build_slp_tree_2): Pass group-size down to
	get_vectype_for_scalar_type.
	(_bb_vec_info::_bb_vec_info): Use regular stmt iteration,
	adjust for changed region specification.
	(_bb_vec_info::~_bb_vec_info): Likewise.
	(vect_slp_check_for_constructors): Likewise.
	(vect_slp_region): Likewise.
	(vect_slp_bbs): New worker operating on a vector of BBs.
	(vect_slp_bb): Wrap it.
	(vect_slp_function): New function splitting the function
	into multi-BB regions.
	(vect_create_constant_vectors): Handle the case of inserting
	after a throwing def.
	(vect_schedule_slp_instance): Adjust.
	* tree-vectorizer.c (vec_info::remove_stmt): Simplify again.
	(vec_info::insert_seq_on_entry): Adjust.
	(pass_slp_vectorize::execute): Also init PHIs.  Call
	vect_slp_function.

	* gcc.dg/vect/bb-slp-22.c: Adjust.
	* gfortran.dg/pr68627.f: Likewise.
2020-10-08 16:07:15 +02:00
Jonathan Wakely f997b67550 libstdc++: Add C++11 member functions for ios::failure in old ABI
The new constructors that C++11 added to std::ios_base::failure were
missing for the old ABI. This adds them, but just ignores the
std::error_code argument (because there's nowhere to store it).

This also adds a code() member, which should be provided by the
std::system_error base class, but that base class isn't present in the
old ABI.

This allows the old ios::failure to be used in code that expects the new
API, although with reduced functionality.

libstdc++-v3/ChangeLog:

	* include/bits/ios_base.h (ios_base::failure): Add constructors
	taking error_code argument. Add code() member function.
	* testsuite/27_io/ios_base/failure/cxx11.cc: Allow test to
	run for the old ABI but do not check for derivation from
	std::system_error.
	* testsuite/27_io/ios_base/failure/error_code.cc: New test.
2020-10-08 14:45:37 +01:00
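The C++11 constructor mentioned in the commit above (f997b67550) takes a message and a std::error_code. A minimal sketch of code written against that API; the function name describe_failure and the message text are hypothetical. Per the commit message, the old ABI ignores the error_code argument, so the sketch only relies on what().

```cpp
#include <cassert>
#include <ios>
#include <string>
#include <system_error>

// Hypothetical helper: throw and catch an ios_base::failure built with
// the two-argument C++11 constructor, returning its description.
inline std::string describe_failure()
{
  try
    {
      throw std::ios_base::failure(
        "read failed", std::make_error_code(std::io_errc::stream));
    }
  catch (const std::ios_base::failure& e)
    {
      // Only what() is used here, since code() may not reflect the
      // argument when the error_code is ignored.
      return e.what();
    }
}
```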
Jonathan Wakely c06617a79b libstdc++: Avoid divide by zero in default template arguments
My previous attempt to fix this only worked when m is a power of two.
There is still a bug when a==0 and !has_single_bit(m).

Instead of trying to make _Mod work for a==0 this change ensures that we
never instantiate it with a==0. For C++17 we can use if-constexpr, but
otherwise we need to use a different multiplier. It doesn't matter what
we use, as it won't actually be called, only instantiated.

libstdc++-v3/ChangeLog:

	* include/bits/random.h (__detail::_Mod): Revert last change.
	(__detail::__mod): Do not use _Mod for a==0 case.
	* testsuite/26_numerics/random/linear_congruential_engine/operators/call.cc:
	Check other cases with a==0. Also check runtime results.
	* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error
	line.
2020-10-08 14:45:36 +01:00