Commit Graph

185962 Commits

Author SHA1 Message Date
Trevor Saunders
e9681f5725 auto_vec copy/move improvements
- Unfortunately using_auto_storage () needs to handle m_vec being null.
- Handle self move of an auto_vec.
- Make sure auto_vec defines the class's move constructor and assignment
  operator, as well as ones taking vec<T>, so the compiler does not generate
  them for us.  Per https://en.cppreference.com/w/cpp/language/move_constructor
  the ones taking vec<T> do not count as the class's move constructor or
  assignment operator, but we want them as well to assign a plain vec to an
  auto_vec.
- Explicitly delete auto_vec's copy constructor and assignment operator.  This
  prevents unintentional expensive copies of the vector and makes it clear
  when copies are needed that that is what is intended.  When it is necessary to
  copy a vector, copy () can be used.
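
As a rough illustration of the points above, here is a minimal sketch of a move-only owning container.  The class owned_vec and its members are hypothetical stand-ins, not the real auto_vec from vec.h; the sketch only shows deleted copy operations, a self-move-safe move assignment, null storage handling, and an explicit copy ().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for an owning vector; not GCC's auto_vec.
template <typename T>
class owned_vec
{
public:
  owned_vec () = default;
  explicit owned_vec (std::size_t n)
    : m_data (n ? new T[n] () : nullptr), m_len (n) {}
  ~owned_vec () { delete[] m_data; }

  // Copies must be requested explicitly through copy ().
  owned_vec (const owned_vec &) = delete;
  owned_vec &operator= (const owned_vec &) = delete;

  // Move construction steals the storage and leaves the source empty.
  owned_vec (owned_vec &&other) noexcept
    : m_data (std::exchange (other.m_data, nullptr)),
      m_len (std::exchange (other.m_len, 0)) {}

  // Move assignment must tolerate being handed *this.
  owned_vec &operator= (owned_vec &&other) noexcept
  {
    if (this != &other)
      {
        delete[] m_data;
        m_data = std::exchange (other.m_data, nullptr);
        m_len = std::exchange (other.m_len, 0);
      }
    return *this;
  }

  // The only way to duplicate the contents; works for null storage too.
  owned_vec copy () const
  {
    owned_vec result (m_len);
    std::copy (m_data, m_data + m_len, result.m_data);
    return result;
  }

  std::size_t length () const { return m_len; }
  T &operator[] (std::size_t i) { assert (i < m_len); return m_data[i]; }

private:
  T *m_data = nullptr;
  std::size_t m_len = 0;
};
```

With this shape, `owned_vec<int> b = a;` fails to compile, while `a.copy ()` and `std::move (a)` state the intent at the call site.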

Signed-off-by: Trevor Saunders <tbsaunde@tbsaunde.org>

gcc/ChangeLog:

	* vec.h (vl_ptr>::using_auto_storage): Handle null m_vec.
	(auto_vec<T, 0>::auto_vec): Define move constructor, and delete copy
	constructor.
	(auto_vec<T, 0>::operator=): Define move assignment and delete copy
	assignment.
2021-06-17 04:43:26 -04:00
Aldy Hernandez
3f3ee13959 Add debugging helpers for ranger.
These are aids for debugging ranger-based passes.

gcc/ChangeLog:

	* gimple-range.cc (debug_seed_ranger): New.
	(dump_ranger): New.
	(debug_ranger): New.
2021-06-17 10:29:28 +02:00
Richard Biener
3dfa4fe9f1 Vectorization of BB reductions
This adds a simple reduction vectorization capability to the
non-loop vectorizer.  Simple meaning it lacks any of the fancy
ways to generate the reduction epilogue but only supports
those we can handle via a direct internal function reducing
a vector to a scalar.  One of the main reasons is to avoid
massive refactoring at this point but also that more complex
epilogue operations are hardly profitable.

Mixed sign reductions are fended off for now, and I have not yet
settled whether we want an explicit SLP node for the
reduction epilogue operation.  Handling mixed signs could be
done by multiplying with a { 1, -1, .. } vector.  Also fended
off are reductions with non-internal operands (constants
or register parameters for example).

Costing is done by accounting the original scalar participating
stmts for the scalar cost and log2 permutes and operations for
the vectorized epilogue.
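
To make the shape of such a reduction concrete, here is a small scalar sketch.  The function names are hypothetical and the code only emulates the idea in plain C++; it is not the vectorizer's output.  It shows a four-lane straight-line reduction, a log2-step combine of the kind the costing paragraph counts, and the { 1, -1, .. } multiplication suggested above for mixed signs.

```cpp
#include <array>
#include <cassert>

// Straight-line reduction in a single basic block: a possible seed.
int sum4_scalar (const int *a)
{
  return a[0] + a[1] + a[2] + a[3];
}

// Emulates a vector epilogue with log2(4) = 2 combine steps.
int sum4_log2 (const int *a)
{
  std::array<int, 4> v = {a[0], a[1], a[2], a[3]};
  // Step 1: fold the upper half onto the lower half.
  v[0] += v[2];
  v[1] += v[3];
  // Step 2: fold the remaining two lanes.
  v[0] += v[1];
  return v[0];
}

// Mixed sign chain as written in the source.
int mixed_sign_scalar (const int *a)
{
  return a[0] - a[1] + a[2] - a[3];
}

// Same chain expressed as a lane-wise multiply followed by a plain sum.
int mixed_sign_via_multiply (const int *a)
{
  const int signs[4] = {1, -1, 1, -1};
  int t[4];
  for (int i = 0; i < 4; ++i)
    t[i] = a[i] * signs[i];
  return sum4_log2 (t);
}
```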

--

SPEC CPU 2017 FP with rate workload measurements show (picked
fastest runs of three) regressions for 507.cactuBSSN_r (1.5%),
508.namd_r (2.5%), 511.povray_r (2.5%), 526.blender_r (0.5%) and
527.cam4_r (2.5%) and improvements for 510.parest_r (5%) and
538.imagick_r (1.5%).  This is with -Ofast -march=znver2 on a Zen2.

Statistics on CPU 2017 show that the overwhelming number of seeds
we find are reductions of two lanes (well - that's basically every
associative operation).  That means we put quite high pressure
on the SLP discovery process this way.

In total we find 583218 seeds we put to SLP discovery out of which
66205 pass that and only 6185 of those make it through
code generation checks. 796 of those are discarded because the reduction
is part of a larger SLP instance.  4195 of the remaining
are deemed not profitable to vectorize and 1194 are finally
vectorized.  That's a poor 0.2% rate.

Of the 583218 seeds 486826 (83%) have two lanes, 60912 have three (10%),
28181 four (5%), 4808 five, 909 six and there are instances up to 120
lanes.

There's a set of 54086 candidate seeds we reject because
they contain a constant or invariant (not implemented yet) but still
have two or more lanes that could be put to SLP discovery.

2021-06-16  Richard Biener   <rguenther@suse.de>

	PR tree-optimization/54400
	* tree-vectorizer.h (enum slp_instance_kind): Add
	slp_inst_kind_bb_reduc.
	(reduction_fn_for_scalar_code): Declare.
	* tree-vect-data-refs.c (vect_slp_analyze_instance_dependence):
	Check SLP_INSTANCE_KIND instead of looking at the
	representative.
	(vect_slp_analyze_instance_alignment): Likewise.
	* tree-vect-loop.c (reduction_fn_for_scalar_code): Export.
	* tree-vect-slp.c (vect_slp_linearize_chain): Split out
	chain linearization from vect_build_slp_tree_2 and generalize
	for the use of BB reduction vectorization.
	(vect_build_slp_tree_2): Adjust accordingly.
	(vect_optimize_slp): Elide permutes at the root of BB reduction
	instances.
	(vectorizable_bb_reduc_epilogue): New function.
	(vect_slp_prune_covered_roots): Likewise.
	(vect_slp_analyze_operations): Use them.
	(vect_slp_check_for_constructors): Recognize associatable
	chains for BB reduction vectorization.
	(vectorize_slp_instance_root_stmt): Generate code for the
	BB reduction epilogue.

	* gcc.dg/vect/bb-slp-pr54400.c: New testcase.
2021-06-17 09:52:07 +02:00
Aldy Hernandez
9f12bd79c0 Add amacleod and aldyh as *vrp and ranger maintainers.
ChangeLog:

	* MAINTAINERS (Various Maintainers): Add Andrew and myself
	as *vrp and ranger maintainers.
2021-06-17 09:49:43 +02:00
Arnaud Charlet
607507410e [Ada] Use runtime from base compiler during stage1 (continued)
gcc/ada/

	* gcc-interface/Make-lang.in: Use libgnat.so if libgnat.a cannot
	be found.
2021-06-17 01:45:05 -04:00
Jason Merrill
ff4deb4b1d c++: Tweak PR101029 fix
The case of an initializer with side effects for a zero-length array seems
extremely unlikely, but we should still return the right type in that case.

	PR c++/101029

gcc/cp/ChangeLog:

	* init.c (build_vec_init): Preserve the type of base.
2021-06-16 23:38:32 -04:00
GCC Administrator
9a61dfdb5e Daily bump.
2021-06-17 00:16:54 +00:00
Andrew MacLeod
786188e8b8 Add recomputation to outgoing_edge_range.
The gori engine can calculate outgoing ranges for exported values.  This
change allows 1st degree recomputation.  If a name is not exported from a
block, but one of the ssa_names used directly in computing it is, then
we can recompute the ssa_name on the edge using the edge values for its
operands.
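
A toy model of the idea, assuming closed integer intervals and hypothetical helper names (this is not the ranger or GORI API): for `b = a + 1; if (a < 10) ...`, the block exports `a` but not `b`, yet `b` can be recomputed on the true edge from the edge range of `a`.

```cpp
#include <cassert>

// Hypothetical closed interval [lo, hi]; not GCC's irange.
struct interval
{
  long lo;
  long hi;
};

// Range of `a` on the true edge of `if (a < bound)`.
interval true_edge_of_less_than (interval a, long bound)
{
  interval r = a;
  if (r.hi > bound - 1)
    r.hi = bound - 1;
  return r;
}

// Recompute `b = a + 1` from the range `a` has on the edge.
interval recompute_plus_one (interval a_on_edge)
{
  return {a_on_edge.lo + 1, a_on_edge.hi + 1};
}
```

If `a` is known to be in [0, 100] on entry, the true edge gives `a` the range [0, 9], and recomputation gives `b` the range [1, 10] on that edge.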

	* gimple-range-gori.cc (gori_compute::has_edge_range_p): Check with
	may_recompute_p.
	(gori_compute::may_recompute_p): New.
	(gori_compute::outgoing_edge_range_p): Perform recomputations.
	* gimple-range-gori.h (class gori_compute): Add prototype.
2021-06-16 20:07:40 -04:00
Andrew MacLeod
8a22a10c78 Range_on_edge in ranger_cache should return true for all ranges.
Range_on_edge was implemented in the cache to always return a range, but
only returned true when the edge actually changed the range.
Return true with any range that can be calculated.

	* gimple-range-cache.cc (ranger_cache::range_on_edge): Always return
	true when a range can be calculated.
	* gimple-range.cc (gimple_ranger::dump_bb): Check has_edge_range_p.
2021-06-16 20:07:40 -04:00
Martin Sebor
487be9201c Correct documented option defaults.
gcc/ChangeLog:

	* doc/invoke.texi (-Wmismatched-dealloc, -Wmismatched-new-delete):
	Correct documented defaults.
2021-06-16 16:52:05 -06:00
Jason Merrill
6816a44dfe c++: static memfn from non-dependent base [PR101078]
After my patch for PR91706, or before that with the qualified call,
tsubst_baselink returned a BASELINK with BASELINK_BINFO indicating a base of
a still-dependent derived class.  We need to look up the relevant base binfo
in the substituted class.

	PR c++/101078
	PR c++/91706

gcc/cp/ChangeLog:

	* pt.c (tsubst_baselink): Update binfos in non-dependent case.

gcc/testsuite/ChangeLog:

	* g++.dg/template/access39.C: New test.
2021-06-16 17:28:36 -04:00
Harald Anlauf
cfe0a2ec26 Fortran - ICE in gfc_check_do_variable, at fortran/parse.c:4446
Avoid NULL pointer dereferences during error recovery.

gcc/fortran/ChangeLog:

	PR fortran/95501
	PR fortran/95502
	* expr.c (gfc_check_pointer_assign): Avoid NULL pointer
	dereference.
	* match.c (gfc_match_pointer_assignment): Likewise.
	* parse.c (gfc_check_do_variable): Avoid comparison with NULL
	symtree.

gcc/testsuite/ChangeLog:

	PR fortran/95501
	PR fortran/95502
	* gfortran.dg/pr95502.f90: New test.
2021-06-16 22:04:22 +02:00
Harald Anlauf
d117f992d8 Revert "Fortran - ICE in gfc_check_do_variable, at fortran/parse.c:4446"
This reverts commit 72e3d92178.
2021-06-16 22:00:52 +02:00
Harald Anlauf
72e3d92178 Fortran - ICE in gfc_check_do_variable, at fortran/parse.c:4446
Avoid NULL pointer dereferences during error recovery.

gcc/fortran/ChangeLog:

	PR fortran/95501
	PR fortran/95502
	* expr.c (gfc_check_pointer_assign): Avoid NULL pointer
	dereference.
	* match.c (gfc_match_pointer_assignment): Likewise.
	* parse.c (gfc_check_do_variable): Avoid comparison with NULL
	symtree.

gcc/testsuite/ChangeLog:

	PR fortran/95501
	PR fortran/95502
	* gfortran.dg/pr95502.f90: New test.
2021-06-16 21:54:16 +02:00
Andrew MacLeod
bdfc1207bd Avoid loading an undefined value in the ranger_cache constructor.
Enable_new_values takes a boolean, returning the old value.  The constructor
for ranger_cache initialized the m_new_value_p field by calling this routine
and ignoring the result.  This potentially loads the old value uninitialized.
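
The pattern can be sketched as follows.  The class name cache_sketch is hypothetical and the initial value `true` is an assumption for illustration; only the member name and the set-and-return-old behaviour come from the description above.

```cpp
#include <cassert>

// Hypothetical reduction of the pattern; not the real ranger_cache.
class cache_sketch
{
public:
  // Initialize the flag directly.  Calling enable_new_values (true) here
  // instead would read m_new_value_p before it holds any value.
  cache_sketch () : m_new_value_p (true) {}

  // Set the flag and return its previous value.
  bool enable_new_values (bool state)
  {
    bool old = m_new_value_p;  // reads the member: must be initialized
    m_new_value_p = state;
    return old;
  }

private:
  bool m_new_value_p;
};
```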

	* gimple-range-cache.cc (ranger_cache::ranger_cache): Initialize
	m_new_value_p directly.
2021-06-16 13:01:21 -04:00
Jason Merrill
9e64426dae libcpp: location comparison within macro [PR100796]
The patch for 96391 changed linemap_compare_locations to give up on
comparing locations from macro expansions if we don't have column
information.  But in this testcase, the BOILERPLATE macro is multiple lines
long, so we do want to compare locations within the macro.  So this patch
moves the LINE_MAP_MAX_LOCATION_WITH_COLS check inside the block, to use it
for failing gracefully.

	PR c++/100796
	PR preprocessor/96391

libcpp/ChangeLog:

	* line-map.c (linemap_compare_locations): Only use comparison with
	LINE_MAP_MAX_LOCATION_WITH_COLS to avoid abort.

gcc/testsuite/ChangeLog:

	* g++.dg/plugin/location-overflow-test-pr100796.c: New test.
	* g++.dg/plugin/plugin.exp: Run it.
2021-06-16 11:41:08 -04:00
Uros Bizjak
dd835ec24b i386: Add missing two element 64bit vector permutations [PR89021]
In addition to V8QI permutations, several other missing permutations are
added for 64bit vector modes for TARGET_SSSE3 and TARGET_SSE4_1 targets.

2021-06-16  Uroš Bizjak  <ubizjak@gmail.com>

gcc/
	PR target/89021
	* config/i386/i386-expand.c (expand_vec_perm_2perm_pblendv):
	Handle 64bit modes for TARGET_SSE4_1.
	(expand_vec_perm_pshufb2): Handle 64bit modes for TARGET_SSSE3.
	(expand_vec_perm_even_odd_pack): Handle V4HI mode.
	(expand_vec_perm_even_odd_1) <case E_V4HImode>: Expand via
	expand_vec_perm_pshufb2 for TARGET_SSSE3 and via
	expand_vec_perm_even_odd_pack for TARGET_SSE4_1.
	* config/i386/mmx.md (mmx_packusdw): New insn pattern.
2021-06-16 16:07:52 +02:00
Jonathan Wakely
c25e3bf879 libstdc++: Use named struct for __decay_copy
In r12-1486-gcb326a6442f09cb36b05ce556fc91e10bfeb0cf6 I changed
__decay_copy to be a function object of unnamed class type. This causes
problems when importing the library headers:

error: conflicting global module declaration 'constexpr const std::ranges::__cust_access::<unnamed struct> std::ranges::__cust_access::__decay_copy'

The fix is to use a named struct instead of an anonymous one.
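
A minimal sketch of the named-struct form, assuming C++17.  The namespace sketch and the type name decay_copy_fn are hypothetical and are not the identifiers used in libstdc++.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

namespace sketch
{
  // Problematic shape (variable of unnamed class type):
  //   inline constexpr struct { /* operator() */ } decay_copy{};

  // Named type: every declaration of the variable refers to the same
  // class, which can be identified by name.
  struct decay_copy_fn
  {
    template <typename T>
    constexpr std::decay_t<T> operator() (T &&t) const
    {
      return std::forward<T> (t);
    }
  };

  inline constexpr decay_copy_fn decay_copy{};
}
```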

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* include/bits/iterator_concepts.h (__decay_copy): Name type.
2021-06-16 14:31:13 +01:00
Jonathan Wakely
b9e35ee6d6 libstdc++: Revert final/non-addressable changes to ranges CPOs
In r12-1489-g8b93548778a487f31f21e0c6afe7e0bde9711fc4 I made the
[range.access] CPO types final and non-addressable. Tim Song pointed out
this is wrong. Only the [range.iter.ops] functions should be final and
non-addressable. Revert the changes to the [range.access] objects.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* include/bits/ranges_base.h (ranges::begin, ranges::end)
	(ranges::cbegin, ranges::cend, ranges::rbegin, ranges::rend)
	(ranges::crbegin, ranges::crend, ranges::size, ranges::ssize)
	(ranges::empty, ranges::data, ranges::cdata): Remove final
	keywords and deleted operator& overloads.
	* testsuite/24_iterators/customization_points/iter_move.cc: Use
	new is_customization_point_object function.
	* testsuite/24_iterators/customization_points/iter_swap.cc:
	Likewise.
	* testsuite/std/concepts/concepts.lang/concept.swappable/swap.cc:
	Likewise.
	* testsuite/std/ranges/access/begin.cc: Likewise.
	* testsuite/std/ranges/access/cbegin.cc: Likewise.
	* testsuite/std/ranges/access/cdata.cc: Likewise.
	* testsuite/std/ranges/access/cend.cc: Likewise.
	* testsuite/std/ranges/access/crbegin.cc: Likewise.
	* testsuite/std/ranges/access/crend.cc: Likewise.
	* testsuite/std/ranges/access/data.cc: Likewise.
	* testsuite/std/ranges/access/empty.cc: Likewise.
	* testsuite/std/ranges/access/end.cc: Likewise.
	* testsuite/std/ranges/access/rbegin.cc: Likewise.
	* testsuite/std/ranges/access/rend.cc: Likewise.
	* testsuite/std/ranges/access/size.cc: Likewise.
	* testsuite/std/ranges/access/ssize.cc: Likewise.
	* testsuite/util/testsuite_iterators.h
	(is_customization_point_object): New function.
2021-06-16 14:31:04 +01:00
Jonathan Wright
dbfc149b63 aarch64: Model zero-high-half semantics of ADDHN/SUBHN instructions
Model the zero-high-half semantics of the narrowing arithmetic Neon
instructions in the aarch64_<sur><addsub>hn<mode> RTL pattern.
Modeling these semantics allows for better RTL combinations while
also removing some register allocation issues as the compiler now
knows that the operation is totally destructive.

Add new tests to narrow_zero_high_half.c to verify the benefit of
this change.

gcc/ChangeLog:

2021-06-14  Jonathan Wright  <jonathan.wright@arm.com>

	* config/aarch64/aarch64-simd.md (aarch64_<sur><addsub>hn<mode>):
	Change to an expander that emits the correct instruction
	depending on endianness.
	(aarch64_<sur><addsub>hn<mode>_insn_le): Define.
	(aarch64_<sur><addsub>hn<mode>_insn_be): Define.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/narrow_zero_high_half.c: Add new tests.
2021-06-16 14:22:42 +01:00
Jonathan Wright
d0889b5d37 aarch64: Model zero-high-half semantics of [SU]QXTN instructions
Split the aarch64_<su>qmovn<mode> pattern into separate scalar and
vector variants. Further split the vector RTL pattern into big/
little endian variants that model the zero-high-half semantics of the
underlying instruction. Modeling these semantics allows for better
RTL combinations while also removing some register allocation issues
as the compiler now knows that the operation is totally destructive.

Add new tests to narrow_zero_high_half.c to verify the benefit of
this change.

gcc/ChangeLog:

2021-06-14  Jonathan Wright  <jonathan.wright@arm.com>

	* config/aarch64/aarch64-simd-builtins.def: Split generator
	for aarch64_<su>qmovn builtins into scalar and vector
	variants.
	* config/aarch64/aarch64-simd.md (aarch64_<su>qmovn<mode>_insn_le):
	Define.
	(aarch64_<su>qmovn<mode>_insn_be): Define.
	(aarch64_<su>qmovn<mode>): Split into scalar and vector
	variants. Change vector variant to an expander that emits the
	correct instruction depending on endianness.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/narrow_zero_high_half.c: Add new tests.
2021-06-16 14:22:22 +01:00
Jonathan Wright
c86a303968 aarch64: Model zero-high-half semantics of SQXTUN instruction in RTL
Split the aarch64_sqmovun<mode> pattern into separate scalar and
vector variants. Further split the vector pattern into big/little
endian variants that model the zero-high-half semantics of the
underlying instruction. Modeling these semantics allows for better
RTL combinations while also removing some register allocation issues
as the compiler now knows that the operation is totally destructive.

Add new tests to narrow_zero_high_half.c to verify the benefit of
this change.

gcc/ChangeLog:

2021-06-14  Jonathan Wright  <jonathan.wright@arm.com>

	* config/aarch64/aarch64-simd-builtins.def: Split generator
	for aarch64_sqmovun builtins into scalar and vector variants.
	* config/aarch64/aarch64-simd.md (aarch64_sqmovun<mode>):
	Split into scalar and vector variants. Change vector variant
	to an expander that emits the correct instruction depending
	on endianness.
	(aarch64_sqmovun<mode>_insn_le): Define.
	(aarch64_sqmovun<mode>_insn_be): Define.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/narrow_zero_high_half.c: Add new tests.
2021-06-16 14:22:08 +01:00
Jonathan Wright
d8a88cdae9 aarch64: Model zero-high-half semantics of XTN instruction in RTL
Modeling the zero-high-half semantics of the XTN narrowing
instruction in RTL indicates to the compiler that this is a totally
destructive operation. This enables more RTL simplifications and also
prevents some register allocation issues.

Add new tests to narrow_zero_high_half.c to verify the benefit of
this change.

gcc/ChangeLog:

2021-06-11  Jonathan Wright  <jonathan.wright@arm.com>

	* config/aarch64/aarch64-simd.md (aarch64_xtn<mode>_insn_le):
	Define - modeling zero-high-half semantics.
	(aarch64_xtn<mode>): Change to an expander that emits the
	appropriate instruction depending on endianness.
	(aarch64_xtn<mode>_insn_be): Define - modeling zero-high-half
	semantics.
	(aarch64_xtn2<mode>_le): Rename to...
	(aarch64_xtn2<mode>_insn_le): This.
	(aarch64_xtn2<mode>_be): Rename to...
	(aarch64_xtn2<mode>_insn_be): This.
	(vec_pack_trunc_<mode>): Emit truncation instruction instead
	of aarch64_xtn.
	* config/aarch64/iterators.md (Vnarrowd): Add Vnarrowd mode
	attribute iterator.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/narrow_zero_high_half.c: Add new tests.
2021-06-16 14:21:52 +01:00
Jonathan Wright
ac6c858d07 testsuite: aarch64: Add zero-high-half tests for narrowing shifts
Add tests to verify that Neon narrowing-shift instructions clear the
top half of the result vector. It is sufficient to show that a
subsequent combine with a zero-vector is optimized away - leaving
just the narrowing-shift instruction.

gcc/testsuite/ChangeLog:

2021-06-15  Jonathan Wright  <jonathan.wright@arm.com>

	* gcc.target/aarch64/narrow_zero_high_half.c: New test.
2021-06-16 14:21:34 +01:00
Martin Jambor
d7deee423f tree-sra: Do not refresh readonly decls (PR 100453)
When SRA transforms an assignment where the RHS is an aggregate decl
that it creates replacements for, the (least efficient) fallback
method of dealing with them is to store all the replacements back into
the original decl and then let the original assignment take its
course.

That of course should not need to be done for TREE_READONLY bases
which cannot change contents.  The SRA code handled this situation
only for DECL_IN_CONSTANT_POOL const decls; this patch modifies the
check so that it tests for TREE_READONLY.  I also looked at all
other callers of generate_subtree_copies and added checks to another
one dealing with the same exact situation and one which deals with it
in a non-assignment context.

gcc/ChangeLog:

2021-06-11  Martin Jambor  <mjambor@suse.cz>

	PR tree-optimization/100453
	* tree-sra.c (create_access): Disqualify any const candidates
	which are written to.
	(sra_modify_expr): Do not store sub-replacements back to a const base.
	(handle_unscalarized_data_in_subtree): Likewise.
	(sra_modify_assign): Likewise.  Earlier, use TREE_READONLY test
	instead of constant_decl_p.

gcc/testsuite/ChangeLog:

2021-06-10  Martin Jambor  <mjambor@suse.cz>

	PR tree-optimization/100453
	* gcc.dg/tree-ssa/pr100453.c: New test.
2021-06-16 13:23:14 +02:00
Jakub Jelinek
a490b1dc0b testsuite: Use noipa attribute instead of noinline, noclone
I've noticed this test now on various arches sometimes FAILs, sometimes
PASSes (the line 12 test in particular).

The problem is that a = 0; initialization in the caller no longer happens
before the f(&a) call as what the argument points to is only used in
debug info.

Making the function noipa forces the caller to initialize it and still
tests what the test wants to test, namely that we don't consider *p as
valid location for the c variable at line 18 (after it has been overwritten
with *p = 1;).

2021-06-16  Jakub Jelinek  <jakub@redhat.com>

	* gcc.dg/guality/pr49888.c (f): Use noipa attribute instead of
	noinline, noclone.
2021-06-16 13:10:48 +02:00
Jakub Jelinek
b4b50bf286 stor-layout: Create DECL_BIT_FIELD_REPRESENTATIVE even for bitfields in unions [PR101062]
The following testcase is miscompiled on x86_64-linux, the bitfield store
is implemented as a RMW 64-bit operation at d+24 when the d variable has
size of only 28 bytes and scheduling moves in between the R and W part
a store to a different variable that happens to be right after the d
variable.

The reason for this is that we weren't creating
DECL_BIT_FIELD_REPRESENTATIVEs for bitfields in unions.

The following patch does create them, but treats all such bitfields as if
they were in a structure where the particular bitfield is the only field.

2021-06-16  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/101062
	* stor-layout.c (finish_bitfield_representative): For fields in unions
	assume nextf is always NULL.
	(finish_bitfield_layout): Compute bit field representatives also in
	unions, but handle it as if each bitfield was the only field in the
	aggregate.

	* gcc.dg/pr101062.c: New test.
2021-06-16 12:17:55 +02:00
Richard Biener
43fc4234ad tree-optimization/101088 - fix SM invalidation issue
When we face a sm_ord vs sm_unord for the same ref during
store sequence merging we assert that the ref is already marked
unsupported.  But it can be that it will only be marked so
during the ongoing merging, so instead of asserting, mark it here.

Also apply some optimization to not waste resources to search
for already unsupported refs.

2021-06-16  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/101088
	* tree-ssa-loop-im.c (sm_seq_valid_bb): Only look for
	supported refs on edges.  Do not assert same ref but
	different kind stores are unsupported but mark them so.
	(hoist_memory_references): Only look for supported refs
	on exits.

	* gcc.dg/torture/pr101088.c: New testcase.
2021-06-16 11:28:03 +02:00
Roger Sayle
3155d51bfd [PATCH] PR rtl-optimization/46235: Improved use of bt for bit tests on x86_64.
This patch tackles PR46235 to improve the code generated for bit tests
on x86_64 by making more use of the bt instruction.  Currently, GCC emits
bt instructions when followed by conditional jumps (thanks to Uros' splitters).
This patch adds splitters in i386.md, to catch the cases where bt is followed
by a conditional move (as in the original report), or by a setc/setnc (as in
comment 5 of the Bugzilla PR).

With this patch, the function in the original PR
int foo(int a, int x, int y) {
    if (a & (1 << x))
        return a;
    return 1;
}

which with -O2 on mainline generates:
foo:	movl    %edi, %eax
        movl    %esi, %ecx
        sarl    %cl, %eax
        testb   $1, %al
        movl    $1, %eax
        cmovne  %edi, %eax
        ret

now generates:
foo:	btl     %esi, %edi
        movl    $1, %eax
        cmovc   %edi, %eax
        ret

Likewise, IsBitSet1 and IsBitSet2 (from comment 5)
bool IsBitSet1(unsigned char byte, int index) {
    return (byte & (1<<index)) != 0;
}
bool IsBitSet2(unsigned char byte, int index) {
    return (byte >> index) & 1;
}

Before:
        movzbl  %dil, %eax
        movl    %esi, %ecx
        sarl    %cl, %eax
        andl    $1, %eax
        ret

After:
        movzbl  %dil, %edi
        btl     %esi, %edi
        setc    %al
        ret

According to Agner Fog, SAR/SHR r,cl takes 2 cycles on Skylake,
where BT r,r takes only one, so the performance improvements on
recent hardware may be more significant than implied by just
the reduced number of instructions.  I've avoided transforming cases
(such as btsi_setcsi) where using bt sequences may not be a clear
win (over sarq/andl).
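
The two source forms from comment 5 test the same bit, which is why both can map to a single bt.  A small self-check, with forms_agree as a hypothetical helper added for illustration:

```cpp
#include <cassert>

// Mask form: build a one-bit mask and test it.
bool IsBitSet1 (unsigned char byte, int index)
{
  return (byte & (1 << index)) != 0;
}

// Shift form: move the wanted bit to position 0 and test it.
bool IsBitSet2 (unsigned char byte, int index)
{
  return (byte >> index) & 1;
}

// Exhaustively compare both forms over every byte value and bit index.
bool forms_agree ()
{
  for (int byte = 0; byte < 256; ++byte)
    for (int index = 0; index < 8; ++index)
      if (IsBitSet1 (static_cast<unsigned char> (byte), index)
          != IsBitSet2 (static_cast<unsigned char> (byte), index))
        return false;
  return true;
}
```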

2010-06-15  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	PR rtl-optimization/46235
	* config/i386/i386.md: New define_split for bt followed by cmov.
	(*bt<mode>_setcqi): New define_insn_and_split for bt followed by setc.
	(*bt<mode>_setncqi): New define_insn_and_split for bt then setnc.
	(*bt<mode>_setnc<mode>): New define_insn_and_split for bt followed
	by setnc with zero extension.

gcc/testsuite/ChangeLog
	PR rtl-optimization/46235
	* gcc.target/i386/bt-5.c: New test.
	* gcc.target/i386/bt-6.c: New test.
	* gcc.target/i386/bt-7.c: New test.
2021-06-16 09:56:09 +01:00
Jakub Jelinek
041f741770 libffi: Fix up x86_64 classify_argument
As the following testcase shows, libffi's classify_argument didn't
properly handle structures at byte offsets not divisible by
UNITS_PER_WORD.  The following patch adjusts it to match what
config/i386/ classify_argument does for that and also ports the
PR38781 fix there (the second chunk).
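
A worked example of the word count, as a sketch only: the helper names are hypothetical, and it assumes UNITS_PER_WORD is 8 and that byte_offset is the offset of the structure within its first eightbyte.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: 8-byte words (eightbytes) as on x86_64.
constexpr std::size_t units_per_word = 8;

// Word count from the structure size alone.
constexpr std::size_t words_from_size (std::size_t size)
{
  return (size + units_per_word - 1) / units_per_word;
}

// Word count that also accounts for where the structure starts.
constexpr std::size_t words_from_size_and_offset (std::size_t size,
                                                  std::size_t byte_offset)
{
  return (size + byte_offset + units_per_word - 1) / units_per_word;
}
```

An 8-byte structure that starts 4 bytes into an eightbyte straddles two eightbytes, so counting only its size gives 1 word while counting size plus offset gives 2.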

This has been committed to upstream libffi already:
5651bea284

2021-06-16  Jakub Jelinek  <jakub@redhat.com>

	* src/x86/ffi64.c (classify_argument): For FFI_TYPE_STRUCT set words
	to number of words needed for type->size + byte_offset bytes rather
	than just type->size bytes.  Compute pos before the loop and check
	total size of the structure.
	* testsuite/libffi.call/nested_struct12.c: New test.
2021-06-16 10:45:27 +02:00
Piotr Trojanek
ccf0dee109 [Ada] Fix Is_Volatile_Function for functions declared in protected bodies
gcc/ada/

	* sem_util.adb (Is_Volatile_Function): Follow the exact wording
	of SPARK (regarding volatile functions) and Ada (regarding
	protected functions).
2021-06-16 04:43:05 -04:00
Piotr Trojanek
1a9ff8d39c [Ada] Ignore volatile restrictions in preanalysis
gcc/ada/

	* sem_util.adb (Is_OK_Volatile_Context): All references to
	volatile objects are legal in preanalysis.
	(Within_Volatile_Function): Previously it was wrongly called on
	Empty entities; now it is only called on E_Return_Statement,
	which allows the body to be greatly simplified.
2021-06-16 04:43:05 -04:00
Yannick Moy
3feba0a578 [Ada] Do not generate an Itype_Reference node for slices in GNATprove mode
gcc/ada/

	* sem_res.adb (Set_Slice_Subtype): Revert special-case
	introduced previously, which is not needed as Itypes created for
	slices are precisely always used.
2021-06-16 04:43:04 -04:00
Eric Botcazou
f4fe186bfe [Ada] Fix floating-point exponentiation with Integer'First exponent
gcc/ada/

	* urealp.adb (Scale): Change first parameter to Uint and adjust.
	(Equivalent_Decimal_Exponent): Pass U.Den directly to Scale.
	* libgnat/s-exponr.adb (Negative): Rename to...
	(Safe_Negative): ...this and change its lower bound.
	(Exponr): Adjust to above renaming and deal with Integer'First.
2021-06-16 04:43:04 -04:00
Piotr Trojanek
07b7dc09b2 [Ada] Fix detection of volatile expressions in restricted contexts
gcc/ada/

	* sem_res.adb (Flag_Effectively_Volatile_Objects): Detect also
	allocators within restricted contexts and not just entity names.
	(Resolve_Actuals): Remove duplicated code for detecting
	restricted contexts; it is now exclusively done in
	Is_OK_Volatile_Context.
	(Resolve_Entity_Name): Adapt to new parameter of
	Is_OK_Volatile_Context.
	* sem_util.ads, sem_util.adb (Is_OK_Volatile_Context): Adapt to
	handle contexts both inside and outside of subprogram call
	actual parameters.
	(Within_Subprogram_Call): Remove; now handled by
	Is_OK_Volatile_Context itself and its parameter.
2021-06-16 04:43:04 -04:00
Piotr Trojanek
207962b929 [Ada] Cleanup repeated calls in Sloc_Range
gcc/ada/

	* sinput.adb (Sloc_Range): Refactor several repeated calls to
	Sloc and two comparisons with No_Location.
2021-06-16 04:43:04 -04:00
Piotr Trojanek
cc9a7ae229 [Ada] Fix aliasing check for actual parameters passed by reference
gcc/ada/

	* checks.adb (Apply_Scalar_Range_Check): Fix handling of check depending
	on the parameter passing mechanism.  Grammar adjustment ("has"
	=> "have").
	(Parameter_Passing_Mechanism_Specified): Add a hyphen in a comment.
2021-06-16 04:43:03 -04:00
Piotr Trojanek
6dc7a8ab14 [Ada] Remove unused initialization with New_List
gcc/ada/

	* exp_ch3.adb (Build_Slice_Assignment): Remove unused
	initialization.
2021-06-16 04:43:03 -04:00
Piotr Trojanek
e027681d90 [Ada] Fix typos in all occurrences of "occuring" in GNAT
gcc/ada/

	* restrict.adb, sem_attr.adb, types.ads: Fix typos in
	"occuring"; refill comment as necessary.
2021-06-16 04:43:03 -04:00
Piotr Trojanek
7ef1d8e88b [Ada] Adapt Is_Actual_Parameter to also work for entry parameters
gcc/ada/

	* sem_util.ads (Is_Actual_Parameter): Update comment.
	* sem_util.adb (Is_Actual_Parameter): Also detect entry parameters.
2021-06-16 04:43:02 -04:00
Arnaud Charlet
37cd8d97f3 [Ada] Wrong reference to System.Tasking in expanded code
gcc/ada/

	* rtsfind.ads, libgnarl/s-taskin.ads, exp_ch3.adb, exp_ch4.adb,
	exp_ch6.adb, exp_ch9.adb, sem_ch6.adb: Move master related
	entities to the expander directly.
2021-06-16 04:43:02 -04:00
Piotr Trojanek
f7f37ed649 [Ada] Cleanup related to volatile objects in restricted contexts
gcc/ada/

	* sem_res.adb (Is_Assignment_Or_Object_Expression): Whitespace
	cleanup.
	(Is_Attribute_Expression): Prevent AST climbing from going to
	the root of the compilation unit.
2021-06-16 04:43:02 -04:00
Steve Baird
788fed4b39 [Ada] Include info about containers in GNAT RM Implementation Advice section
gcc/ada/

	* doc/gnat_rm/implementation_advice.rst: Add a section for RM
	A.18.
	* gnat_rm.texi: Regenerate.
2021-06-16 04:43:01 -04:00
Justin Squirek
e66167fb49 [Ada] Mixing of positional and named entries allowed in enum rep
gcc/ada/

	* sem_ch13.adb (Analyze_Enumeration_Representation_Clause): Add
	check for the mixing of entries.
2021-06-16 04:43:01 -04:00
Justin Squirek
c5dc00ef38 [Ada] Non-static Interrupt_Priority allowed with restriction Static_Priorities
gcc/ada/

	* sem_ch13.adb (Make_Aitem_Pragma): Check for static expressions
	in Priority aspect arguments for restriction Static_Priorities.
2021-06-16 04:43:01 -04:00
Justin Squirek
f5b4b6bf14 [Ada] Spurious accessibility error on "for of" loop parameter
gcc/ada/

	* sem_util.adb (Accessibility_Level): Take into account
	renamings of loop parameters.
2021-06-16 04:43:00 -04:00
Matthieu Eyraud
7626537ae7 [Ada] Fix ALI source location for dominance markers
gcc/ada/

	* par_sco.adb (Set_Statement_Entry): Change sloc for dominance
	marker.
	(Traverse_One): Fix typo.
	(Output_Header): Fix comment.
2021-06-16 04:43:00 -04:00
Richard Kenner
ff4746bcde [Ada] Don't look for aliases for generic subprograms
gcc/ada/

	* exp_unst.adb (Register_Subprogram): Don't look for aliases for
	subprograms that are generic.  Reorder tests for efficiency.
2021-06-16 04:43:00 -04:00
Eric Botcazou
e505bf515f [Ada] Make Incomplete_Or_Partial_View independent of the context
gcc/ada/

	* sem_util.adb (Incomplete_Or_Partial_View): Retrieve the scope of
	the parameter and use it to find its incomplete view, if any.
2021-06-16 04:43:00 -04:00
Eric Botcazou
5c44cc1c73 [Ada] Do not perform useless work in Check_No_Parts_Violations
gcc/ada/

	* freeze.adb (Check_No_Parts_Violations): Return earlier if the
	type is elementary or does not come from source.
2021-06-16 04:42:59 -04:00