
190836 Commits

Author SHA1 Message Date
GCC Administrator
d9450aa0e8 Daily bump. 2022-01-11 00:16:36 +00:00
Uros Bizjak
04a7455560 i386: Introduce V2QImode vector compares [PR103861]
Add V2QImode vector compares with SSE registers.

2022-01-10  Uroš Bizjak  <ubizjak@gmail.com>

gcc/ChangeLog:

	PR target/103861
	* config/i386/i386-expand.c (ix86_expand_int_sse_cmp):
	Handle V2QImode.
	* config/i386/mmx.md (<sat_plusminus:insn><mode>3):
	Use VI1_16_32 mode iterator.
	(*eq<mode>3): Ditto.
	(*gt<mode>3): Ditto.
	(*xop_maskcmp<mode>3): Ditto.
	(*xop_maskcmp_uns<mode>3): Ditto.
	(vec_cmp<mode><mode>): Ditto.
	(vec_cmpu<mode><mode>): Ditto.

gcc/testsuite/ChangeLog:

	PR target/103861
	* gcc.target/i386/pr103861-2.c: New test.
2022-01-10 21:00:07 +01:00
Patrick Palka
ab36b554bd c++: constexpr base-to-derived conversion with offset 0 [PR103879]
r12-136 made us canonicalize an object/offset pair with negative offset
into one with a nonnegative offset, by iteratively absorbing the
innermost component into the offset and stopping as soon as the offset
becomes nonnegative.

This patch strengthens this transformation by making it keep on absorbing
even if the offset is already 0 as long as the innermost component is at
position 0 (and thus absorbing doesn't change the offset).  This lets us
accept the two constexpr testcases below, which we'd previously reject
essentially because cxx_fold_indirect_ref would be unable to resolve
*(B*)&b.D123 (where D123 is the base A subobject at position 0) to just b.

	PR c++/103879

gcc/cp/ChangeLog:

	* constexpr.c (cxx_fold_indirect_ref): Split out object/offset
	canonicalization step into a local lambda.  Strengthen it to
	absorb more components at position 0.  Use it before both calls
	to cxx_fold_indirect_ref_1.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp1y/constexpr-base2.C: New test.
	* g++.dg/cpp1y/constexpr-base2a.C: New test.
2022-01-10 14:57:54 -05:00
Patrick Palka
3e95a974c3 c++: "more constrained" vs staticness of memfn [PR103783]
Here we're rejecting the calls to g1 and g2 as ambiguous even though one
overload is more constrained than the other (and they're otherwise tied),
because the implicit 'this' parameter of the non-static overload causes
cand_parms_match to think the function parameter lists aren't equivalent.

This patch fixes this by making cand_parms_match skip over 'this'
appropriately.  Note that this bug only affects partial ordering of
non-template member functions because for member function templates
more_specialized_fn seems to already skip over 'this' appropriately.

	PR c++/103783

gcc/cp/ChangeLog:

	* call.c (cand_parms_match): Skip over 'this' when given one
	static and one non-static member function.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/concepts-memfun2.C: New test.
2022-01-10 14:57:51 -05:00
Jakub Jelinek
54fa7daefe c++: Ensure some more that immediate functions aren't gimplified [PR103912]
Immediate functions should never be emitted into assembly; the FE doesn't
genericize them and does various things to ensure they aren't gimplified.
But the following testcase ICEs anyway due to that, because the consteval
function returns a lambda, and operator() of the lambda has
decl_function_context of the consteval function.  cgraphunit.c then
does:
              /* Preserve a functions function context node.  It will
                 later be needed to output debug info.  */
              if (tree fn = decl_function_context (decl))
                {
                  cgraph_node *origin_node = cgraph_node::get_create (fn);
                  enqueue_node (origin_node);
                }
which enqueues the immediate function and then tries to gimplify it,
which results in ICE because it hasn't been genericized.

When I try a similar testcase with constexpr instead of consteval and
static constinit auto instead of auto in main, what happens is that
the functions are gimplified, later ipa.c discovers they aren't reachable
and sets body_removed to true for them (and clears other flags) and we end
up with debug info which has the foo and bar functions without
DW_AT_low_pc and other code specific attributes, just stuff from its BLOCK
structure and in there the lambda with DW_AT_low_pc etc.

The following patch attempts to emulate that behavior early, so that cgraph
doesn't try to gimplify those and pretends they were already gimplified
and found unused and optimized away.

2022-01-10  Jakub Jelinek  <jakub@redhat.com>

	PR c++/103912
	* semantics.c (expand_or_defer_fn): For immediate functions, set
	node->body_removed to true and clear analyzed, definition and
	force_output.
	* decl2.c (c_parse_final_cleanups): Ignore immediate functions for
	expand_or_defer_fn.

	* g++.dg/cpp2a/consteval26.C: New test.
2022-01-10 20:49:11 +01:00
Uros Bizjak
de0faa56a1 tree-optimization/103948 - detect vector vec_cmp in expand_vector_condition
Currently, expand_vector_condition detects only vcondMN and vconduMN
named RTX patterns.  Teach it to also consider vec_cmpMN and vec_cmpuMN
RTX patterns when all ones vector is returned for true and all zeros vector
is returned for false.
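
The condition above relies on the convention that a vector comparison yields an all-ones lane for true and an all-zeros lane for false. The following is only a scalar sketch of that equivalence; `V4`, `cmp_gt_mask` and `select_gt` are hypothetical names for illustration and are not GCC internals.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of a 4-lane integer vector (not a GCC type).
using V4 = std::array<std::int32_t, 4>;

// Model of a vec_cmp-style "greater than": each lane becomes all ones (-1)
// when the comparison holds and all zeros (0) otherwise.
V4 cmp_gt_mask(const V4 &a, const V4 &b) {
  V4 m{};
  for (std::size_t i = 0; i < m.size(); ++i)
    m[i] = a[i] > b[i] ? -1 : 0;
  return m;
}

// Model of a vector condition "a > b ? t : f", evaluated lane by lane.
V4 select_gt(const V4 &a, const V4 &b, const V4 &t, const V4 &f) {
  V4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = a[i] > b[i] ? t[i] : f[i];
  return r;
}
```

When `t` is all ones and `f` is all zeros, `select_gt` returns exactly what `cmp_gt_mask` returns, which is why a plain compare pattern is enough in that case.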

2022-01-10  Richard Biener  <rguenther@suse.de>

gcc/ChangeLog:

	PR tree-optimization/103948
	* tree-vect-generic.c (expand_vector_condition): Return true if
	all ones vector is returned for true, all zeros vector for false
	and the target defines corresponding vec_cmp{,u}MN named RTX pattern.
2022-01-10 20:40:22 +01:00
Paul A. Clarke
c173d880d6 rs6000: Add Power10 optimization for _mm_blendv*
Power10 ISA added `xxblendv*` instructions which are realized in the
`vec_blendv` intrinsic.

Use `vec_blendv` for `_mm_blendv_epi8`, `_mm_blendv_ps`, and
`_mm_blendv_pd` compatibility intrinsics, when `_ARCH_PWR10`.

Update original implementation of _mm_blendv_epi8 to use signed types,
to better match the function parameters. Realization is unchanged.

Also, copy a test from i386 for testing `_mm_blendv_ps`.
This should have come with commit ed04cf6d73,
but was inadvertently omitted.
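
For reference, the byte blend that `_mm_blendv_epi8` performs can be modelled in scalar code as below. This is a sketch of the selection semantics only (take the second operand where the mask byte has its top bit set); `Bytes16` and `blendv_epi8_model` are hypothetical names and this is not the code in smmintrin.h.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a 128-bit vector of 16 bytes.
using Bytes16 = std::array<std::uint8_t, 16>;

// Scalar reference model: result byte i is b[i] when bit 7 of mask[i]
// is set, and a[i] otherwise.
Bytes16 blendv_epi8_model(const Bytes16 &a, const Bytes16 &b,
                          const Bytes16 &mask) {
  Bytes16 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = (mask[i] & 0x80u) ? b[i] : a[i];
  return r;
}
```

Only the most significant bit of each mask byte matters, so a mask byte of 0x7f still selects from `a`.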

2022-01-10  Paul A. Clarke  <pc@us.ibm.com>

gcc
	* config/rs6000/smmintrin.h (_mm_blendv_epi8): Use vec_blendv
	when _ARCH_PWR10. Use signed types.
	(_mm_blendv_ps): Use vec_blendv when _ARCH_PWR10.
	(_mm_blendv_pd): Likewise.

gcc/testsuite
	* gcc.target/powerpc/sse4_1-blendvps.c: Copy from gcc.target/i386,
	adjust dg directives to suit.
2022-01-10 12:16:33 -06:00
Andre Vieira
d3ff7420e9 [vect] Re-analyze all modes for epilogues
gcc/ChangeLog:

	* tree-vectorizer.c (better_epilogue_loop_than_p): Round factors up for
	epilogue costing.
	* tree-vect-loop.c (vect_analyze_loop): Re-analyze all modes for
	epilogues, unless we are guaranteed that we can't have partial vectors.
	* genopinit.c (partial_vectors_supported): Generate new function.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/masked_epilogue.c: New test.
2022-01-10 17:54:33 +00:00
Paul Thomas
828474fafd Fortran: Pass unlimited polymorphic argument to assumed type [PR103366].
2022-01-10  Paul Thomas  <pault@gcc.gnu.org>

gcc/fortran
	PR fortran/103366
	* trans-expr.c (gfc_conv_gfc_desc_to_cfi_desc): Allow unlimited
	polymorphic actual argument passed to assumed type formal.

gcc/testsuite/
	PR fortran/103366
	* gfortran.dg/pr103366.f90: New test.
2022-01-10 16:54:53 +00:00
Jakub Jelinek
3159da6c46 x86_64: Ignore zero width bitfields in ABI and issue -Wpsabi warning about C zero width bitfield ABI changes [PR102024]
For zero-width bitfields current GCC classify_argument does:
                  if (DECL_BIT_FIELD (field))
                    {
                      for (i = (int_bit_position (field)
                                + (bit_offset % 64)) / 8 / 8;
                           i < ((int_bit_position (field) + (bit_offset % 64))
                                + tree_to_shwi (DECL_SIZE (field))
                                + 63) / 8 / 8; i++)
                        classes[i]
                          = merge_classes (X86_64_INTEGER_CLASS, classes[i]);
                    }
which I think means that if the zero-width bitfield is at a bit position
(in the toplevel aggregate) that is a multiple of 64 bits, this doesn't do
anything, because (int_bit_position (field) + (bit_offset % 64)) / 64 and
(int_bit_position (field) + (bit_offset % 64) + 63) / 64 are equal.
But for zero-width bitfields at other bit positions it will call
merge_classes once.  Now, the typical case is that the zero-width bitfield
is surrounded by some bitfields and in that case, it doesn't change
anything, but it can be sandwiched in between floats too as the testcases
show.
In C we had this behavior, in C++ previously the FE was removing the
zero-width bitfields and therefore they were ignored.
LLVM and ICC seem to ignore those bitfields both in C and C++ (== passing
struct S in SSE register rather than in GPR).
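
The loop bounds quoted above can be checked with a few lines of arithmetic. The sketch below only mirrors the index computation from the quoted loop; `merge_calls` is a hypothetical helper, and its parameters stand in for int_bit_position (field), bit_offset and DECL_SIZE (field).

```cpp
#include <cassert>

// Number of iterations of the quoted loop, i.e. how many times
// merge_classes would be called for one bitfield.
//   bit_pos    - stands in for int_bit_position (field)
//   bit_offset - stands in for bit_offset of the enclosing aggregate
//   size_bits  - stands in for DECL_SIZE (field); 0 for zero-width
int merge_calls(long bit_pos, long bit_offset, long size_bits) {
  assert(bit_pos >= 0 && bit_offset >= 0 && size_bits >= 0);
  const long start = bit_pos + bit_offset % 64;
  const long first = start / 8 / 8;                    // first eightbyte
  const long last = (start + size_bits + 63) / 8 / 8;  // exclusive bound
  return static_cast<int>(last - first);
}
```

With `size_bits == 0`, a position of 64 gives zero calls, while a position of 32 gives one call, which matches the description above.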

The x86-64 psABI has been recently clarified by
1aa4398d26
that zero-width bitfields should always be ignored.

This patch implements that and emits a warning for C for cases where the ABI
changed from GCC 11.

2022-01-10  Jakub Jelinek  <jakub@redhat.com>

	PR target/102024
	* config/i386/i386.c (classify_argument): Add zero_width_bitfields
	argument, when seeing DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD bitfields,
	always ignore them, when seeing other zero sized bitfields, either
	set zero_width_bitfields to 1 and ignore it or if equal to 2 process
	it.  Pass it to recursive calls.  Add wrapper
	with old arguments and diagnose ABI differences for C structures
	with zero width bitfields.  Formatting fixes.

	* gcc.target/i386/pr102024.c: New test.
	* g++.target/i386/pr102024.C: New test.
2022-01-10 17:43:23 +01:00
Martin Liska
b6eac7c4fb Partially sort MAINTAINERS.
ChangeLog:

	* MAINTAINERS: Fix obvious issues with sorting.
2022-01-10 17:10:52 +01:00
Richard Sandiford
037cc0b4a6 ira: Handle "soft" conflicts between cap and non-cap allocnos
This patch looks for allocno conflicts of the following form:

- One allocno (X) is a cap allocno for some non-cap allocno X2.
- X2 belongs to some loop L2.
- The other allocno (Y) is a non-cap allocno.
- Y is an ancestor of some allocno Y2 in L2.
- Y2 is not referenced in L2 (that is, ALLOCNO_NREFS (Y2) == 0).
- Y can use a different allocation from Y2.

In this case, Y's register is live across L2 but is not used within it,
whereas X's register is used only within L2.  The conflict is therefore
only "soft", in that it can easily be avoided by spilling Y2 inside L2
without affecting any insn references.

In principle we could do this for ALLOCNO_NREFS (Y2) != 0 too, with the
callers then taking Y2's ALLOCNO_MEMORY_COST into account.  There would
then be no "cliff edge" between a Y2 that has no references and a Y2 that
has (say) a single cold reference.

However, doing that isn't necessary for the PR and seems to give
variable results in practice.  (fotonik3d_r improves slightly but
namd_r regresses slightly.)  It therefore seemed better to start
with the higher-value zero-reference case and see how things go.

On top of the previous patches in the series, this fixes the exchange2
regression seen in GCC 11.

gcc/
	PR rtl-optimization/98782
	* ira-int.h (ira_soft_conflict): Declare.
	* ira-color.c (max_soft_conflict_loop_depth): New constant.
	(ira_soft_conflict): New function.
	(spill_soft_conflicts): Likewise.
	(assign_hard_reg): Use them to handle the case described by
	the comment above ira_soft_conflict.
	(improve_allocation): Likewise.
	* ira.c (check_allocation): Allow allocnos with "soft" conflicts
	to share the same register.

gcc/testsuite/
	* gcc.target/aarch64/reg-alloc-4.c: New test.
2022-01-10 14:47:09 +00:00
Richard Sandiford
01f3e6a40e ira: Consider modelling caller-save allocations as loop spills
If an allocno A in an inner loop L spans a call, a parent allocno AP
can choose to handle a call-clobbered/caller-saved hard register R
in one of two ways:

(1) save R before each call in L and restore R after each call
(2) spill R to memory throughout L

(2) can be cheaper than (1) in some cases, particularly if L does
not reference A.

Before the patch we always did (1).  The patch adds support for
picking (2) instead, when it seems cheaper.  It builds on the
earlier support for not propagating conflicts to parent allocnos.

gcc/
	PR rtl-optimization/98782
	* ira-int.h (ira_caller_save_cost): New function.
	(ira_caller_save_loop_spill_p): Likewise.
	* ira-build.c (ira_propagate_hard_reg_costs): Test whether it is
	cheaper to spill a call-clobbered register throughout a loop rather
	than spill it around each individual call.  If so, treat all
	call-clobbered registers as conflicts and...
	(propagate_allocno_info): ...do not propagate call information
	from the child to the parent.
	* ira-color.c (move_spill_restore): Update accordingly.
	* ira-costs.c (ira_tune_allocno_costs): Use ira_caller_save_cost.

gcc/testsuite/
	* gcc.target/aarch64/reg-alloc-3.c: New test.
2022-01-10 14:47:08 +00:00
Richard Sandiford
8e7a23728f ira: Try to avoid propagating conflicts
Suppose that:

- an inner loop L contains an allocno A
- L clobbers hard register R while A is live
- A's parent allocno is AP

Previously, propagate_allocno_info would propagate conflict sets up the
loop tree, so that the conflict between A and R would become a conflict
between AP and R (and so on for ancestors of AP).

However, when IRA treats loops as separate allocation regions, it can
decide on a loop-by-loop basis whether to allocate a register or spill
to memory.  Conflicts in inner loops therefore don't need to become
hard conflicts in parent loops.  Instead we can record that using the
“conflicting” registers for the parent allocnos has a higher cost.
In the example above, this higher cost is the sum of:

- the cost of saving R on entry to L
- the cost of keeping the pseudo register in memory throughout L
- the cost of reloading R on exit from L

This value is also a cap on the hard register cost that A can contribute
to AP in general (not just for conflicts).  Whatever allocation we pick
for AP, there is always the option of spilling that register to memory
throughout L, so the cost to A of allocating a register to AP can't be
more than the cost of spilling A.

To take an extreme example: if allocating a register R2 to A is more
expensive than spilling A to memory, ALLOCNO_HARD_REG_COSTS (A)[R2]
could be (say) 2 times greater than ALLOCNO_MEMORY_COST (A) or 100
times greater than ALLOCNO_MEMORY_COST (A).  But this scale factor
doesn't matter to AP.  All that matters is that R2 is more expensive
than memory for A, so that allocating R2 to AP should be costed as
spilling A to memory (again assuming that A and AP are in different
allocation regions).  Propagating a factor of 100 would distort the
register costs for AP.
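
The cap described above can be sketched as a simple minimum. This is only an illustration under assumed inputs: `LoopSpillCosts`, `spill_cost` and `propagated_cost` are hypothetical names, and the real cost bookkeeping in IRA is more involved.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical inputs for one child allocno A in loop L.
struct LoopSpillCosts {
  int save_on_entry;   // cost of saving R on entry to L
  int memory_in_loop;  // cost of keeping the pseudo in memory throughout L
  int reload_on_exit;  // cost of reloading R on exit from L
};

// Total cost of handling A by spilling throughout L.
int spill_cost(const LoopSpillCosts &c) {
  return c.save_on_entry + c.memory_in_loop + c.reload_on_exit;
}

// Cost that A contributes to its parent AP for one hard register:
// never more than the cost of spilling throughout L.
int propagated_cost(int child_hard_reg_cost, const LoopSpillCosts &c) {
  return std::min(child_hard_reg_cost, spill_cost(c));
}
```

With a spill cost of 10, a child cost of 20 and a child cost of 1000 both propagate as 10, so the scale factor no longer distorts the parent's costs.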

move_spill_restore tries to undo the propagation done by
propagate_allocno_info, so we need some extra processing there.

gcc/
	PR rtl-optimization/98782
	* ira-int.h (ira_allocno::might_conflict_with_parent_p): New field.
	(ALLOCNO_MIGHT_CONFLICT_WITH_PARENT_P): New macro.
	(ira_single_region_allocno_p): New function.
	(ira_total_conflict_hard_regs): Likewise.
	* ira-build.c (ira_create_allocno): Initialize
	ALLOCNO_MIGHT_CONFLICT_WITH_PARENT_P.
	(ira_propagate_hard_reg_costs): New function.
	(propagate_allocno_info): Use it.  Try to avoid propagating
	hard register conflicts to parent allocnos if we can handle
	the conflicts by spilling instead.  Limit the propagated
	register costs to the cost of spilling throughout the child loop.
	* ira-color.c (color_pass): Use ira_single_region_allocno_p to
	test whether a child and parent allocno can share the same
	register.
	(move_spill_restore): Adjust for the new behavior of
	propagate_allocno_info.

gcc/testsuite/
	* gcc.target/aarch64/reg-alloc-2.c: New test.
2022-01-10 14:47:08 +00:00
Richard Sandiford
d54565d87f ira: Add ira_subloop_allocnos_can_differ_p
color_pass has two instances of the same code for propagating non-cap
assignments from parent loops to subloops.  This patch adds a helper
function for testing when such propagations are required for correctness
and uses it to remove the duplicated code.

A later patch will use this in ira-build.c too, which is why the
function is exported to ira-int.h.

No functional change intended.

gcc/
	PR rtl-optimization/98782
	* ira-int.h (ira_subloop_allocnos_can_differ_p): New function,
	extracted from...
	* ira-color.c (color_pass): ...here.
2022-01-10 14:47:07 +00:00
Richard Sandiford
909a4b4764 ira: Add comments and fix move_spill_restore calculation
This patch adds comments to describe each use of ira_loop_border_costs.
I think this highlights that move_spill_restore was using the wrong cost
in one case, which came from transposing [0] and [1] in the original
(pre-ira_loop_border_costs) ira_memory_move_cost expressions.  The
difference would only be noticeable on targets that distinguish between
load and store costs.

gcc/
	PR rtl-optimization/98782
	* ira-color.c (color_pass): Add comments to describe the spill costs.
	(move_spill_restore): Likewise.  Fix reversed calculation.
2022-01-10 14:47:07 +00:00
Richard Sandiford
bf37fd35a3 ira: Add a ira_loop_border_costs class
The final index into (ira_)memory_move_cost is 1 for loads and
0 for stores.  Thus the combination:

  entry_freq * memory_cost[1] + exit_freq * memory_cost[0]

is the cost of loading a register on entry to a loop and
storing it back on exit from the loop.  This is the cost to
use if the register is successfully allocated within the
loop but is spilled in the parent loop.  Similarly:

  entry_freq * memory_cost[0] + exit_freq * memory_cost[1]

is the cost of storing a register on entry to the loop and
restoring it on exit from the loop.  This is the cost to
use if the register is spilled within the loop but is
successfully allocated in the parent loop.
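
The two combinations above can be written out as a small helper. The struct below is a simplified, hypothetical stand-in for illustration (plain integer costs and frequencies), not the ira_loop_border_costs class that the patch adds.

```cpp
#include <cassert>

// Hypothetical, simplified model of the two loop-border costs.
struct LoopBorderCosts {
  int entry_freq;  // frequency of entering the loop
  int exit_freq;   // frequency of leaving the loop
  int store_cost;  // corresponds to memory_cost[0]
  int load_cost;   // corresponds to memory_cost[1]

  // Register is allocated inside the loop but spilled in the parent:
  // load on entry, store back on exit.
  int spill_outside_loop_cost() const {
    return entry_freq * load_cost + exit_freq * store_cost;
  }

  // Register is spilled inside the loop but allocated in the parent:
  // store on entry, restore on exit.
  int spill_inside_loop_cost() const {
    return entry_freq * store_cost + exit_freq * load_cost;
  }
};
```

The two values differ only when loads and stores have different costs, or when entry and exit frequencies differ.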

The patch adds a helper class for calculating these values and
mechanically replaces the existing instances.  There is no attempt to
editorialise the choice between using “spill inside” and “spill outside”
costs.  (I think one of them is the wrong way round, but a later patch
deals with that.)

No functional change intended.

gcc/
	PR rtl-optimization/98782
	* ira-int.h (ira_loop_border_costs): New class.
	* ira-color.c (ira_loop_border_costs::ira_loop_border_costs):
	New constructor.
	(calculate_allocno_spill_cost): Use ira_loop_border_costs.
	(color_pass): Likewise.
	(move_spill_restore): Likewise.
2022-01-10 14:47:07 +00:00
Jakub Jelinek
a8d3c98746 libstdc++: Add %j, %U, %w, %W time_get support, fix %y, %Y, %C, %p [PR77760]
glibc strptime passes around some state, what fields in struct tm have been
set and what needs to be finalized through possibly recursive calls, and
at the end performs various finalizations, like applying %p so that it
works for both %I %p and %p %I orders, or applying century so that both
%C %y and %y %C work, or computation of missing fields from others
(e.g. from %Y and %j one can compute tm_mon, tm_mday and tm_wday,
from %Y %U %w, %Y %W %w, %Y %U %a, or %Y %W %w one can compute
tm_mon, tm_mday, tm_yday or e.g. from %Y %m %d one can compute tm_wday
and tm_yday).
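
A rough sketch of the kind of finalization meant here, for two of the cases: applying %p to a %I hour once both are known, and deriving tm_yday and tm_wday from %Y %m %d. The function names are hypothetical and this is not the libstdc++ code; it assumes the proleptic Gregorian calendar and positive years.

```cpp
#include <cassert>

bool is_leap_year(int year) {
  return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

// Combine a 12-hour clock value (1..12) with AM/PM into tm_hour (0..23).
// Because this runs after parsing, the order of %I and %p doesn't matter.
int to_24_hour(int hour12, bool pm) {
  assert(hour12 >= 1 && hour12 <= 12);
  return hour12 % 12 + (pm ? 12 : 0);
}

// tm_yday (0-based) from a full year, tm_mon (0..11) and tm_mday (1..31).
int year_day(int year, int mon, int mday) {
  static const int days_before[12] = {0,   31,  59,  90,  120, 151,
                                      181, 212, 243, 273, 304, 334};
  assert(mon >= 0 && mon < 12);
  return days_before[mon] + (mon > 1 && is_leap_year(year) ? 1 : 0) + mday - 1;
}

// tm_wday (0 = Sunday) using Sakamoto's method; mon is 0-based.
int week_day(int year, int mon, int mday) {
  static const int t[12] = {0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4};
  assert(mon >= 0 && mon < 12);
  const int y = year - (mon < 2 ? 1 : 0);  // Jan/Feb count with previous year
  return (y + y / 4 - y / 100 + y / 400 + t[mon] + mday) % 7;
}
```

For example, 2022-01-10 comes out as tm_yday 9 and tm_wday 1 (Monday).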

As the finalization is quite large and doesn't need to be a template
(doesn't depend on any iterators or char types), I've put it into libstdc++,
and left some padding in the state struct, so that perhaps in the future we
can track some more state without changing ABI.

Unfortunately, there is an ugly problem that the standard mandates that
the get method calls the do_get virtual method and I don't see how we can
carry any state in between those calls (even if we did an ABI change
for the facets, the methods are const, so that I think multiple threads
could use the same time_get objects and we couldn't store state in there).

There is a hack for that for GCC (it seems to work with ICC too, but
doesn't work with clang++): if the do_get method isn't overridden, we can
pass the state around.

For both do_get_year and per IRC discussions also for %y, the behavior is
if 1-2 digits are parsed, the year is treated according to POSIX 2008 %y
rules (0-68 is 2000-2068, 69-99 is 1969-1999), if 3-4 digits are parsed,
it is treated as %Y.
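
The year rule above fits in a few lines. `interpret_year` is a hypothetical helper for illustration; it returns the calendar year, whereas struct tm stores the year minus 1900.

```cpp
#include <cassert>

// value  - the number that was parsed
// digits - how many digits were consumed (1..4)
// Returns the calendar year according to the rule described above.
int interpret_year(int value, int digits) {
  assert(digits >= 1 && digits <= 4 && value >= 0);
  if (digits <= 2)
    return value <= 68 ? 2000 + value : 1900 + value;  // POSIX 2008 %y rule
  return value;                                        // treated as %Y
}
```

So a parsed "1" becomes 2001, "69" becomes 1969, and "197" stays year 197.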

2022-01-10  Jakub Jelinek  <jakub@redhat.com>

	PR libstdc++/77760
	* include/bits/locale_facets_nonio.h (__time_get_state): New struct.
	(time_get::_M_extract_via_format): Declare new method with
	__time_get_state& as an extra argument.
	* include/bits/locale_facets_nonio.tcc (_M_extract_via_format): Add
	__state argument, set various fields in it while parsing.  Handle %j,
	%U, %w and %W, fix up handling of %y, %Y and %C, don't adjust tm_hour
	for %p immediately.  Add a wrapper around the method without the
	__state argument for backwards compatibility.
	(_M_extract_num): Remove all __len == 4 special cases.
	(time_get::do_get_time, time_get::do_get_date, time_get::do_get): Zero
	initialize __state, pass it to _M_extract_via_format and finalize it
	at the end.
	(do_get_year): For 1-2 digit parsed years, map 0-68 to 2000-2068,
	69-99 to 1969-1999.  For 3-4 digit parsed years use that as year.
	(get): If do_get isn't overloaded from the locale_facets_nonio.tcc
	version, don't call do_get but call _M_extract_via_format instead to
	pass around state.
	* config/abi/pre/gnu.ver (GLIBCXX_3.4.30): Export _M_extract_via_format
	with extra __time_get_state and __time_get_state::_M_finalize_state.
	* src/c++98/locale_facets.cc (is_leap, day_of_the_week,
	day_of_the_year): New functions in anon namespace.
	(mon_yday): New var in anon namespace.
	(__time_get_state::_M_finalize_state): Define.
	* testsuite/22_locale/time_get/get/char/4.cc: New test.
	* testsuite/22_locale/time_get/get/wchar_t/4.cc: New test.
	* testsuite/22_locale/time_get/get_year/char/1.cc (test01): Parse 197
	as year 197AD instead of error.
	* testsuite/22_locale/time_get/get_year/char/5.cc (test01): Parse 1 as
	year 2001 instead of error.
	* testsuite/22_locale/time_get/get_year/char/6.cc: New test.
	* testsuite/22_locale/time_get/get_year/wchar_t/1.cc (test01): Parse
	197 as year 197AD instead of error.
	* testsuite/22_locale/time_get/get_year/wchar_t/5.cc (test01): Parse
	1 as year 2001 instead of error.
	* testsuite/22_locale/time_get/get_year/wchar_t/6.cc: New test.
2022-01-10 15:38:47 +01:00
Jonathan Wakely
68c2e9e923 libstdc++: Fix and simplify freestanding configuration [PR103866]
This fixes the --disable-hosted-libstdcxx build so that it works with
--without-headers. Currently you need to also use --with-newlib, which
is confusing for users who aren't actually using newlib.

The AM_PROG_LIBTOOL checks are currently skipped for --with-newlib and
--with-avrlibc builds; with this change they are also skipped when using
--without-headers.  It would be nice if using --disable-hosted-libstdcxx
automatically skipped those checks, but GLIBCXX_ENABLE_HOSTED comes too
late to make the AM_PROG_LIBTOOL checks depend on $is_hosted.

The checks for EOF, SEEK_CUR etc. cause the build to fail if there is no
<stdio.h> available.  Unlike most headers, which get a HAVE_FOO_H macro,
<stdio.h> is in autoconf's default includes, so every check tries to
include it unconditionally. This change skips those checks for
freestanding builds.

Similarly, the checks for <stdint.h> types done by GCC_HEADER_STDINT try
to include <stdio.h> and fail for --without-headers builds. This change
skips the use of GCC_HEADER_STDINT for freestanding. We can probably
stop using GCC_HEADER_STDINT entirely, since only one file uses the
gstdint.h header that is generated, and that could easily be changed to
use <stdint.h> instead. That can wait for stage 1.

We also need to skip the GLIBCXX_CROSSCONFIG stage if --without-headers
was used, since we don't have any of the functions it deals with.

The end result of the changes above is that it should not be necessary
for a --disable-hosted-libstdcxx --without-headers build to also use
--with-newlib.

Finally, compile libsupc++ with -ffreestanding when --without-headers is
used, so that <stdint.h> will use <gcc-stdint.h> instead of expecting it
to come from libc.

libstdc++-v3/ChangeLog:

	PR libstdc++/103866
	* acinclude.m4 (GLIBCXX_COMPUTE_STDIO_INTEGER_CONSTANTS): Do
	nothing for freestanding builds.
	(GLIBCXX_ENABLE_HOSTED): Define FREESTANDING_FLAGS.
	* configure.ac: Do not use AC_LIBTOOL_DLOPEN when configured
	with --without-headers.  Do not use GCC_HEADER_STDINT for
	freestanding builds.
	* libsupc++/Makefile.am (HOSTED_CXXFLAGS): Use -ffreestanding
	for freestanding builds.
	* configure: Regenerate.
	* Makefile.in: Regenerate.
	* doc/Makefile.in: Regenerate.
	* include/Makefile.in: Regenerate.
	* libsupc++/Makefile.in: Regenerate.
	* po/Makefile.in: Regenerate.
	* python/Makefile.in: Regenerate.
	* src/Makefile.in: Regenerate.
	* src/c++11/Makefile.in: Regenerate.
	* src/c++17/Makefile.in: Regenerate.
	* src/c++20/Makefile.in: Regenerate.
	* src/c++98/Makefile.in: Regenerate.
	* src/filesystem/Makefile.in: Regenerate.
	* testsuite/Makefile.in: Regenerate.
2022-01-10 12:18:14 +00:00
Jonathan Wakely
e54dda45f9 libstdc++: Add dg-timeout-factor to some more regex tests
I'm seeing these fail with tool_timeout=30 on a busy machine.

libstdc++-v3/ChangeLog:

	* testsuite/28_regex/algorithms/regex_replace/char/103664.cc:
	Add dg-timeout-factor directive.
	* testsuite/28_regex/basic_regex/84110.cc: Likewise.
	* testsuite/28_regex/basic_regex/ctors/char/other.cc: Likewise.
	* testsuite/28_regex/match_results/102667.cc: Likewise.
2022-01-10 12:18:14 +00:00
Jonathan Wakely
e1b8a91e47 libstdc++: Update default -std option in manual
libstdc++-v3/ChangeLog:

	* doc/xml/manual/using.xml: Update documentation around default
	-std option.
	* doc/html/*: Regenerate.
2022-01-10 12:18:14 +00:00
Jonathan Wakely
4fde88e5dd libstdc++: Add -nostdinc++ for c++17 sources [PR100017]
When building a build!=host compiler, the just-built gcc can't be used
to build the target libstdc++ (because it is built for the host triplet,
not the build triplet). The top-level configure.ac sets up the build
flags for libstdc++ (and other "raw_cxx" libs) like this:

GCC_TARGET_TOOL(c++ for libstdc++, RAW_CXX_FOR_TARGET, CXX,
		[gcc/xgcc -shared-libgcc -B$$r/$(HOST_SUBDIR)/gcc -nostdinc++ -L$$r/$(TARGET_SUBDIR)/libstdc++-v3/src -L$$r/$(TARGET_SUBDIR)/libstdc++-v3/src/.libs -L$$r/$(TARGET_SUBDIR)/libstdc++-v3/libsupc++/.libs],
		c++)

The -nostdinc++ flag is only used for the IN-TREE-TOOL, i.e. when using
the just-built gcc/xgcc compiler. This means that the cross-compiler
used to build libstdc++ will add its own libstdc++ headers to the
include path. That results in the #include <cfenv> in
src/c++17/floating_to_chars.cc and src/c++17/floating_from_chars.cc
doing #include_next <fenv.h> and finding the libstdc++ fenv.h wrapper
from the host compiler. Because that has the same include guard as the
<fenv.h> in the libstdc++ we're trying to build, we never reach the
underlying <fenv.h> from libc. That results in several errors of the
form:

error: 'fenv_t' has not been declared in '::'

The most correct fix would be to add -nostdinc++ to the
RAW_CXX_FOR_TARGET variable in configure.ac, or the
RAW_CXX_TARGET_EXPORTS variable in Makefile.tpl.

Another solution would be to make the libstdc++ <fenv.h> wrapper use
_GLIBCXX_INCLUDE_NEXT_C_HEADERS like our <stdlib.h> and other C header
wrappers.

For now though, the simplest and safest solution is to just add
-nostdinc++ to the CXXFLAGS used for src/c++17/*.cc, which is what this
does.

libstdc++-v3/ChangeLog:

	PR libstdc++/100017
	* src/c++17/Makefile.am (AM_CXXFLAGS): Add -nostdinc++.
	* src/c++17/Makefile.in: Regenerate.
2022-01-10 12:18:13 +00:00
Eric Botcazou
8234b0dcb2 Properly enable -freorder-blocks-and-partition on 64-bit Windows
The PR uncovered that -freorder-blocks-and-partition was working by accident
on 64-bit Windows, i.e. the middle-end was supposed to disable it with SEH.
After the change installed on mainline, the middle-end properly disables it,
which is too bad since a significant amount of work went into it for SEH.

gcc/
	PR target/103465
	* coretypes.h (unwind_info_type): Swap UI_SEH and UI_TARGET.
2022-01-10 12:44:28 +01:00
Francois-Xavier Coudert
492954263e Fortran: Allow IEEE_CLASS to identify signaling NaNs
We use the issignaling macro, present in some libc's (notably glibc),
when it is available. Compile all IEEE-related files in the library
(both C and Fortran sources) with -fsignaling-nans to ensure maximum
compatibility.
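
For readers unfamiliar with the distinction, the sketch below classifies a NaN from its bit pattern. It is not what the library does (the patch uses the libc issignaling macro); it is a bit-level illustration that assumes IEEE 754 binary64 doubles and the common convention that a set top fraction bit means "quiet", which does not hold on every target. All names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<double>::is_iec559 && sizeof(double) == 8,
              "sketch assumes IEEE 754 binary64 doubles");

enum class NanKind { NotNan, Quiet, Signaling };

// Copy the object representation of a double into an integer.
std::uint64_t bits_of(double x) {
  std::uint64_t bits;
  std::memcpy(&bits, &x, sizeof bits);
  return bits;
}

// Classify a binary64 bit pattern. Assumes the quiet bit is the most
// significant fraction bit (bit 51) and that 1 means quiet.
NanKind classify_nan_bits(std::uint64_t bits) {
  const std::uint64_t exponent = (bits >> 52) & 0x7FFu;
  const std::uint64_t fraction = bits & 0x000FFFFFFFFFFFFFull;
  if (exponent != 0x7FFu || fraction == 0)
    return NanKind::NotNan;  // finite value or infinity
  const bool quiet_bit = ((fraction >> 51) & 1u) != 0;
  return quiet_bit ? NanKind::Quiet : NanKind::Signaling;
}
```

Working on the bit pattern rather than on a double value avoids the risk that passing a signaling NaN through floating-point registers quiets it.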

libgfortran/ChangeLog:

	PR fortran/82207
	* Makefile.am: Pass -fsignaling-nans for IEEE files.
	* Makefile.in: Regenerate.
	* ieee/ieee_helper.c: Use issignaling macro to recognize
	signaling NaNs.

gcc/testsuite/ChangeLog:

	PR fortran/82207
	* gfortran.dg/ieee/signaling_1.f90: New test.
	* gfortran.dg/ieee/signaling_1_c.c: New file.
2022-01-10 12:28:46 +01:00
Richard Biener
be59671c56 middle-end/101530 - fix shufflevector lowering
This makes __builtin_shufflevector lowering force the result
of the BIT_FIELD_REF lowpart operation to a temporary so as to
fulfil the IL verifier constraint that BIT_FIELD_REFs should
always be in outermost handled component position.  Trying to
enforce this during gimplification isn't as straightforward
as here where we know we're dealing with an rvalue.

FAIL: c-c++-common/torture/builtin-shufflevector-1.c   -O0  execution test

2022-01-05  Richard Biener  <rguenther@suse.de>

	PR middle-end/101530
gcc/c-family/
	* c-common.c (c_build_shufflevector): Wrap the BIT_FIELD_REF
	in a TARGET_EXPR to force a temporary.

gcc/testsuite/
	* c-c++-common/builtin-shufflevector-3.c: New testcase.
2022-01-10 11:29:43 +01:00
Richard Biener
92e114d66e tree-optimization/100359 - restore unroll at -O3
This fixes a mistake made with r8-5008 when introducing
allow_peel to the unroll code.  The intent was to allow
peeling that doesn't grow code but the result was that
with -O3 and UL_ALL this wasn't done.  The following
instantiates the desired effect by adjusting ul to UL_NO_GROWTH
if peeling is not allowed.

2022-01-05  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/100359
	* tree-ssa-loop-ivcanon.c (try_unroll_loop_completely):
	Allow non-growing peeling with !allow_peel and UL_ALL.

	* gcc.dg/tree-ssa/pr100359.c: New testcase.
2022-01-10 11:08:42 +01:00
Eric Botcazou
a42dd9febb [Ada] Fix bogus error on call to subprogram with incomplete profile
gcc/ada/

	* gcc-interface/trans.c (Identifier_to_gnu): Use correct subtype.
	(elaborate_profile): New function.
	(Call_to_gnu): Call it on the formals and the result type before
	retrieving the translated result type from the subprogram type.
2022-01-10 09:38:47 +00:00
Eric Botcazou
cc9cd23249 [Ada] Fix internal error on unchecked union with component clauses
gcc/ada/

	* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Record_Type>: Fix
	computation of boolean result in the unchecked union case.
	(components_to_record): Rename MAYBE_UNUSED parameter to IN_VARIANT
	and remove local variable of the same name.  Pass NULL recursively
	as P_GNU_REP_LIST for nested variants in the unchecked union case.
2022-01-10 09:38:46 +00:00
Eric Botcazou
0c6fbbfc83 [Ada] Make pragma Inspection_Point work for constants
gcc/ada/

	* gcc-interface/trans.c (lvalue_required_p) <N_Pragma>: New case.
	<N_Pragma_Argument_Association>: Likewise.
	(Pragma_to_gnu) <Pragma_Inspection_Point>: Fetch the corresponding
	variable of a constant before marking it as addressable.
2022-01-10 09:38:46 +00:00
Arnaud Charlet
a6eae6a9bb [Ada] Reduce runtime dependencies on stage1
gcc/ada/

	* gcc-interface/Make-lang.in (ADA_GENERATED_FILES): Remove
	s-casuti.ad?, s-crtl.ad?, s-os_lib.ad?.  Update list of object
	files accordingly.
2022-01-10 09:38:46 +00:00
Piotr Trojanek
41899cd372 [Ada] Switch from __sync to __atomic builtins for Lock_Free_Try_Write
gcc/ada/

	* libgnat/s-atopri.ads (Atomic_Compare_Exchange): Replaces
	deprecated Sync_Compare_And_Swap.
	* libgnat/s-atopri.adb (Lock_Free_Try_Write): Switch from __sync
	to __atomic builtins.
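
For comparison, here is what the same kind of switch looks like from C++ with the GCC builtins. This is only a sketch: `try_write` is a hypothetical name, the sequentially consistent memory order is an assumption rather than what the Ada runtime uses, and the code requires a compiler that provides the `__atomic` builtins (GCC or Clang).

```cpp
#include <cassert>
#include <cstdint>

// Older style, for reference:
//   __sync_bool_compare_and_swap (ptr, expected, desired)
//
// Newer style: store `desired` into *ptr only if *ptr still holds
// `expected`; returns true when the store happened.
bool try_write(std::uint32_t *ptr, std::uint32_t expected,
               std::uint32_t desired) {
  return __atomic_compare_exchange_n(ptr, &expected, desired,
                                     /*weak=*/false,
                                     __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}
```

Unlike the `__sync` form, the `__atomic` form takes explicit memory orders and writes the observed value back through its second argument on failure; here that copy is local and simply discarded.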
2022-01-10 09:38:45 +00:00
Piotr Trojanek
888fb69365 [Ada] Remove CodePeer annotations for pragma Loop_Variant
gcc/ada/

	* libgnat/s-exponn.adb, libgnat/s-expont.adb,
	libgnat/s-exponu.adb, libgnat/s-widthi.adb,
	libgnat/s-widthu.adb: Remove CodePeer annotations for pragma
	Loop_Variant.
2022-01-10 09:38:45 +00:00
Piotr Trojanek
d9c64c6040 [Ada] Disable expansion of pragma Loop_Variant in CodePeer mode
gcc/ada/

	* exp_prag.adb (Expand_Pragma_Loop_Variant): Disable expansion
	in CodePeer mode.
2022-01-10 09:38:45 +00:00
Piotr Trojanek
d256274430 [Ada] Fix typo in comment about unit families
gcc/ada/

	* sem_util.adb (Is_Child_Or_Sibling): Fix typo in comment.
2022-01-10 09:38:45 +00:00
Eric Botcazou
a283cf62e4 [Ada] Adjust the alignment to the size for bit-packed arrays
gcc/ada/

	* exp_pakd.adb (Install_PAT): If the PAT is a scalar type, apply
	the canonical adjustment to its alignment.
2022-01-10 09:38:44 +00:00
Piotr Trojanek
ad85af8e5a [Ada] Switch from __sync to __atomic builtins for atomic counters
gcc/ada/

	* libgnat/s-atocou__builtin.adb (Decrement, Increment): Switch
	from __sync to __atomic builtins; use 'Address to be consistent
	with System.Atomic_Primitives.
2022-01-10 09:38:44 +00:00
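The counter change in ad85af8e5a follows the same pattern for fetch-and-modify operations. Below is a rough C++ analogue, assuming a reference-count style counter whose decrement reports whether zero was reached. The type AtomicCounter, the function names, and the memory orders are illustrative assumptions, not taken from s-atocou__builtin.adb.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical counter type used only for this sketch.
struct AtomicCounter {
    std::uint32_t value;
};

// Legacy __sync builtins: implicit full barrier.
inline void increment_sync(AtomicCounter &c) {
    __sync_add_and_fetch(&c.value, 1);
}
inline bool decrement_sync(AtomicCounter &c) {
    return __sync_sub_and_fetch(&c.value, 1) == 0;
}

// __atomic builtins: the memory order is spelled out at each call.
// Relaxed / acquire-release are assumptions chosen for illustration.
inline void increment_atomic(AtomicCounter &c) {
    __atomic_add_fetch(&c.value, 1, __ATOMIC_RELAXED);
}
inline bool decrement_atomic(AtomicCounter &c) {
    return __atomic_sub_fetch(&c.value, 1, __ATOMIC_ACQ_REL) == 0;
}

// Small usage check (hypothetical helper).
inline void demo_counter() {
    AtomicCounter c{1};
    increment_atomic(c);            // 1 -> 2
    assert(!decrement_atomic(c));   // 2 -> 1, zero not reached
    assert(decrement_sync(c));      // 1 -> 0, zero reached
    increment_sync(c);              // 0 -> 1
    assert(c.value == 1);
}
```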
Eric Botcazou
68adddccb1 [Ada] Fix error on too large size clause for bit-packed array
gcc/ada/

	* exp_pakd.adb (Install_PAT): Do not reset the alignment here.
	* layout.adb (Layout_Type): Call Adjust_Esize_Alignment after having
	copied the RM_Size onto the Esize when the latter is too small.
2022-01-10 09:38:44 +00:00
Justin Squirek
b942847f78 [Ada] Task arrays trigger spurious unreferenced warnings
gcc/ada/

	* sem_warn.adb (Check_References): Handle arrays of tasks
	similarly to task objects.
2022-01-10 09:38:43 +00:00
GCC Administrator
3a5702df3f Daily bump. 2022-01-10 00:16:20 +00:00
Harald Anlauf
49d73c9fb6 Fortran: check arguments of MASKL/MASKR intrinsics before simplification
gcc/fortran/ChangeLog:

	PR fortran/103777
	* simplify.c (gfc_simplify_maskr): Check validity of argument 'I'
	before simplifying.
	(gfc_simplify_maskl): Likewise.

gcc/testsuite/ChangeLog:

	PR fortran/103777
	* gfortran.dg/masklr_3.f90: New test.
2022-01-09 22:18:11 +01:00
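As background for the argument check in 49d73c9fb6: MASKR(I) yields an integer whose rightmost I bits are set, MASKL(I) one whose leftmost I bits are set, and I must lie between 0 and the bit size of the result. The C++ sketch below mirrors that rule for a 32-bit result. It illustrates the intrinsic semantics with hypothetical names (maskr32, maskl32) and is not the code in simplify.c.

```cpp
#include <cassert>
#include <cstdint>

constexpr int kBits = 32;  // bit size of the (assumed) result kind

// Rightmost `i` bits set.  Returns false, leaving `out` untouched,
// when `i` is outside [0, kBits]; validation happens before any shift.
inline bool maskr32(int i, std::uint32_t &out) {
    if (i < 0 || i > kBits) return false;
    out = (i == kBits) ? ~std::uint32_t{0}
                       : ((std::uint32_t{1} << i) - 1u);
    return true;
}

// Leftmost `i` bits set, with the same validity rule.
inline bool maskl32(int i, std::uint32_t &out) {
    if (i < 0 || i > kBits) return false;
    out = (i == 0) ? std::uint32_t{0}
                   : (~std::uint32_t{0} << (kBits - i));
    return true;
}

// Small usage check (hypothetical helper).
inline void demo_masks() {
    std::uint32_t m = 0;
    assert(maskr32(3, m) && m == 0x00000007u);
    assert(maskl32(3, m) && m == 0xE0000000u);
    assert(!maskr32(33, m));  // rejected before any shifting is attempted
    assert(!maskl32(-1, m));
}
```

Checking the range first matters because shifting by a negative count or by the full width is undefined in C++, which is the same kind of hazard a compile-time simplifier faces with an invalid argument.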
Harald Anlauf
2e63128306 Fortran: reject invalid non-constant pointer initialization targets
gcc/fortran/ChangeLog:

	PR fortran/101762
	* expr.c (gfc_check_pointer_assign): For pointer initialization
	targets, check that subscripts and substring indices in
	specifications are constant expressions.

gcc/testsuite/ChangeLog:

	PR fortran/101762
	* gfortran.dg/pr101762.f90: New test.
2022-01-09 22:08:14 +01:00
Mikael Morin
c1c17a43e1 Fortran: Ignore KIND argument of a few more intrinsics. [PR103789]
PR97896 added code to ignore the KIND argument of the INDEX intrinsic,
and PR87711 extended that to LEN_TRIM.  This patch propagates it
further to MASKL, MASKR, SCAN and VERIFY.

	PR fortran/103789

gcc/fortran/ChangeLog:

	* trans-array.c (arg_evaluated_for_scalarization): Add MASKL, MASKR,
	SCAN and VERIFY to the list of intrinsics whose KIND argument is to be
	ignored.

gcc/testsuite/ChangeLog:

	* gfortran.dg/maskl_1.f90: New test.
	* gfortran.dg/maskr_1.f90: New test.
	* gfortran.dg/scan_3.f90: New test.
	* gfortran.dg/verify_3.f90: New test.
2022-01-09 14:34:15 +01:00
Sandra Loosemore
57fe1f6ad3 Testsuite: Make dependence on -fdelete-null-pointer-checks explicit
The nios2-elf target defaults to -fno-delete-null-pointer-checks, breaking
tests that implicitly depend on that optimization.  Add the option
explicitly to these tests.

2022-01-08  Sandra Loosemore  <sandra@codesourcery.com>

	gcc/testsuite/
	* g++.dg/cpp0x/constexpr-compare1.C: Add explicit
	-fdelete-null-pointer-checks option.
	* g++.dg/cpp0x/constexpr-compare2.C: Likewise.
	* g++.dg/cpp0x/constexpr-typeid2.C: Likewise.
	* g++.dg/cpp1y/constexpr-94716.C: Likewise.
	* g++.dg/cpp1z/constexpr-compare1.C: Likewise.
	* g++.dg/cpp1z/constexpr-if36.C: Likewise.
	* gcc.dg/init-compare-1.c: Likewise.

	libstdc++-v3/
	* testsuite/18_support/type_info/constexpr.cc: Add explicit
	-fdelete-null-pointer-checks option.
2022-01-08 22:17:18 -08:00
GCC Administrator
2848ef1411 Daily bump. 2022-01-09 00:16:20 +00:00
Roger Sayle
fad14a028f x86_64: Improve (interunit) moves from TImode to V1TImode.
This patch improves the code generated when moving a 128-bit value
in TImode, represented by two 64-bit registers, to V1TImode, which
is a single SSE register.

Currently, the simple move:
typedef unsigned __int128 uv1ti __attribute__ ((__vector_size__ (16)));
uv1ti foo(__int128 x) { return (uv1ti)x; }

is always transferred via memory, as:
foo:    movq    %rdi, -24(%rsp)
        movq    %rsi, -16(%rsp)
        movdqa  -24(%rsp), %xmm0
        ret

with this patch, we now generate (with -msse2):
foo:    movq    %rdi, %xmm1
        movq    %rsi, %xmm2
        punpcklqdq      %xmm2, %xmm1
        movdqa  %xmm1, %xmm0
        ret

and with -mavx2:
foo:    vmovq   %rdi, %xmm1
        vpinsrq $1, %rsi, %xmm1, %xmm0
        ret

Even more dramatic is the improvement of zero extended transfers.

uv1ti bar(unsigned char c) { return (uv1ti)(__int128)c; }

Previously generated:
bar:    movq    $0, -16(%rsp)
        movzbl  %dil, %eax
        movq    %rax, -24(%rsp)
        vmovdqa -24(%rsp), %xmm0
        ret

Now generates:
bar:    movzbl  %dil, %edi
        movq    %rdi, %xmm0
        ret

My first attempt at this functionality used a simple define_split,
but unfortunately this triggers very late during compilation,
preventing some of the simplifications we'd like (in combine).
For example, the foo case above becomes:

foo:    movq    %rsi, -16(%rsp)
        movq    %rdi, %xmm0
        movhps  -16(%rsp), %xmm0

transferring half directly, and the other half via memory.
And for the bar case above, GCC fails to appreciate that
movq/vmovq clears the high bits, resulting in:

bar:    movzbl  %dil, %eax
        xorl    %edx, %edx
        vmovq   %rax, %xmm1
        vpinsrq $1, %rdx, %xmm1, %xmm0
        ret

Hence the solution (i.e. this patch) is to add a special case
to ix86_expand_vector_move for TImode to V1TImode transfers.

2022-01-08  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/i386/i386-expand.c (ix86_expand_vector_move): Add
	special case for TImode to V1TImode moves, going via V2DImode.

gcc/testsuite/ChangeLog
	* gcc.target/i386/sse2-v1ti-mov-1.c: New test case.
	* gcc.target/i386/sse2-v1ti-zext.c: New test case.
2022-01-08 12:27:50 +00:00
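The foo and bar functions quoted in fad14a028f can be exercised with a small harness like the one below, compiled as C++ with GCC on a 64-bit target that provides __int128 (an assumption). The helpers as_scalar, low64 and high64 are hypothetical additions that only inspect the result. They check the value, not the instruction sequence, so the generated assembly still has to be examined separately, for example with -S.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

typedef unsigned __int128 uv1ti __attribute__ ((__vector_size__ (16)));

// The two functions from the commit message.
uv1ti foo(__int128 x) { return (uv1ti)x; }
uv1ti bar(unsigned char c) { return (uv1ti)(__int128)c; }

// Hypothetical helpers: copy the single 128-bit lane back to a scalar
// and split it into 64-bit halves.
static unsigned __int128 as_scalar(uv1ti v) {
    unsigned __int128 s;
    std::memcpy(&s, &v, sizeof s);
    return s;
}
std::uint64_t low64(uv1ti v)  { return (std::uint64_t)as_scalar(v); }
std::uint64_t high64(uv1ti v) { return (std::uint64_t)(as_scalar(v) >> 64); }

// Small usage check (hypothetical helper).
void check_v1ti_moves() {
    __int128 x = ((__int128)0x0123456789abcdefLL << 64)
                 | (__int128)0x0fedcba987654321LL;
    uv1ti v = foo(x);
    assert(low64(v) == 0x0fedcba987654321ULL);
    assert(high64(v) == 0x0123456789abcdefULL);

    // Zero extension: the upper 64 bits must be clear.
    uv1ti z = bar(0xff);
    assert(low64(z) == 0xffULL);
    assert(high64(z) == 0);
}
```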
Jakub Jelinek
51d464b608 c++, match.pd: Evaluate in constant evaluation comparisons like &var1 + 12 == &var2 + 24 [PR89074]
The match.pd address_comparison simplification can only handle
ADDR_EXPR comparisons possibly converted to some other type (I wonder
if we shouldn't restrict it in address_compare to casts to pointer
types or pointer-sized integer types, I think we shouldn't optimize
(short) (&var) == (short) (&var2) because we really don't know whether
it will be true or false).  On GIMPLE, most pointer-to-pointer
casts are useless and optimized away, and further we have in
gimple_fold_stmt_to_constant_1 an optimization that folds
&something p+ const_int
into
&MEM_REF[..., off]
On GENERIC, we don't do that and e.g. for constant evaluation it
could be pretty harmful if e.g. such pointers are dereferenced, because
it can lose what exact field it was starting with etc., all it knows
is the base and offset, type and alias set.
Instead of teaching the match.pd address_compare about 3 extra variants
where one or both compared operands are pointer_plus, this patch attempts
to fold operands of comparisons similarly to gimple_fold_stmt_to_constant_1
before calling fold_binary on it.
There is another thing though, while we do have (x p+ y) p+ z to
x p+ (y + z) simplification which works on GIMPLE well because of the
useless pointer conversions, on GENERIC we can have pointer casts in between
and at that point we can end up with large expressions like
((type3) (((type2) ((type1) (&var + 2) + 2) + 2) + 2))
etc.  Pointer-plus doesn't really care what exact pointer type it has as
long as it is a pointer, so the following match.pd simplification for
GENERIC only (it is useless for GIMPLE) also moves the cast so that nested
p+ can be simplified.

Note, I've noticed we don't really diagnose going out of bounds with
pointer_plus (unlike e.g. with ARRAY_REF) during constant evaluation, I
think another patch for cxx_eval_binary_expression with POINTER_PLUS will be
needed.  But it isn't clear to me what exactly it should do in case of
subobjects.  If we start with address of a whole var, (&var), I guess we
should diagnose if the pointer_plus gets before start of the var (i.e.
"negative") or 1 byte past the end of the var, but what if we start with
&var.field or &var.field[3] ?  For &var.field, shall we diagnose out of
bounds of field (except perhaps flexible members?) or the whole var?
For ARRAY_REFs, I assume we must at least strip all the outer ARRAY_REFs
and so start with &var.field too, right?

2022-01-08  Jakub Jelinek  <jakub@redhat.com>

	PR c++/89074
gcc/
	* match.pd ((ptr) (x p+ y) p+ z -> (ptr) (x p+ (y + z))): New GENERIC
	simplification.
gcc/cp/
	* constexpr.c (cxx_maybe_fold_addr_pointer_plus): New function.
	(cxx_eval_binary_expression): Use it.
gcc/testsuite/
	* g++.dg/cpp1y/constexpr-89074-2.C: New test.
	* g++.dg/cpp1z/constexpr-89074-1.C: New test.
2022-01-08 09:53:00 +01:00
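The reassociation added to match.pd in 51d464b608 can be pictured with a toy expression tree. The C++ sketch below is a simplified model, not GCC code: Expr, flatten and compare_addresses are hypothetical names, offsets are plain byte counts, and the comparison only decides the case where both operands share the same base variable. It shows why looking through pointer casts lets nested p+ nodes collapse to a single base and offset, which is the form the address comparison can work with.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

// Toy stand-in for a tree node: &var, a pointer cast, or (ptr p+ bytes).
struct Expr {
    enum Kind { Addr, Cast, PtrPlus };
    Kind kind;
    std::string name;                     // Addr: variable; Cast: target type
    std::int64_t offset;                  // PtrPlus: byte offset
    std::shared_ptr<const Expr> operand;  // Cast and PtrPlus only
};
using ExprPtr = std::shared_ptr<const Expr>;

ExprPtr addr(const std::string &var) {
    return std::make_shared<Expr>(Expr{Expr::Addr, var, 0, nullptr});
}
ExprPtr cast(const std::string &type, ExprPtr e) {
    return std::make_shared<Expr>(Expr{Expr::Cast, type, 0, std::move(e)});
}
ExprPtr ptr_plus(ExprPtr e, std::int64_t bytes) {
    return std::make_shared<Expr>(Expr{Expr::PtrPlus, "", bytes, std::move(e)});
}

struct BaseOffset {
    std::string base;
    std::int64_t offset;
};

// Walk down to the &var leaf, skipping casts and summing p+ offsets.
// Pointer-plus does not care which pointer type it sees, so the casts
// contribute nothing to the result.
BaseOffset flatten(const Expr &e) {
    std::int64_t total = 0;
    const Expr *cur = &e;
    while (cur->kind != Expr::Addr) {
        if (cur->kind == Expr::PtrPlus) total += cur->offset;
        cur = cur->operand.get();
    }
    return BaseOffset{cur->name, total};
}

enum class Cmp { Equal, NotEqual, Unknown };

// Decide equality only when both sides refer to the same variable;
// distinct variables are left undecided in this toy model.
Cmp compare_addresses(const Expr &a, const Expr &b) {
    BaseOffset x = flatten(a), y = flatten(b);
    if (x.base != y.base) return Cmp::Unknown;
    return x.offset == y.offset ? Cmp::Equal : Cmp::NotEqual;
}

// Small usage check (hypothetical helper).
void demo_flatten() {
    // (type2) ((type1) (&var p+ 2) p+ 2)
    ExprPtr nested =
        cast("type2", ptr_plus(cast("type1", ptr_plus(addr("var"), 2)), 2));
    assert(flatten(*nested).offset == 4);
    assert(compare_addresses(*nested, *ptr_plus(addr("var"), 4)) == Cmp::Equal);
}
```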
Jason Merrill
787d66eb6c c++: default mem-init of array [PR103946]
In the patch for PR92385 I added asserts to see if we tried to make a
vec_init of a vec_init, but didn't see any in regression testing.  This
testcase is one case, which seems reasonable: we create a VEC_INIT_EXPR for
the aggregate initializer, and then again to express the actual
initialization of the member.  We already do similar collapsing of
TARGET_EXPR.  So let's just remove the asserts.

	PR c++/103946

gcc/cp/ChangeLog:

	* init.c (build_vec_init): Remove assert.
	* tree.c (build_vec_init_expr): Likewise.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/nsdmi-array1.C: New test.
2022-01-08 01:18:51 -05:00
Jason Merrill
75047f7951 c++: destroying delete, throw in new-expr [PR100588]
The standard says that a destroying operator delete is preferred, but that
only applies to the delete-expression, not the cleanup if a new-expression
initialization throws.  As a result of this patch, several of the destroying
delete tests don't get EH cleanups, but I'm turning off the warning in cases
where the initialization can't throw anyway.

It's unclear what should happen if the class does not declare a non-deleting
operator delete; a proposal in CWG was to call the global delete, which
makes sense to me if the class doesn't declare its own operator new.  If it
does, we warn and don't call any deallocation function if initialization
throws.

	PR c++/100588

gcc/cp/ChangeLog:

	* call.c (build_op_delete_call): Ignore destroying delete
	if alloc_fn.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/destroying-delete5.C: Expect warning.
	* g++.dg/cpp2a/destroying-delete6.C: New test.
2022-01-07 21:03:28 -05:00
GCC Administrator
55e96bf912 Daily bump. 2022-01-08 00:16:27 +00:00
David Malcolm
11a2ff8d98 analyzer: add logging of aliasing
gcc/analyzer/ChangeLog:
	* engine.cc (impl_run_checkers): Pass logger to engine ctor.
	* region-model-manager.cc
	(region_model_manager::region_model_manager): Add logger param and
	use it to initialize m_logger.
	* region-model.cc (engine::engine): New.
	* region-model.h (region_model_manager::region_model_manager):
	Add logger param.
	(region_model_manager::get_logger): New.
	(region_model_manager::m_logger): New field.
	(engine::engine): New.
	* store.cc (store_manager::get_logger): New.
	(store::set_value): Log scope.  Log when marking a cluster as
	unknown due to possible aliasing.
	* store.h (store_manager::get_logger): New decl.
2022-01-07 19:05:16 -05:00