Testing has shown that using the load vector pair and store vector pair
instructions for block moves has some performance issues on power10.
A patch on June 11th modified the code so that GCC would not set
-mblock-ops-vector-pair by default if we are tuning for power10, but it would
set the option if we were tuning for a different machine and have load and store
vector pair instructions enabled.
This patch eliminates the code setting -mblock-ops-vector-pair. If you want to
generate load vector pair and store vector pair instructions for block moves,
you must use -mblock-ops-vector-pair.
2022-08-08 Michael Meissner <meissner@linux.ibm.com>
gcc/
* config/rs6000/rs6000.c (rs6000_option_override_internal): Remove code
setting -mblock-ops-vector-pair. Patch back ported from trunk, August
3rd, 2022.
This patch updates the POWER testsuite test cases using -mcpu= and -mtune=
to use the preferred -mdejagnu-cpu= and -mdejagnu-tune= options. This also
obviates the need for the dg-skip-if directive, since the user cannot
override the -mcpu= value being used to compile the test case.
2022-03-25 Peter Bergner <bergner@linux.ibm.com>
gcc/testsuite/
* g++.dg/pr65240-1.C: Use -mdejagnu-cpu=. Remove dg-skip-if.
* g++.dg/pr65240-2.C: Likewise.
* g++.dg/pr65240-3.C: Likewise.
* g++.dg/pr65240-4.C: Likewise.
* g++.dg/pr65242.C: Likewise.
* g++.dg/pr67211.C: Likewise.
* g++.dg/pr69667.C: Likewise.
* g++.dg/pr71294.C: Likewise.
* g++.dg/pr84279.C: Likewise.
* g++.dg/torture/ppc-ldst-array.C: Likewise.
* gfortran.dg/nint_p7.f90: Likewise.
* gfortran.dg/pr102860.f90: Likewise.
* gcc.target/powerpc/fusion.c: Use -mdejagnu-cpu= and -mdejagnu-tune=.
* gcc.target/powerpc/fusion2.c: Likewise.
* gcc.target/powerpc/int_128bit-runnable.c: Use -mdejagnu-cpu=.
* gcc.target/powerpc/test_mffsl.c: Likewise.
* gfortran.dg/pr47614.f: Likewise.
* gfortran.dg/pr58968.f: Likewise.
(cherry picked from commit 81faedaa8e)
The handling of #pragma GCC diagnostic uses input_location, which is not always
as precise as needed; in particular the relative location of some tokens and a
_Pragma directive will crucially determine whether a given diagnostic is enabled
or suppressed in the desired way. PR97498 shows how the C frontend ends up with
input_location pointing to the beginning of the line containing a _Pragma()
directive, resulting in the wrong behavior if the diagnostic to be modified
pertains to some tokens found earlier on the same line. This patch fixes that by
addressing two issues:
a) libcpp was not assigning a valid location to the CPP_PRAGMA token
generated by the _Pragma directive.
b) C frontend was not setting input_location to something reasonable.
With this change, the C frontend is able to change input_location to point to
the _Pragma token as needed.
This is just a two-line fix (one for each of a) and b)), the testsuite changes
were needed only because the location on the tested warnings has been somewhat
improved, so the tests need to look for the new locations.
gcc/c/ChangeLog:
PR preprocessor/97498
* c-parser.c (c_parser_pragma): Set input_location to the
location of the pragma, rather than the start of the line.
libcpp/ChangeLog:
PR preprocessor/97498
* directives.c (destringize_and_run): Override the location of
the CPP_PRAGMA token from a _Pragma directive to the location of
the expansion point, as is done for the tokens lexed from it.
gcc/testsuite/ChangeLog:
PR preprocessor/97498
* c-c++-common/pr97498.c: New test.
* gcc.dg/pragma-message.c: Adapt for improved warning locations.
(cherry picked from commit 0587cef3d7)
As PR106345 shows, when configuring compiler with an explicit
option --with-tune=<value>, it would cause some test cases to
fail if their test points are sensitive to tune setting, such
as: group_ending_nop, loop align etc. It doesn't help that
even to specify one explicit -mcpu=.
This patch is to adjust the behavior of -mdejagnu-cpu by
filtering out all -mcpu= and -mtune= options, then test cases
would use <cpu> as tune as the one specified by -mdejagnu-cpu.
2022-07-25 Peter Bergner <bergner@linux.ibm.com>
Kewen Lin <linkw@linux.ibm.com>
PR testsuite/106345
gcc/ChangeLog:
* config/rs6000/rs6000.h (DRIVER_SELF_SPECS): Adjust -mdejagnu-cpu
to filter out all -mtune options.
(cherry picked from commit 75d20d6c84)
As test case in PR106091 shows, rs6000 specific pass swaps
doesn't preserve the reg_note REG_EH_REGION when replacing
some load insn at the end of basic block, it causes the
flow info verification to fail unexpectedly. Since memory
reference rtx may trap, this patch is to ensure we copy
REG_EH_REGION reg_note while replacing swapped aligned load
or store.
PR target/106091
gcc/ChangeLog:
* config/rs6000/rs6000-p8swap.c (replace_swapped_aligned_store): Copy
REG_EH_REGION when replacing one store insn having it.
(replace_swapped_aligned_load): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr106091.c: New test.
(cherry picked from commit f428660193)
Remove redundant duplicate backslash characters from \t sequences in the
output pattern of the `stack_protect_set_<mode>' RTL insn.
gcc/
* config/riscv/riscv.md (stack_protect_set_<mode>): Remove
duplicate backslashes.
(cherry picked from commit 3cf07cc5e5)
This patch merges the spin loops in the atomic wait implementation which is a
minor codegen improvement.
libstdc++-v3/ChangeLog:
* include/bits/atomic_wait.h (__atomic_spin): Merge spin loops.
(cherry picked from commit e75da2ace6)
gcc/fortran/ChangeLog:
PR fortran/103504
* interface.c (get_sym_storage_size): Array bounds and character
length can only be of integer type.
gcc/testsuite/ChangeLog:
PR fortran/103504
* gfortran.dg/pr103504.f90: New test.
(cherry picked from commit 600956c81c)
The 11 and 10 partial backports of P2325R3, r11-9555-g92d612cccc1eec and
r10-10808-g22b86cdc4d7fdd, unnecessarily preserved some changes from the
paper that made certain view specializations no longer default
constructible, changes which aren't required to reap the overall benefits
of the paper and which are backward incompatible with pre-P2325R3 code in
practice.
This patch reverts the problematic changes, specifically it relaxes the
constraints on various views' default constructors added by the paper
so that we keep only the constraints that were already implicitly
imposed by the NSDMIs of the view. Thus for example this patch retains
the default_initializable<_Vp> constraint on transform_view's default
constructor since its '_Vp _M_base = _Vp()' NSDMI already requires this
constraint, and it removes the default_initializable<_Fp> constraint
since the corresponding member '__detail::__box<_Fp> _M_fun' doesn't
require default constructibility (specializations of __box are always
default constructible).
After reverting these changes, all static_asserts from p2325.cc that
verify lack of default constructibility now fail as expected, matching
the pre-P2325R3 behavior.
PR libstdc++/106320
libstdc++-v3/ChangeLog:
* include/std/ranges (single_view): Relax constraints on
default constructor so as to preserve pre-P2325R3 behavior.
(filter_view): Likewise.
(transform_view): Likewise.
(take_while_view): Likewise.
(drop_while_view): Likewise.
* testsuite/std/ranges/adaptors/join.cc (test13): New test.
* testsuite/std/ranges/p2325.cc: Fix S to be only non default
constructible and not also non copy constructible. XFAIL the
tests that verify a non default constructible functor makes a
view non default constructible (lines 94, 97 and 98). XFAIL
the test that effectively verifies a non default constructible
element type makes single_view non default constructible (line
114).
The PR97330 fix caused some missed sinking of loads out of loops
the following patch re-instantiates.
2022-05-17 Richard Biener <rguenther@suse.de>
PR tree-optimization/105618
* tree-ssa-sink.c (statement_sink_location): For virtual
PHI uses ignore those defining the used virtual operand.
* gcc.dg/tree-ssa/ssa-sink-19.c: New testcase.
(cherry picked from commit ebce0e9bd8)
This avoids doing aforementioned canoncalization when -ftrapping-math
is in effect and we honor NaNs.
2021-11-15 Richard Biener <rguenther@suse.de>
PR middle-end/103193
* match.pd: Avoid canonicalizing (le/ge @0 @0) to (eq @0 @0)
with NaNs and -ftrapping-math.
(cherry picked from commit d9ca2ca381)
The testcase shows that we can end up with a contiguous access across
loop iterations but by means of permutations the elements accessed
might only cover parts of a vector. In this case we end up with
GROUP_GAP == 0 but still need to avoid accessing excess elements
in the last loop iterations. Peeling for gaps is designed to cover
this but a single scalar iteration might not cover all of the excess
elements. The following ensures peeling for gaps is done in this
situation and when that isn't sufficient because we need to peel
more than one iteration (gcc.dg/vect/pr103116-2.c), fail the SLP
vectorization.
2022-05-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/103116
* tree-vect-stmts.c (get_group_load_store_type): Handle the
case we need peeling for gaps even though GROUP_GAP is zero.
* gcc.dg/vect/pr103116-1.c: New testcase.
* gcc.dg/vect/pr103116-2.c: Likewise.
(cherry picked from commit 52b7b86f8c)
This fixes the following conformance problems reported in the PR:
- Move constructor and move assignment should be defined.
- Copy assignment from a valueless object should be allowed.
Assignment is completely rewritten by this patch, as the previous
version had a number of problems. The converting assignment failed to
handle the case of assigning a new value to a valueless object, which
should work. It only accepted lvalue arguments, so wasn't usable to
implement the move assignment operator. Finally, it enforced the
precondition that the argument is not valueless, which is correct for
the converting assignment but not for the copy assignment.
A new _M_assign member is added to handle all cases of assignment
(copying from an lvalue, moving from an rvalue, and converting from a
different type). The not valueless precondition is checked in the
converting assignment before calling _M_assign, so isn't enforced for
copy and move assignment. The new function no longer uses a switch, so
handles valueless objects as the LHS or RHS of the assignment.
libstdc++-v3/ChangeLog:
PR libstdc++/100823
* include/bits/stl_iterator.h (common_iterator): Define move
constructor and move assignment operator.
(common_iterator::_M_assign): New function implementing
assignment.
(common_iterator::operator=): Use _M_assign.
(common_iterator::_S_valueless): New constant.
* testsuite/24_iterators/common_iterator/100823.cc: New test.
(cherry picked from commit 56c999860b)
The noexcept-specifier for some std::common_iterator constructors was
incorrectly using an rvalue as the first argument of
std::is_nothrow_assignable_v. This gave the wrong answer for some types,
e.g. std::common_iterator<int*, S>, because an rvalue of scalar type
cannot be assigned to.
Also fix the friend declaration to use the same constraints as on the
definition of the class template. G++ fails to diagnose this error, due
to PR c++/96830.
Finally, the copy constructor was using std::move for its argument
in some cases, which should be removed.
libstdc++-v3/ChangeLog:
* include/bits/stl_iterator.h (common_iterator): Fix incorrect
uses of is_nothrow_assignable_v. Fix inconsistent constraints on
friend declaration. Do not move argument in copy constructor.
* testsuite/24_iterators/common_iterator/1.cc: Check for
noexcept constructibnle/assignable.
(cherry picked from commit 3b5567c3ec)
This should have been done as part of the LWG 3574 changes.
libstdc++-v3/ChangeLog:
PR libstdc++/103992
* include/bits/stl_iterator.h (common_iterator): Use
std::construct_at instead of placement new.
* testsuite/24_iterators/common_iterator/1.cc: Check copy
construction is usable in constant expressions.
(cherry picked from commit d67ba1dce9)
This library issue was approved in the October 2021 plenary.
This backport for gcc-11 also includes some [[nodiscard]] attributes
from earlier commits on trunk.
libstdc++-v3/ChangeLog:
* include/bits/stl_iterator.h (common_iterator): Add constexpr
to all member functions (LWG 3574).
* testsuite/24_iterators/common_iterator/1.cc: Evaluate some
tests as constant expressions.
* testsuite/24_iterators/common_iterator/2.cc: Likewise.
(cherry picked from commit 11d3e8f436)
This prevents the test from failing if the only thing not supported is
the text printed to the log about the size of the floating-point type.
libstdc++-v3/ChangeLog:
* testsuite/20_util/from_chars/4.cc: Only use log2 if C99 math
functions are available.
(cherry picked from commit 30aea28bd3)
Although the Filesystem TS isn't properly supported on Windows (unlike
the C++17 Filesystem lib), most tests do pass. Two of the failures are
due to PR 88881 which was only fixed for std::filesystem not the TS.
This applies the fix to the TS implementation too.
libstdc++-v3/ChangeLog:
PR libstdc++/88881
* src/filesystem/ops.cc (has_trailing_slash): New helper
function.
(fs::status): Strip trailing slashes.
(fs::symlink_status): Likewise.
* testsuite/experimental/filesystem/operations/temp_directory_path.cc:
Clean the environment before each test and use TMP instead of
TMPDIR so the test passes on Windows.
(cherry picked from commit 6c96b14a19)
Now non-member functions can be defaulted, so this assert is wrong.
move_signature_fn_p already checks for ctor or op=.
PR c++/106361
gcc/cp/ChangeLog:
* decl.c (move_fn_p): Remove assert.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/spaceship-eq14.C: New test.
In check_new_reg_p, the nregs of a du chain is computed by obtaining the
MODE of the first element in the chain, and then calling
hard_regno_nregs() with the MODE. But the first element of the chain can
be a DEBUG_INSN whose mode need not be the same as the rest of the
elements in the du chain. This was resulting in fcompare-debug failure
as check_new_reg_p was returning a different result with -g for the same
candidate register. We can instead obtain nregs from the du chain
itself.
2022-06-10 Surya Kumari Jangala <jskumari@linux.ibm.com>
gcc/
PR rtl-optimization/105041
* regrename.c (check_new_reg_p): Use nregs value from du chain.
gcc/testsuite/
PR rtl-optimization/105041
* gcc.target/powerpc/pr105041.c: New test.
(cherry picked from commit 3e16b4359e)
gcc/fortran/ChangeLog:
PR fortran/104313
* trans-decl.c (gfc_generate_return): Do not generate conflicting
fake results for functions with no result variable under -ff2c.
gcc/testsuite/ChangeLog:
PR fortran/104313
* gfortran.dg/pr104313.f: New test.
(cherry picked from commit 517fb1a781)