A branch whose name matches a scan-assembler pattern triggers
spurious FAILs.
E.g. branch fixups-testsuite and
- gcc.target/i386/pr65871-?.c (scan-assembler-not "test")
- gcc.target/i386/pr41442.c (scan-assembler-times "test|cmp" 2)
etc.
This is a recurring problem as can be seen by some -fno-ident additions
by commits from e.g. Michael Meissner over the years: builtins-58.c,
powerpc/pr46728-?.c
The patch below adds -fno-ident if a testcase contains one of
scan-assembler, scan-assembler-not or scan-assembler-times.
Regression tested on x86_64-unknown-linux on a fixups-testsuite branch
where it fixes several false FAILs without regressions.
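For illustration only, the failure mode can be reproduced outside the
testsuite with a plain regex count over assembler text. This is a sketch,
not the DejaGnu implementation: count_matches, sample_body and sample_ident
are hypothetical names, and the .ident string is made up to resemble a
compiler built from the fixups-testsuite branch.

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Hypothetical assembler body: exactly one instruction matches "test|cmp".
const std::string sample_body =
    "\tcmpl\t%esi, %edi\n"
    "\tsete\t%al\n"
    "\tret\n";

// Made-up .ident line; the branch name embedded in it contains "test".
const std::string sample_ident =
    "\t.ident\t\"GCC: (GNU) 7.0.0 [fixups-testsuite]\"\n";

// Count regex matches in the text, roughly what a scan-assembler-times
// directive does with the compiler output.
std::size_t count_matches(const std::string &asm_text,
                          const std::string &pattern)
{
  const std::regex re(pattern);
  auto first = std::sregex_iterator(asm_text.begin(), asm_text.end(), re);
  auto last = std::sregex_iterator();
  return static_cast<std::size_t>(std::distance(first, last));
}
```

With the .ident line appended the count goes from 1 to 2, so a directive
expecting one match would FAIL even though the generated code is unchanged.
Compiling with -fno-ident drops that line from the output.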
gcc/testsuite/ChangeLog
2016-06-18 Bernhard Reutner-Fischer <aldot@gcc.gnu.org>
PR testsuite/52665
* lib/gcc-dg.exp (gcc-dg-test-1): Iterate over _required_options.
* lib/target-supports.exp (scan-assembler_required_options,
scan-assembler-not_required_options,
scan-assembler-times_required_options): Add -fno-ident.
* lib/scanasm.exp (scan-assembler-times): Fix error message.
* c-c++-common/ident-0a.c: New test.
* c-c++-common/ident-0b.c: New test.
* c-c++-common/ident-1a.c: New test.
* c-c++-common/ident-1b.c: New test.
* c-c++-common/ident-2a.c: New test.
* c-c++-common/ident-2b.c: New test.
From-SVN: r264128
This patch aims to optimise sequences involving uses of 1.0 / sqrt (a) under -freciprocal-math and -funsafe-math-optimizations.
In particular consider:
x = 1.0 / sqrt (a);
r1 = x * x; // same as 1.0 / a
r2 = a * x; // same as sqrt (a)
If x, r1 and r2 are all used further on in the code, this can be transformed into:
tmp1 = 1.0 / a
tmp2 = sqrt (a)
tmp3 = tmp1 * tmp2
x = tmp3
r1 = tmp1
r2 = tmp2
A bit convoluted, but this saves us one multiplication and, more importantly, the sqrt and division are now independent.
This also allows optimisation of a subset of these expressions.
For example:
x = 1.0 / sqrt (a)
r1 = x * x
can be transformed to r1 = 1.0 / a, eliminating the sqrt if x is not used anywhere else.
And similarly:
x = 1.0 / sqrt (a)
r1 = a * x
can be transformed to r1 = sqrt (a), eliminating the division.
For the testcase:
double res, res2, tmp;
void
foo (double a, double b)
{
tmp = 1.0 / __builtin_sqrt (a);
res = tmp * tmp;
res2 = a * tmp;
}
We now generate for aarch64 with -Ofast:
foo:
fmov d2, 1.0e+0
adrp x2, res2
fsqrt d1, d0
adrp x1, res
fdiv d0, d2, d0
adrp x0, tmp
str d1, [x2, #:lo12:res2]
fmul d1, d1, d0
str d0, [x1, #:lo12:res]
str d1, [x0, #:lo12:tmp]
ret
where before it generated:
foo:
fsqrt d2, d0
fmov d1, 1.0e+0
adrp x1, res2
adrp x2, tmp
adrp x0, res
fdiv d1, d1, d2
fmul d0, d1, d0
fmul d2, d1, d1
str d1, [x2, #:lo12:tmp]
str d0, [x1, #:lo12:res2]
str d2, [x0, #:lo12:res]
ret
As you can see, the new sequence has one fewer multiply and the fsqrt and fdiv are independent.
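As a sanity check of the algebra, here is a small standalone sketch (not
part of the patch) that evaluates both forms of the sequence. The names
RecipSqrtResults, original_sequence, transformed_sequence and nearly_equal
are hypothetical. The two forms are not guaranteed to agree bit for bit,
which is why the transformation is only valid under the unsafe math flags.

```cpp
#include <cmath>

// The three values that are live after the sequence.
struct RecipSqrtResults
{
  double x;   // 1.0 / sqrt (a)
  double r1;  // x * x
  double r2;  // a * x
};

// Sequence as written in the source: sqrt, then a dependent division,
// then two multiplications.
RecipSqrtResults original_sequence(double a)
{
  double x = 1.0 / std::sqrt(a);
  return {x, x * x, a * x};
}

// Sequence after the transformation: independent division and sqrt,
// then a single multiplication.
RecipSqrtResults transformed_sequence(double a)
{
  double tmp1 = 1.0 / a;
  double tmp2 = std::sqrt(a);
  double tmp3 = tmp1 * tmp2;
  return {tmp3, tmp1, tmp2};
}

// Relative comparison; the tolerance is an arbitrary choice for this sketch.
bool nearly_equal(double p, double q)
{
  return std::fabs(p - q) <= 1e-12 * std::fabs(q);
}
```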
* tree-ssa-math-opts.c (is_mult_by): New function.
(is_square_of): Use the above.
(optimize_recip_sqrt): New function.
(pass_cse_reciprocals::execute): Use the above.
* gcc.dg/recip_sqrt_mult_1.c: New test.
* gcc.dg/recip_sqrt_mult_2.c: Likewise.
* gcc.dg/recip_sqrt_mult_3.c: Likewise.
* gcc.dg/recip_sqrt_mult_4.c: Likewise.
* gcc.dg/recip_sqrt_mult_5.c: Likewise.
* g++.dg/recip_sqrt_mult_1.C: Likewise.
* g++.dg/recip_sqrt_mult_2.C: Likewise.
From-SVN: r264126
2018-09-05 Richard Biener <rguenther@suse.de>
PR bootstrap/87134
* tree-ssa-sccvn.c (rpo_elim::eliminate_push_avail): Make sure
to zero-init the emplaced vec.
From-SVN: r264125
2018-09-05 Martin Liska <mliska@suse.cz>
PR tree-optimization/87205
* tree-switch-conversion.c (pass_lower_switch::execute):
Group cases for switch statements.
2018-09-05 Martin Liska <mliska@suse.cz>
PR tree-optimization/87205
* gcc.dg/tree-ssa/pr87205-2.c: New test.
* gcc.dg/tree-ssa/pr87205.c: New test.
From-SVN: r264124
https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01966.html
PR c++/87137
* stor-layout.c (place_field): Scan forwards to check last
bitfield when ms_bitfield_placement is in effect.
gcc/testsuite/
* g++.dg/abi/pr87137.C: New.
From-SVN: r264119
This is a rewrite of the tag collision avoidance patch that Kugan had
written as a machine reorg pass back in February.
The falkor hardware prefetching system uses a combination of the
source, destination and offset to decide which prefetcher unit to
train with the load. This is great when loads in a loop are
sequential but sub-optimal if there are unrelated loads in a loop that
tag to the same prefetcher unit.
This pass attempts to rename the destination register of such colliding
loads using routines available in regrename.c so that their tags do
not collide. This shows some performance gains with mcf and xalancbmk
(~5% each) and will be tweaked further. The pass is placed near the
fag end of the pass list so that subsequent passes don't inadvertently
end up undoing the renames.
2018-07-02 Siddhesh Poyarekar <siddhesh@sourceware.org>
Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
* config/aarch64/falkor-tag-collision-avoidance.c: New file.
* config.gcc (extra_objs): Build it.
* config/aarch64/t-aarch64 (falkor-tag-collision-avoidance.o):
Likewise.
* config/aarch64/aarch64-passes.def
(pass_tag_collision_avoidance): New pass.
* config/aarch64/aarch64.c (qdf24xx_tunings): Add
AARCH64_EXTRA_TUNE_RENAME_LOAD_REGS to tuning_flags.
(aarch64_classify_address): Remove static qualifier.
(aarch64_address_info, aarch64_address_type): Move to...
* config/aarch64/aarch64-protos.h: ... here.
(make_pass_tag_collision_avoidance): New function.
* config/aarch64/aarch64-tuning-flags.def (rename_load_regs):
New tuning flag.
Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
From-SVN: r264115
2018-09-05 Martin Liska <mliska@suse.cz>
* doc/gcov.texi: Update documentation of human
readable mode.
* gcov.c (format_count): Print one decimal place, which gives
finer granularity in situations like '1G' vs. '1.4G'.
2018-09-05 Martin Liska <mliska@suse.cz>
* g++.dg/gcov/loop.C: Update test to support new format.
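The effect of printing one decimal place can be sketched as follows. This
is an illustrative formatter only, not the code in gcov.c: human_count is a
hypothetical name, and the unit letters and the 1000-based scaling are
assumptions of the sketch. Values that round up to 1000.0 of a unit are not
specially handled here.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Format a count with a unit suffix and one decimal place,
// e.g. 1400000000 -> "1.4G".  Counts below 1000 are printed unchanged.
std::string human_count(std::uint64_t count)
{
  if (count < 1000)
    return std::to_string(count);

  static const char units[] = {'k', 'M', 'G', 'T', 'P', 'E'};
  double value = static_cast<double>(count) / 1000.0;
  std::size_t i = 0;
  // Keep scaling down until the value fits below 1000 of the current unit.
  while (value >= 1000.0 && i + 1 < sizeof units)
    {
      value /= 1000.0;
      ++i;
    }

  char buffer[32];
  std::snprintf(buffer, sizeof buffer, "%.1f%c", value, units[i]);
  return buffer;
}
```

Without the decimal place both 1000000000 and 1400000000 would print as
'1G'; with it they print as '1.0G' and '1.4G'.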
From-SVN: r264112
2018-09-05 Martin Liska <mliska@suse.cz>
PR target/87164
* config/rs6000/rs6000.opt: Mark the option as Deprecated.
* optc-gen.awk: Allow 'Var' for Deprecated options in order
to generate a MASK value.
From-SVN: r264111
r251028
commit cd557ff63f388ad27c376d0a225e74d3594a6f9d
Author: hjl <hjl@138bc75d-0d04-0410-961f-82ee72b054a4>
Date: Thu Aug 10 15:29:05 2017 +0000
i386: Don't use frame pointer without stack access
When there is no stack access, there is no need to use frame pointer
even if -fno-omit-frame-pointer is used and caller's frame pointer is
unchanged.
frame pointer may not be available even if -fno-omit-frame-pointer is
used. When this happened, arg pointer may be eliminated by hard frame
pointer. Since hard frame pointer is encoded with DW_OP_fbreg which
uses the DW_AT_frame_base attribute, not hard frame pointer directly,
we should allow hard frame pointer when generating DWARF info even if
frame pointer isn't used.
gcc/
PR debug/86593
* dwarf2out.c (based_loc_descr): Allow hard frame pointer even
if frame pointer isn't used.
(compute_frame_pointer_to_fb_displacement): Likewise.
gcc/testsuite/
PR debug/86593
* g++.dg/pr86593.C: New test.
From-SVN: r264096
NAND is ~(a1 & a2), but xtensa_expand_atomic does ~a1 & a2.
That fixes libatomic tests atomic-op-{1,2}.
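The difference between the two orderings is easy to see on plain integers.
The following sketch is not the xtensa expander code; nand_expected and
nand_misordered are hypothetical names used only to show the arithmetic.

```cpp
#include <cstdint>

// NAND as required by the atomic builtins: AND first, then invert.
std::uint32_t nand_expected(std::uint32_t a1, std::uint32_t a2)
{
  return ~(a1 & a2);
}

// The misordered form: invert a1 first, then AND with a2.
std::uint32_t nand_misordered(std::uint32_t a1, std::uint32_t a2)
{
  return ~a1 & a2;
}
```

For a1 = 0xF0 and a2 = 0x3C the first form gives 0xFFFFFFCF and the second
gives 0x0C.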
gcc/
2018-09-04 Max Filippov <jcmvbkbc@gmail.com>
* config/xtensa/xtensa.c (xtensa_expand_atomic): Reorder AND and
XOR operations in NAND case.
From-SVN: r264087
* wide-int-range.cc (wide_int_range_convert): New.
* wide-int-range.h (wide_int_range_convert): New.
* tree-vrp.c (extract_range_from_unary_expr): Abstract wide int
code into wide_int_range_convert.
(extract_range_into_wide_ints): Do not munge anti range constants
into the entire domain. Just return the range back.
From-SVN: r264085
2018-09-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/87211
* tree-ssa-sccvn.c (visit_phi): When value-numbering to a
backedge value we're supposed to treat as VARYING also number
the PHI to VARYING in case it got a different value-number already.
* gcc.dg/torture/pr87211.c: New testcase.
From-SVN: r264079
2018-09-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/87176
* tree-ssa-sccvn.c (visit_phi): Remove redundant allsame
variable. When value-numbering a virtual PHI node make sure
to not value-number to the backedge value.
* gcc.dg/torture/pr87176.c: New testcase.
* gcc.dg/torture/ssa-fre-1.c: Likewise.
From-SVN: r264077
2018-09-03 Jerry DeLisle <jvdelisle@gcc.gnu.org>
* simplify.c (gfc_simplify_modulo): Re-arrange code to test whether
'P' is zero and issue an error if it is.
* gfortran.dg/modulo_check: New test.
From-SVN: r264070
* sort.cc (struct sort_ctx): New field 'nlim'. Use it...
(mergesort): ... here as maximum count for using netsort.
(gcc_qsort): Set nlim to 3 if stable sort is requested.
(gcc_stablesort): New.
* system.h (gcc_stablesort): Declare.
From-SVN: r264066
Our md files refer to {l,st}xsd%U<n>x, but no {l,st}xsdux insns exist.
This patch removes the update forms. All these use constraint "Z"
which does not allow update form, so there is no practical difference.
* config/rs6000/rs6000.md (*mov<mode>_hardfloat32): Remove %U from the
lxsdx and stxsdx alternatives.
(*mov<mode>_hardfloat64): Ditto.
* config/rs6000/vsx.md (*vsx_extract_<mode>_store): Ditto.
From-SVN: r264064
Split the long double testing into a separate file, so that we can XFAIL
targets where the long double precision doesn't meet the expected
tolerances. The float and double tests are still expected to PASS for
all targets.
PR libstdc++/78179
* testsuite/26_numerics/headers/cmath/hypot-long-double.cc: New test
that runs the long double part of hypot.cc.
* testsuite/26_numerics/headers/cmath/hypot.cc: Disable long double
tests unless TEST_HYPOT_LONG_DOUBLE is defined.
From-SVN: r264063
The pointer argument to allocator_traits::construct and
allocator_traits::destroy should be a raw pointer, not the allocator's
pointer type. _Temporary_value::_M_ptr was returning the wrong type.
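A minimal sketch of the calling convention, assuming nothing about the
internals of std::vector: construct_copy_destroy is a hypothetical helper
that converts the allocator's pointer to a raw pointer with
std::addressof (*p) before calling construct and destroy. It is not
exception safe; a throwing constructor would leak the storage.

```cpp
#include <memory>
#include <string>
#include <utility>

// Allocate one object, construct it through allocator_traits using a raw
// pointer, copy it out, then destroy and deallocate.
template <typename Alloc, typename... Args>
typename std::allocator_traits<Alloc>::value_type
construct_copy_destroy(Alloc &alloc, Args &&...args)
{
  using Traits = std::allocator_traits<Alloc>;
  using Value = typename Traits::value_type;

  // allocate returns Traits::pointer, which may be a fancy pointer.
  typename Traits::pointer p = Traits::allocate(alloc, 1);

  // construct and destroy take a raw pointer to the object.
  Value *raw = std::addressof(*p);
  Traits::construct(alloc, raw, std::forward<Args>(args)...);
  Value copy = *raw;
  Traits::destroy(alloc, raw);

  // deallocate takes the allocator's own pointer type again.
  Traits::deallocate(alloc, p, 1);
  return copy;
}
```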
* include/bits/stl_vector.h (vector::_Temporary_value::_M_ptr):
Return raw pointer not allocator's pointer type.
(vector::_Temporary_value::_M_val): Use _M_ptr.
From-SVN: r264061
Since C++11 range insertion and construction of maps and sets from a
pair of iterators only requires that the iterator's value_type is
convertible to the container's value_type (previously it had to be the
same).
This fixes the implementation to meet that relaxed requirement, by
defining a pair of overloads that either insert or emplace, depending on
the iterator's value_type. Instead of adding yet another overload of
_M_insert_unique and _M_insert_equal, the overloads taking iterators are
renamed to _M_insert_range_unique and _M_insert_range_equal.
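As an example of the relaxed requirement (a sketch, not one of the new
tests), a map can be filled from a range whose value_type is only
convertible to the map's value_type. build_map_from_convertible_range is a
hypothetical name.

```cpp
#include <map>
#include <utility>
#include <vector>

// The range holds std::pair<short, short>, which is convertible to, but
// not the same as, std::pair<const int, long>.
std::map<int, long> build_map_from_convertible_range()
{
  const std::vector<std::pair<short, short>> input = {{1, 10}, {2, 20}, {1, 30}};

  // Range construction: the duplicate key 1 keeps its first value.
  std::map<int, long> m(input.begin(), input.end());

  // Range insertion goes through the same conversion.
  const std::vector<std::pair<short, short>> more = {{3, 30}};
  m.insert(more.begin(), more.end());
  return m;
}
```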
PR libstdc++/87194
* include/bits/stl_map.h
(map::map(initializer_list<value_type>, const Compare&, const Alloc&))
(map::map(initializer_list<value_type>, const Alloc&))
(map::map(InputIterator, InputIterator, const Alloc&))
(map::map(InputIterator, InputIterator))
(map::map(InputIterator, InputIterator, const Compare&, const Alloc&))
(map::insert(InputIterator, InputIterator)):
Call _M_insert_range_unique instead of _M_insert_unique.
* include/bits/stl_multimap.h
(multimap::multimap(initializer_list<value_type>, const C&, const A&))
(multimap::multimap(initializer_list<value_type>, const A&))
(multimap::multimap(InputIterator, InputIterator, const A&))
(multimap::multimap(InputIterator, InputIterator))
(multimap::multimap(InputIterator, InputIterator, const C&, const A&))
(multimap::insert(InputIterator, InputIterator)): Call
_M_insert_range_equal instead of _M_insert_equal.
* include/bits/stl_multiset.h
(multiset::multiset(InputIterator, InputIterator))
(multiset::multiset(InputIterator, InputIterator, const C&, const A&))
(multiset::multiset(initializer_list<value_type>, const C&, const A&))
(multiset::multiset(initializer_list<value_type>, const A&))
(multiset::multiset(InputIterator, InputIterator, const A&))
(multiset::insert(InputIterator, InputIterator)): Call
_M_insert_range_equal instead of _M_insert_equal.
* include/bits/stl_set.h
(set::set(InputIterator, InputIterator))
(set::set(InputIterator, InputIterator, const Compare&, const Alloc&))
(set::set(initializer_list<value_type>, const Compare&, const Alloc&))
(set::set(initializer_list<value_type>, const Alloc&))
(set::set(InputIterator, InputIterator, const Alloc&))
(set::insert(InputIterator, InputIterator)):
Call _M_insert_range_unique instead of _M_insert_unique.
* include/bits/stl_tree.h
[__cplusplus >= 201103L] (_Rb_tree::__same_value_type): New alias
template for SFINAE constraints.
[__cplusplus >= 201103L] (_Rb_tree::_M_insert_range_unique): Pair of
constrained overloads that either insert or emplace, depending on
iterator's value_type.
[__cplusplus >= 201103L] (_Rb_tree::_M_insert_range_equal): Likewise.
[__cplusplus < 201103L] (_Rb_tree::_M_insert_range_unique)
(_Rb_tree::_M_insert_range_equal): New functions replacing range
versions of _M_insert_unique and _M_insert_equal.
(_Rb_tree::_M_insert_unique(_InputIterator, _InputIterator))
(_Rb_tree::_M_insert_equal(_InputIterator, _InputIterator)): Remove.
* testsuite/23_containers/map/modifiers/insert/87194.cc: New test.
* testsuite/23_containers/multimap/modifiers/insert/87194.cc: New test.
* testsuite/23_containers/multiset/modifiers/insert/87194.cc: New test.
* testsuite/23_containers/set/modifiers/insert/87194.cc: New test.
From-SVN: r264060
C++14 simplified the specification of the generic insert function
templates to be equivalent to calling emplace (or emplace_hint).
Defining them in terms of emplace takes care of the problems described
in PR 78595, ensuring a single conversion to value_type is done at the
right time.
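For illustration, the generic insert overload and emplace can be compared
directly. This is a sketch rather than one of the new tests, and
insert_generic_pair and emplace_generic_pair are hypothetical helpers. The
argument is a std::pair<const char *, int>, which is not the map's
value_type, so the generic overload is the one selected.

```cpp
#include <map>
#include <string>
#include <utility>

// Uses the generic insert(P&&) overload; returns whether insertion happened.
bool insert_generic_pair(std::map<std::string, int> &m,
                         const char *key, int value)
{
  return m.insert(std::make_pair(key, value)).second;
}

// The equivalent emplace call; returns whether insertion happened.
bool emplace_generic_pair(std::map<std::string, int> &m,
                          const char *key, int value)
{
  return m.emplace(std::make_pair(key, value)).second;
}
```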
PR libstdc++/78595
* include/bits/stl_map.h (map::insert(_Pair&&))
(map::insert(const_iterator, _Pair&&)): Do emplace instead of insert.
* include/bits/stl_multimap.h (multimap::insert(_Pair&&))
(multimap::insert(const_iterator, _Pair&&)): Likewise.
* include/bits/unordered_map.h (unordered_map::insert(_Pair&&))
(unordered_map::insert(const_iterator, _Pair&&))
(unordered_multimap::insert(_Pair&&))
(unordered_multimap::insert(const_iterator, _Pair&&)): Likewise.
* testsuite/23_containers/map/modifiers/insert/78595.cc: New test.
* testsuite/23_containers/multimap/modifiers/insert/78595.cc: New test.
* testsuite/23_containers/unordered_map/modifiers/78595.cc: New test.
* testsuite/23_containers/unordered_multimap/modifiers/78595.cc: New
test.
From-SVN: r264059
2018-09-03 Richard Biener <rguenther@suse.de>
PR tree-optimization/87197
* tree-ssa-sccvn.c (vn_nary_build_or_lookup_1): Mark the new def
visited. CSE the VN_INFO hashtable lookup.
* gcc.dg/torture/pr87197.c: New testcase.
PR tree-optimization/87169
* tree-ssa-sccvn.c (do_rpo_vn): When marking loops for not
iterating make sure there's no extra backedges from irreducible
regions feeding the header. Mark the destination block
executable.
* gcc.dg/torture/pr87169.c: New testcase.
From-SVN: r264057
The rationale for the fixinclude ioctl macro wrapper is, as far as I can
tell (https://gcc.gnu.org/ml/gcc-patches/2012-09/msg01619.html):
Fix 2: Add hack for ioctl() on VxWorks.
ioctl() is supposed to be variadic, but VxWorks only has a three
argument version with the third argument of type int. This messes up
when the third argument is not implicitly convertible to int. This
adds a macro which wraps around ioctl() and explicitly casts the third
argument to an int. This way, the most common use case of ioctl (with
a const char * for the third argument) will compile in C++, where
pointers must be explicitly casted to int.
However, we have existing C++ code that calls the ioctl function via
::ioctl(foo, bar, baz)
and obviously this breaks when it gets expanded to
::(ioctl)(foo, bar, (int)(baz))
Since the GNU C preprocessor already prevents recursive expansion of
function-like macros, the parentheses around ioctl are unnecessary.
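The point can be demonstrated with a stand-in function, so as not to depend
on the VxWorks headers. In this sketch dev_ioctl, Cmd, call_unqualified and
call_qualified are hypothetical names; the macro has the same shape as the
wrapper but without parentheses around the function name.

```cpp
// Hypothetical three-argument function standing in for VxWorks ioctl ().
int dev_ioctl(int fd, int request, int arg)
{
  return fd + request + arg;
}

// Wrapper macro that casts the third argument to int.  The name inside the
// replacement list is not expanded again by the preprocessor.
#define dev_ioctl(fd, request, arg) dev_ioctl(fd, request, (int)(arg))

// A scoped enum does not convert to int implicitly, so the cast matters.
enum class Cmd : int { reset = 3 };

int call_unqualified(Cmd c)
{
  return dev_ioctl(1, 2, c);
}

// Expands to ::dev_ioctl(1, 2, (int)(c)), which is still valid C++.
int call_qualified(Cmd c)
{
  return ::dev_ioctl(1, 2, c);
}
```

Had the replacement list used (dev_ioctl), the qualified call would have
expanded to ::(dev_ioctl)(...), which does not compile.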
Incidentally, there is also a macro sioIoctl() in the vxworks sioLib.h
header that expands to
((pSioChan)->pDrvFuncs->ioctl (pSioChan, cmd, arg))
which also breaks when that gets further expanded to
((pSioChan)->pDrvFuncs->(ioctl) (pSioChan, cmd, (int)(arg)))
This patch partly fixes that issue as well, but the third argument to
the pDrvFuncs->ioctl method should be void*, so the cast to (int) is
slightly annoying. Internally, we've simply patched the sioIoctl macro:
(((pSioChan)->pDrvFuncs->ioctl) (pSioChan, cmd, arg))
From-SVN: r264056
2018-09-03 Martin Liska <mliska@suse.cz>
PR driver/83193
* common/common-target.def: Add TARGET_GET_VALID_OPTION_VALUES.
* common/common-targhooks.c (default_get_valid_option_values):
New function.
* common/common-targhooks.h (default_get_valid_option_values):
Likewise.
* common/config/i386/i386-common.c: Move processor_target_table
from i386.c.
(ix86_get_valid_option_values): New function.
(TARGET_GET_VALID_OPTION_VALUES): New macro.
* config/i386/i386.c (struct ptt): Move to i386-common.c.
(PTA_*): Move all defined masks into i386-common.c.
(ix86_function_specific_restore): Use new processor_cost_table.
* config/i386/i386.h (struct ptt): Moved from i386.c.
(struct pta): Likewise.
* doc/tm.texi: Document new TARGET_GET_VALID_OPTION_VALUES.
* doc/tm.texi.in: Likewise.
* opt-suggestions.c (option_proposer::suggest_option):
Pass prefix to build_option_suggestions.
(option_proposer::get_completions): Likewise.
(option_proposer::build_option_suggestions): Use the new target
hook.
* opts.c (struct option_help_tuple): New struct.
(print_filtered_help): Use the new target hook.
2018-09-03 Martin Liska <mliska@suse.cz>
PR driver/83193
* gcc.dg/completion-4.c: New test.
From-SVN: r264052
/cp
2018-09-03 Paolo Carlini <paolo.carlini@oracle.com>
PR c++/84980
* constraint.cc (finish_shorthand_constraint): Early return if the
constraint is erroneous.
/testsuite
2018-09-03 Paolo Carlini <paolo.carlini@oracle.com>
PR c++/84980
* g++.dg/concepts/pr84980.C: New.
From-SVN: r264051
2018-09-03 Martin Liska <mliska@suse.cz>
PR middle-end/59521
* predict.c (set_even_probabilities): Add likely_edges
argument and handle cases where we have precisely one
likely edge.
(combine_predictions_for_bb): Catch also likely_edges.
(tree_predict_by_opcode): Handle gswitch statements.
* tree-cfg.h (find_case_label_for_value): New declaration.
(find_taken_edge_switch_expr): Likewise.
* tree-switch-conversion.c (switch_decision_tree::balance_case_nodes):
Find pivot in decision tree based on probability, not by number of
nodes.
2018-09-03 Martin Liska <mliska@suse.cz>
PR middle-end/59521
* c-c++-common/pr59521-1.c: New test.
* c-c++-common/pr59521-2.c: New test.
* gcc.dg/tree-prof/pr59521-3.c: New test.
From-SVN: r264050