This patch will add the missed pattern described in bug 103514 [1] to the match.pd. [1] includes proof of correctness for the patch too.
1) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103514
gcc/
PR tree-optimization/103514
* match.pd (a & b) ^ (a == b) -> !(a | b): New optimization.
(a & b) == (a ^ b) -> !(a | b): New optimization.
gcc/testsuite
* gcc.dg/tree-ssa/pr103514.c: Testcase for this optimization.
Here we're emitting a -Wignored-qualifiers warning for an intermediate
compiler-generated cast of nullptr to 'method-type* const' as part of
value initialization of a const pmf. This patch suppresses the warning
by instead casting to the corresponding unqualified type.
PR c++/92752
gcc/cp/ChangeLog:
* typeck.cc (build_ptrmemfunc): Cast a nullptr constant to the
unqualified pointer type not the qualified one.
gcc/testsuite/ChangeLog:
* g++.dg/warn/Wignored-qualifiers2.C: New test.
Co-authored-by: Jason Merrill <jason@redhat.com>
A recent patch added tests for OPTION_GLIBC that is defined in
linux.h and linux64.h. This broke bootstrap for powerpc Darwin.
Fixed by adding a definition to 0 for OPTION_GLIBC.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ChangeLog:
* config/rs6000/darwin.h (OPTION_GLIBC): Define to 0.
This patch implements an optimization for the following C++ code:
int f(int x) {
return 1 / x;
}
int f(unsigned int x) {
return 1 / x;
}
Before this patch, x86-64 gcc -std=c++20 -O3 produces the following assembly:
f(int):
xor edx, edx
mov eax, 1
idiv edi
ret
f(unsigned int):
xor edx, edx
mov eax, 1
div edi
ret
In comparison, clang++ -std=c++20 -O3 produces the following assembly:
f(int):
lea ecx, [rdi + 1]
xor eax, eax
cmp ecx, 3
cmovb eax, edi
ret
f(unsigned int):
xor eax, eax
cmp edi, 1
sete al
ret
Clang's output is more efficient as it avoids expensive div operations.
With this patch, GCC now produces the following assembly:
f(int):
lea eax, [rdi + 1]
cmp eax, 2
mov eax, 0
cmovbe eax, edi
ret
f(unsigned int):
xor eax, eax
cmp edi, 1
sete al
ret
which is virtually identical to Clang's assembly output. Any slight differences
in the output for f(int) is possibly related to a different missed optimization.
v2: https://gcc.gnu.org/pipermail/gcc-patches/2022-January/587751.html
Changes from v2:
1. Refactor from using a switch statement to using the built-in
if-else statement.
v1: https://gcc.gnu.org/pipermail/gcc-patches/2022-January/587634.html
Changes from v1:
1. Refactor common if conditions.
2. Use build_[minus_]one_cst (type) to get -1/1 of the correct type.
3. Match only for TRUNC_DIV_EXPR and TYPE_PRECISION (type) > 1.
gcc/ChangeLog:
PR tree-optimization/95424
* match.pd: Simplify 1 / X where X is an integer.
As mentioned in the PRthe following testcase fails, because the last
stmt of a bb with -g is a debug stmt and get_status_for_store_merging
uses gimple_seq_last_stmt (bb_seq (bb)) when testing if it is valid
for store merging. The debug stmt isn't valid, while a stmt at that
position with -g0 is valid and so the divergence.
As we walk the whole bb already, this patch just remembers the last
non-debug stmt, so that we don't need to skip backwards debug stmts at the
end of the bb to find last real stmt.
2022-01-28 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/104263
* gimple-ssa-store-merging.cc (get_status_for_store_merging): For
cfun->can_throw_non_call_exceptions && cfun->eh test whether
last non-debug stmt in the bb is store_valid_for_store_merging_p
rather than last stmt.
* gcc.dg/pr104263.c: New test.
Revert partially what I did in g:76ef38e3178a11e76a66b4d4c0e10e85fe186a45.
gcc/ChangeLog:
* diagnostic.cc (diagnostic_action_after_output): Remove extra
newline.
gcc/ChangeLog:
* config/rs6000/host-darwin.cc (segv_crash_handler):
Do not use leading capital letter.
(segv_handler): Likewise.
* ipa-sra.cc (verify_splitting_accesses): Likewise.
* varasm.cc (get_section): Likewise.
gcc/d/ChangeLog:
* decl.cc (d_finish_decl): Do not use leading capital letter.
When deducing the type of a variable template (or templated static data
member) with a constrained auto type, we might need its template
arguments for satisfaction since the constraint could depend on them.
PR c++/103341
gcc/cp/ChangeLog:
* decl.cc (cp_finish_decl): Pass the template arguments of a
variable template specialization or a templated static data
member to do_auto_deduction when the auto is constrained.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-class4.C: New test.
* g++.dg/cpp2a/concepts-var-templ2.C: New test.
The following fixes the vector type registered for external defs
in call arguments when vectorizing with SLP. We assumed uniform
vectype_in types here but with calls like .COND_MUL we also have
mask arguments which, when invariant or external, need to have
a proper mask vector type.
2022-01-28 Richard Biener <rguenther@suse.de>
PR tree-optimization/104267
* tree-vect-stmts.cc (vectorizable_call): Properly use the
per-argument determined vector type for externals and
invariants.
This removes a premature optimization from
gimple_purge_dead_abnormal_call_edges which, after eliding the
last setjmp (or computed goto) statement from a function and
thus clearing cfun->calls_setjmp, leaves us with the abnormal
edges from other calls that are elided for example via inlining
or DCE. That's a CFG / IL combination that should be impossible
(not addressing the fact that with cfun->calls_setjmp and
cfun->has_nonlocal_label cleared we should not have any abnormal
edge at all).
For the testcase in the PR this means that IPA inlining will
remove the abormal edges from the block after inlining the call
the edge was coming from.
2022-01-28 Richard Biener <rguenther@suse.de>
PR tree-optimization/104263
* tree-cfg.cc (gimple_purge_dead_abnormal_call_edges):
Purge edges also when !cfun->has_nonlocal_label
and !cfun->calls_setjmp.
* gcc.dg/tree-ssa/inline-13.c: New testcase.
Document new `auipc' and `bitmanip' `type' attributes added respectively
with commit 88108b27dd ("RISC-V: Add sifive-7 pipeline description.")
and commit 283b1707f2 ("RISC-V: Implement instruction patterns for ZBA
extension.") but not listed so far.
gcc/
* config/riscv/riscv.md: Document `auipc' and `bitmanip' `type'
attributes.
The testcase in the PR (not included for the testsuite because we don't
have an (easy) way to -fcompare-debug LTO, we'd need 2 compilations/linking,
one with -g and one with -g0 and -fdump-rtl-final= at the end of lto1
and compare that) has different code generation for -g vs. -g0.
The difference appears during expansion, where we have a goto_locus
that is at -O0 compared to the INSN_LOCATION of the previous and next insn
across an edge. With -g0 the locations are equal and so no nop is added.
With -g the locations aren't equal and so a nop is added holding that
location.
The reason for the different location is in the way how we stream in
locations by lto1.
We have lto_location_cache::apply_location_cache that is called with some
set of expanded locations, qsorts them, creates location_t's for those
and remembers the last expanded location.
lto_location_cache::input_location_and_block when read in expanded_location
is equal to the last expanded location just reuses the last location_t
(or adds/changes/removes LOCATION_BLOCK in it), when it is not queues
it for next apply_location_cache. Now, when streaming in -g input, we can
see extra locations that don't appear with -g0, and if we are unlucky
enough, those can be sorted last during apply_location_cache and affect
what locations are used from the single entry cache next.
In particular, second apply_location_cache with non-empty loc_cache in
the testcase has 14 locations with -g0 and 16 with -g and those 2 extra
ones sort both last (they are the same). The last one from -g0 then
appears to be input_location_and_block sourced again, for -g0 triggers
the single entry cache, while for -g it doesn't and so apply_location_cache
will create for it another location_t with the same content.
The following patch fixes it by comparing everything we care about the
location instead (well, better in addition) to a simple location_t ==
location_t check. I think we don't care about the sysp flag for debug
info...
2022-01-28 Jakub Jelinek <jakub@redhat.com>
PR lto/104237
* cfgrtl.cc (loc_equal): New function.
(unique_locus_on_edge_between_p): Use it.
The following makes dumping of a function as graph work as intended
when specifying a function other than cfun. Unfortunately the loop
and the dominance APIs are not set up to work for other functions
than cfun so you won't get any fancy loop dumps but the non-loop
dump works up to reaching mark_dfs_back_edges which I trivially made
function aware and adjusted current callers with a wrapper.
With all this, doing dot-fn id->src_cfun from the debugger when
debugging inlining works. Previously you got a strange mix of
the src and dest functions visualized ;)
2022-01-28 Richard Biener <rguenther@suse.de>
* cfganal.h (mark_dfs_back_edges): Provide API with struct
function argument.
* cfganal.cc (mark_dfs_back_edges): Take a struct function
to work on, add a wrapper passing cfun.
* graph.cc (draw_cfg_nodes_no_loops): Replace stray cfun
uses with fun which is already passed.
(draw_cfg_edges): Likewise.
(draw_cfg_nodes_for_loop): Do not use draw_cfg_nodes_for_loop
for fun != cfun.
This is a regression present on mainline and 11 branch: the transformation
applied during expansion by Narrow_Large_Operation would incorrectly perform
name resolution for the operator again.
gcc/ada/
PR ada/104258
* exp_ch4.adb (Narrow_Large_Operation): Also copy the entity, if
any, when rewriting the operator node.
gcc/testsuite/
* gnat.dg/generic_comp.adb: New test.
The GCC 8 lambda overhaul fixed most uses of lambdas in pack expansions, but
local enums and classes within such lambdas that depend on parameter packs
are still broken. For now, give a sorry instead of an ICE or incorrect
error.
PR c++/100198
PR c++/100030
PR c++/100282
gcc/cp/ChangeLog:
* parser.cc (cp_parser_enumerator_definition): Sorry on parameter
pack in lambda.
(cp_parser_class_head): And in class attributes.
* pt.cc (check_for_bare_parameter_packs): Sorry instead of error
in lambda.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/lambda/lambda-variadic13.C: Accept the sorry
as well as the correct error.
* g++.dg/cpp0x/lambda/lambda-variadic14.C: Likewise.
* g++.dg/cpp0x/lambda/lambda-variadic14a.C: New test.
* g++.dg/cpp0x/lambda/lambda-variadic15.C: New test.
* g++.dg/cpp0x/lambda/lambda-variadic16.C: New test.
The compiler warns about the loop in deque::_M_range_initialize because
it doesn't know that the number of nodes has already been correctly
sized to match the size of the input. Use __builtin_unreachable to tell
it that the loop will never be entered if the number of elements is
smaller than a single node.
libstdc++-v3/ChangeLog:
PR libstdc++/100516
* include/bits/deque.tcc (_M_range_initialize<ForwardIterator>):
Add __builtin_unreachable to loop.
* testsuite/23_containers/deque/100516.cc: New test.
When reviewing the output of -fanalyzer on PR analyzer/104224 I noticed
that despite very verbose paths, the diagnostic paths for
-Wanalyzer-use-of-uninitialized-value
don't show where the uninitialized memory is allocated.
This patch adapts and simplifies material from
"[PATCH 3/6] analyzer: implement infoleak detection"
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584377.html
in order to add region creation events for the pertinent region (whether
on the stack or heap).
For example, this patch extends:
malloc-1.c: In function 'test_40':
malloc-1.c:461:5: warning: use of uninitialized value '*p' [CWE-457] [-Wanalyzer-use-of-uninitialized-value]
461 | i = *p;
| ~~^~~~
'test_40': event 1
|
| 461 | i = *p;
| | ~~^~~~
| | |
| | (1) use of uninitialized value '*p' here
|
to:
malloc-1.c: In function 'test_40':
malloc-1.c:461:5: warning: use of uninitialized value '*p' [CWE-457] [-Wanalyzer-use-of-uninitialized-value]
461 | i = *p;
| ~~^~~~
'test_40': events 1-2
|
| 460 | int *p = (int*)malloc(sizeof(int*));
| | ^~~~~~~~~~~~~~~~~~~~
| | |
| | (1) region created on heap here
| 461 | i = *p;
| | ~~~~~~
| | |
| | (2) use of uninitialized value '*p' here
|
and this helps readability of the resulting warnings, especially in
more complicated cases.
gcc/analyzer/ChangeLog:
* checker-path.cc (event_kind_to_string): Handle
EK_REGION_CREATION.
(region_creation_event::region_creation_event): New.
(region_creation_event::get_desc): New.
(checker_path::add_region_creation_event): New.
* checker-path.h (enum event_kind): Add EK_REGION_CREATION.
(class region_creation_event): New subclass.
(checker_path::add_region_creation_event): New decl.
* diagnostic-manager.cc
(diagnostic_manager::emit_saved_diagnostic): Pass NULL for new
param to add_events_for_eedge when handling trailing eedge.
(diagnostic_manager::build_emission_path): Create an interesting_t
instance, allow the pending diagnostic to populate it, and pass it
to the calls to add_events_for_eedge.
(diagnostic_manager::add_events_for_eedge): Add "interest" param.
Use it to add region_creation_events for on-stack regions created
within at function entry, and when pertinent dynamically-sized
regions are created.
(diagnostic_manager::prune_for_sm_diagnostic): Add case for
EK_REGION_CREATION.
* diagnostic-manager.h (diagnostic_manager::add_events_for_eedge):
Add "interest" param.
* pending-diagnostic.cc: Include "selftest.h", "tristate.h",
"analyzer/call-string.h", "analyzer/program-point.h",
"analyzer/store.h", and "analyzer/region-model.h".
(interesting_t::add_region_creation): New.
(interesting_t::dump_to_pp): New.
* pending-diagnostic.h (struct interesting_t): New.
(pending_diagnostic::mark_interesting_stuff): New vfunc.
* region-model.cc
(poisoned_value_diagnostic::poisoned_value_diagnostic): Add
(poisoned_value_diagnostic::operator==): Compare m_pkind and
m_src_region fields.
(poisoned_value_diagnostic::mark_interesting_stuff): New.
(poisoned_value_diagnostic::m_src_region): New.
(region_model::check_for_poison): Call
get_region_for_poisoned_expr for uninit values and pass the resul
to the diagnostic.
(region_model::get_region_for_poisoned_expr): New.
(region_model::deref_rvalue): Pass NULL for
poisoned_value_diagnostic's src_region.
* region-model.h (region_model::get_region_for_poisoned_expr): New
decl.
* region.h (frame_region::get_fndecl): New.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/data-model-1.c: Add dg-message directives for
expected region creation events.
* gcc.dg/analyzer/malloc-1.c: Likewise.
* gcc.dg/analyzer/memset-CVE-2017-18549-1.c: Likewise.
* gcc.dg/analyzer/pr101547.c: Likewise.
* gcc.dg/analyzer/pr101875.c: Likewise.
* gcc.dg/analyzer/pr101962.c: Likewise.
* gcc.dg/analyzer/pr104224.c: Likewise.
* gcc.dg/analyzer/pr94047.c: Likewise.
* gcc.dg/analyzer/symbolic-1.c: Likewise.
* gcc.dg/analyzer/uninit-1.c: Likewise.
* gcc.dg/analyzer/uninit-4.c: Likewise.
* gcc.dg/analyzer/uninit-alloca.c: New test.
* gcc.dg/analyzer/uninit-pr94713.c: Add dg-message directive for
expected region creation event.
* gcc.dg/analyzer/uninit-pr94714.c: Likewise.
* gcc.dg/analyzer/zlib-3.c: Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
When (bound - i) or n is the most negative value of its type, the
negative of the value will overflow. Instead of abs(n) >= abs(bound - i)
use n >= (bound - i) when positive and n <= (bound - i) when negative.
The function has a precondition that they must have the same sign, so
this works correctly. The precondition check can be moved into the else
branch, and simplified.
The standard requires calling ranges::advance(i, bound) even if i==bound
is already true, which is technically observable, but that's pointless.
We can just return n in that case. Similarly, for i!=bound but n==0 we
are supposed to call ranges::advance(i, n), but that's pointless. An LWG
issue to allow omitting the pointless calls is expected to be filed.
libstdc++-v3/ChangeLog:
* include/bits/ranges_base.h (ranges::advance): Avoid signed
overflow. Do nothing if already equal to desired result.
* testsuite/24_iterators/range_operations/advance_overflow.cc:
New test.
A flaw in my patch for PR51344 was that cplus_decl_attributes calls
decl_attributes after save_template_attributes, which messes up the ordering
that save_template_attributes set up. Fixed by splitting
save_template_attributes around the call to decl_attributes.
PR c++/104245
PR c++/51344
gcc/cp/ChangeLog:
* decl2.cc (save_template_attributes): Take late attrs as parm.
(cplus_decl_attributes): Call it after decl_attributes,
splice_template_attributes before.
gcc/testsuite/ChangeLog:
* g++.dg/lto/alignas1_0.C: New test.
As stated in signaling_?.f90 tests, x86-32 ABI is not suitable to
correctly handle signaling NaNs. However, XFAIL is not the correct choice
to disable these tests, since various optimizations can generate code
that avoids moves from registers to memory (and back), resulting
in the code that executes correctly, producing spurious XFAIL.
These tests should be disabled on x86-32 using { ! ia32 } dg-directive
which rules out x32 ilp32 ABI, where tests execute without problems.
Please note that check_effective_target_ia32 test tries to compile code that
uses __i386__ target-dependent preprocessor definition, so it is guaranteed
to fail on all non-ia32 targets.
2022-01-27 Uroš Bizjak <ubizjak@gmail.com>
gcc/testsuite/ChangeLog:
* gfortran.dg/ieee/signaling_1.f90 (dg-do):
Run only on non-ia32 targets.
* gfortran.dg/ieee/signaling_2.f90 (dg-do): Ditto.
* gfortran.dg/ieee/signaling_3.f90 (dg-do): Ditto.
gcc/fortran/ChangeLog:
PR fortran/104128
* expr.cc (gfc_copy_expr): Convert internal representation of
string to wide char in value only for default character kind.
* target-memory.cc (interpret_array): Pass flag for conversion of
wide chars.
(gfc_target_interpret_expr): Likewise.
gcc/testsuite/ChangeLog:
PR fortran/104128
* gfortran.dg/transfer_simplify_14.f90: New test.
contrib/ChangeLog:
* git-descr.sh: New file.
* git-undescr.sh: New file.
Support optional arguments --long, --short and default
to 14 characters of git hash.
* gcc-git-customization.sh: Use the created files.
Co-Authored-By: Martin Jambor <mjambor@suse.cz>
Here we're emitting a bogus error during ahead of time evaluation of a
non-dependent immediate member function call such as a.f(args) because
the defacto templated form for such a call is (a.f)(args) but we're
trying to evaluate it using the intermediate CALL_EXPR built by
build_over_call, which has the non-member form f(a, args). The defacto
member form is built in build_new_method_call, so it seems we should
handle the immediate call there instead, or perhaps make build_over_call
build the correct form in the first place.
Giiven that there are many spots other than build_new_method_call that
call build_over_call for member functions, e.g. build_op_call, this
patch takes the latter approach.
In passing, this patch makes us avoid wrapping PARM_DECL in
NON_DEPENDENT_EXPR for benefit of the third testcase below.
PR c++/99895
gcc/cp/ChangeLog:
* call.cc (build_over_call): For a non-dependent member call,
build up a CALL_EXPR using a COMPONENT_REF callee, as in
build_new_method_call.
* pt.cc (build_non_dependent_expr): Don't wrap PARM_DECL either.
* tree.cc (build_min_non_dep_op_overload): Adjust accordingly
after the build_over_call change.
gcc/ChangeLog:
* tree.cc (build_call_vec): Add const to second parameter.
* tree.h (build_call_vec): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/consteval-memfn1.C: New test.
* g++.dg/cpp2a/consteval-memfn2.C: New test.
* g++.dg/cpp2a/consteval28.C: New test.
In the nested_name_specifier branch within cp_parser_class_head, we need
to update 'type' with the result of maybe_process_partial_specialization
like we do in the template_id_p branch.
PR c++/92944
PR c++/103678
gcc/cp/ChangeLog:
* parser.cc (cp_parser_class_head): Update 'type' with the result
of maybe_process_partial_specialization in the
nested_name_specifier branch. Refactor nearby code to accomodate
that maybe_process_partial_specialization returns a _TYPE, not a
TYPE_DECL, and eliminate local variable 'class_type' in passing.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-partial-spec10.C: New test.
* g++.dg/cpp2a/concepts-partial-spec11.C: New test.
In r12-1933 I attempted to implement DR2397 aka allowing
int a[3];
auto (&r)[3] = a;
by removing the type_uses_auto check in create_array_type_for_decl.
That may have gone too far, because it also allows arrays of
CLASS_PLACEHOLDER_TEMPLATE and it looks like [dcl.type.class.deduct]
prohibits that: "...the declared type of the variable shall be cv T,
where T is the placeholder." However, in /2 it explicitly states that
"A placeholder for a deduced class type can also be used in the
type-specifier-seq in the new-type-id or type-id of a new-expression."
In this PR, it manifested by making us accept invalid
template<class T> struct A { A(T); };
auto p = new A[]{1};
[expr.new]/2 says that such a construct is treated as an invented
declaration of the form
A x[]{1};
but, I think, that ought to be ill-formed as per above. So this patch
sort of restores the create_array_type_for_decl check. I should mention
that the difference between [] and [1] is due to cp_parser_new_type_id:
if (*nelts == NULL_TREE)
/* Leave [] in the declarator. */;
and groktypename returning different types based on that.
PR c++/101988
gcc/cp/ChangeLog:
* decl.cc (create_array_type_for_decl): Reject forming an array of
placeholder for a deduced class type.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/class-deduction-new1.C: New test.
* g++.dg/cpp23/auto-array2.C: New test.
PR web/104254
gcc/ChangeLog:
* diagnostic.cc (diagnostic_initialize):
Initialize report_bug flag.
(diagnostic_action_after_output):
Explain that -freport-bug option can be used for pre-processed
file creation. Make the message shorter.
(error_recursion): Rename Internal to internal.
* diagnostic.h (struct diagnostic_context): New field.
* opts.cc (common_handle_option): Init the field here.
PR analyzer/104247
gcc/analyzer/ChangeLog:
* constraint-manager.cc (bounded_ranges_manager::log_stats):
Cast to long for format purpose.
* region-model-manager.cc (log_uniq_map): Likewise.
This patch is to fix one wrong assertion which is too aggressive.
Vectorizer can do vec_construct costing for the vector type which
only has one unit. For the failed case, the passed in vector type
is "vector(1) int", though it doesn't end up with any construction
eventually, we have to handle this kind of possibility.
gcc/ChangeLog:
PR target/103702
* config/rs6000/rs6000.cc
(rs6000_cost_data::update_target_cost_per_stmt): Fix one wrong
assertion with early return.
gcc/testsuite/ChangeLog:
PR target/103702
* gcc.target/powerpc/pr103702.c: New test.
This issue was triggered after the patch extending syntax for component access
in map clauses in commit 0ab29cf0bb.
In gimplify_scan_omp_clauses, the case for handling indirect accesses (which
creates firstprivate ptr and zero-length array section map for such decls) was
erroneously went into for non-pointer cases (here being the base struct decl),
so added the
appropriate checks there.
Added new testcase is a compile only test for the ICE. The original omptests
t-partial-struct test actually should not execute correctly, because for
map(t.s->a[:N]), map(t.s[:1]) is not implicitly mapped, thus the entire
offloaded access does not work as is (fixing that omptests test is out of
scope here).
2022-01-27 Chung-Lin Tang <cltang@codesourcery.com>
PR middle-end/103642
gcc/ChangeLog:
* gimplify.cc (gimplify_scan_omp_clauses): Do not do indir_p handling
for non-pointer or non-reference-to-pointer cases.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/pr103642.c: New test.
After the quoting changes in r12-6521-g03a1a86b5ee40d4e240, branch-protection-attr.c
fails due to expecting a different quoting type for "leaf".
This patch changes the quoting from "" to '' as that is what is used now.
Committed as obvious after a test of the testcase.
gcc/testsuite/ChangeLog:
PR target/104201
* gcc.target/aarch64/branch-protection-attr.c: Fix quoting for
the expected error message on line 5 of leaf.
As mentioned in the PR, reassoc1 miscompiles following testcase.
We have:
if (a.1_1 >= 0)
goto <bb 5>; [59.00%]
else
goto <bb 4>; [41.00%]
<bb 4> [local count: 440234144]:
_3 = -2147483647 - a.1_1;
_9 = a.1_1 != -2147479551;
_4 = _3 == 1;
_8 = _4 | _9;
if (_8 != 0)
goto <bb 5>; [34.51%]
else
goto <bb 3>; [65.49%]
and the inter-bb range test optimization treats it as:
if ((a.1_1 >= 0)
| (-2147483647 - a.1_1 == 1)
| (a.1_1 != -2147479551))
goto bb5;
else
goto bb3;
and recognizes that a.1_1 >= 0 is redundant with a.1_1 != -2147479551
and so will optimize it into:
if (0
| (-2147483647 - a.1_1 == 1)
| (a.1_1 != -2147479551))
goto bb5;
else
goto bb3;
When merging 2 comparisons, we use update_range_test which picks one
of the comparisons as the one holding the result (currently always
the RANGE one rather than all the OTHERRANGE* ones) and adjusts the
others to be true or false.
The problem with doing that is that means the
_3 = -2147483647 - a.1_1;
stmt with undefined behavior on overflow used to be conditional before
but now is unconditional. reassoc performs a no_side_effect_bb check
which among other checks punts on gimple_has_side_effects and
gimple_assign_rhs_could_trap_p stmts as well as ones that have uses of
their lhs outside of the same bb, but it doesn't punt for this potential
signed overflow case.
Now, for this testcase, it can be fixed in update_range_test by being
smarter and choosing the other comparison to modify. This is achieved
by storing into oe->id index of the bb with GIMPLE_COND the
comparison feeds into not just for the cases where the comparison is
the GIMPLE_COND itself, but in all cases, and then finding oe->id that
isn't dominated by others. If we find such, use that oe for the merge
test and if successful, swap range->idx and swap_with->idx.
So for the above case we optimize it into:
if ((a.1_1 != -2147479551)
| (-2147483647 - a.1_1 == 1)
| 0)
goto bb5;
else
goto bb3;
instead.
Unfortunately, this doesn't work in all cases,
optimize_range_tests_to_bit_test and
optimize_range_tests_cmp_bitwise optimizations use non-NULL seq
to update_range_test and they carefully choose a particular comparison
because the sequence then uses SSA_NAMEs that may be defined only in
their blocks. For that case, the patch keeps using the chosen comparison
but if the merge is successful, rewrites stmts with signed overflow behavior
into defined overflow.
For this I ran into a problem, rewrite_to_defined_overflow returns a
sequence that includes the original stmt with modified arguments, this means
it needs to be gsi_remove first. Unfortunately, gsi_remove regardless of
the remove_permanently argument creates debug temps for the lhs, which I
think is quite undesirable here. So I've added an argument (default to
false) to rewrite_to_defined_overflow to do the modification in place
without the need to remove the stmt.
2022-01-27 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/104196
* gimple-fold.h (rewrite_to_defined_overflow): Add IN_PLACE argument.
* gimple-fold.cc (rewrite_to_defined_overflow): Likewise. If true,
return NULL and emit needed stmts before and after stmt.
* tree-ssa-reassoc.cc (update_range_test): For inter-bb range opt
pick as operand_entry that will hold the merged test the one feeding
earliest condition, ensure that by swapping range->idx with some
other range's idx if needed. If seq is non-NULL, don't actually swap
it but instead rewrite stmts with undefined overflow in between
the two locations.
(maybe_optimize_range_tests): Set ops[]->id to bb->index with the
corresponding condition even if they have non-NULL ops[]->op.
Formatting fix.
* gcc.c-torture/execute/pr104196.c: New test.
When writing testcases for the previously posted patch, I have noticed
that 3 of the headers aren't valid C89 (I didn't have any dg-options
so -ansi -pedantic-errors was implied and these errors were reported).
The following patch fixes those, ok for trunk?
Note, as can be seen even in this patch, seems older rs6000/*intrin.h
headers uglify not just argument names (__A instead of A etc.), but also
automatic variable names and other local identifiers, while e.g. emmintrin.h
or bmi2intrin.h clearly uglify only the argument names and not local
variables. I think that should be fixed but don't have time for that myself
(libstdc++ or e.g. the x86 headers uglify everything; this is so that one
can
#define result a + b
#include <x86intrin.h>
etc.).
2022-01-26 Jakub Jelinek <jakub@redhat.com>
PR target/104239
* config/rs6000/emmintrin.h (_mm_sad_epu8): Use __asm__ instead of
asm.
* config/rs6000/smmintrin.h (_mm_minpos_epu16): Declare iterator
before for loop instead of for init clause.
* config/rs6000/bmi2intrin.h (_pext_u64): Likewise.
* gcc.target/powerpc/pr104239-3.c: New test.
r12-4717-g7d37abedf58d66 added immintrin.h and x86gprintrin.h headers
to rs6000, these headers are on x86 standalone headers that various
programs include directly rather than including them through
<x86intrin.h>.
Unfortunately, for that change the bmiintrin.h and bmi2intrin.h
headers haven't been adjusted, so the effect is that if one includes them
(without including also x86intrin.h first) #error will trigger.
Furthermore, when including such headers conditionally as some real-world
packages do, this means a regression.
The following patch fixes it and matches what the x86 bmi{,2}intrin.h
headers do.
2022-01-26 Jakub Jelinek <jakub@redhat.com>
PR target/104239
* config/rs6000/bmiintrin.h: Test _X86GPRINTRIN_H_INCLUDED instead of
_X86INTRIN_H_INCLUDED and adjust #error wording.
* config/rs6000/bmi2intrin.h: Likewise.
* gcc.target/powerpc/pr104239-1.c: New test.
* gcc.target/powerpc/pr104239-2.c: New test.
My patch for PR101072 removed the specific VECTOR_TYPE handling here, which
broke pr72747-2.c on PPC; this patch restores it.
PR c++/104206
PR c++/101072
gcc/cp/ChangeLog:
* semantics.cc (finish_compound_literal): Restore VECTOR_TYPE check.
On Mon, Jan 24, 2022 at 11:26:27PM +0100, Jakub Jelinek via Gcc-patches wrote:
> Yet another short term solution might be not use DW_TAG_base_type
> for the IEEE quad long double, but instead pretend it is a DW_TAG_typedef
> with DW_AT_name "long double" to __float128 DW_TAG_base_type.
> I bet gdb would even handle it without any changes, but of course, it would
> be larger than the other proposed changes.
Here it is implemented.
Testcases I've played with are e.g.:
__ibm128 a;
long double b;
_Complex long double c;
static __attribute__((noinline)) int
foo (long double d)
{
long double e = d + 1.0L;
return 0;
}
int
main ()
{
a = 1.0;
b = 2.0;
c = 5.0 + 6.0i;
return foo (7.0L);
}
and
real(kind=16) :: a
complex(kind=16) :: b
a = 1.0
b = 2.0
end
Printing the values of the variables works well,
p &b or p &c shows pointer to the correct type, just
ptype b or ptype c prints _Float128 instead of
long double or complex _Float128 instead of complex long double.
Even worse in fortran where obviously _Float128 or
complex _Float128 aren't valid types, but as GDB knows them by name,
it is just ptype that is weird.
2022-01-26 Jakub Jelinek <jakub@redhat.com>
PR debug/104194
* dwarf2out.cc (long_double_as_float128): New function.
(modified_type_die): For powerpc64le IEEE 754 quad long double
and complex long double emit those as DW_TAG_typedef to
_Float128 or complex _Float128 base type.
The middle-end uses sometimes VECTOR_TYPE CONSTRUCTORs that contain
some other VECTOR_TYPE elements in it (should be with compatible element
size and smaller number of elements, e.g. a V8SImode vector can be
constructed as { V4SImode_var_1, V4SImode_var_2 }), and expansion of
__builtin_shufflevector emits these early, so constexpr.cc can see those
too.
constexpr.cc already has special cases for NULL index which is typical
for VECTOR_TYPE CONSTRUCTORs, and for VECTOR_TYPE CONSTRUCTORs that
contain just scalar elts that works just fine - init_subob_ctx just
returns on non-aggregate elts and get_or_insert_ctor_field has
if (TREE_CODE (type) == VECTOR_TYPE && index == NULL_TREE)
{
CONSTRUCTOR_APPEND_ELT (CONSTRUCTOR_ELTS (ctor), index, NULL_TREE);
return &CONSTRUCTOR_ELTS (ctor)->last();
}
handling for it. But for the vector in vector case init_subob_ctx would
try to create a sub-CONSTRUCTOR and even didn't handle the NULL index
case well, so instead of creating the sub-CONSTRUCTOR after the elts already
in it overwrote the first one. So
(V8SImode) { { 0, 0, 0, 0 }, { 0, 0, 0, 0 } }
became
(V8SImode) { 0, 0, 0, 0 }
The following patch fixes it by not forcing a sub-CONSTRUCTOR for this
vector in vector case.
2022-01-26 Jakub Jelinek <jakub@redhat.com>
PR c++/104226
* constexpr.cc (init_subob_ctx): For vector ctors containing
vector elements, ensure appending to the same ctor instead of
creating another one.
* g++.dg/cpp0x/constexpr-104226.C: New test.