When adding the verify_type_context target hook, I'd missed
a site that needs to check an array element type.
gcc/cp/
PR c++/97904
* pt.c (tsubst): Use verify_type_context to check the type
of an array element.
gcc/testsuite/
PR c++/97904
* g++.dg/ext/sve-sizeless-1.C: Add more template tests.
* g++.dg/ext/sve-sizeless-2.C: Likewise.
Generate special double mode sequence also for TImode on 64bit targets.
2020-11-22 Uroš Bizjak <ubizjak@gmail.com>
PR target/97873
gcc/
* config/i386/i386.md (abs<mode>2): Use SWI48DWI mode iterator.
(*abs<dwi>2_doubleword): Use DWIH mode iterator.
(<maxmin:code><mode>3): Use SWI48DWI mode iterator.
(*<maxmin:code><dwi>3_doubleword): Use DWIH mode iterator.
gcc/testsuite/
* gcc.target/i386/pr97873-2.c: New test.
gcc/
* config/h8300/addsub.md: Turn existing patterns into
define_insn_and_split style patterns where the splitter
adds a clobber of the condition code register. Drop "cc"
attribute. Add _clobber_flags patterns to match output of
the splitters.
(add<mod>3_incdec): Remove pattern.
(adds/subs splitter): Only run before reload.
* config/h8300/bitfield.md: Turn existing patterns into
define_insn_and_split style patterns where the splitter
adds a clobber of the condition code register. Drop "cc"
attribute. Add _clobber_flags patterns to match output
of the splitters.
(cstoreqi4, cstorehi4, cstoresi4): Comment out.
(*bstzhireg, *cmpstz, *bstz, *bistz, *cmpcondset): Likewise.
(*condbset, *cmpcondbclr, *condbclr): Likewise.
(*cmpcondbsetreg, *condbsetreg, *cmpcondbclrreg): Likewise.
(*condbclrreg): Likewise.
* config/h8300/combiner.md: Turn existing patterns into
define_insn_and_split style patterns where the splitter
adds a clobber of the condition code register. Drop "cc"
attribute. Add _clobber_flags patterns to match output of
the splitters. Add appropriate CC register clobbers to
existing splitters.
(*addsi3_and_r_1): Disable for now.
(*addsi3_and_not_r_1, bit-test branches): Likewise.
* config/h8300/divmod.md: Turn existing patterns into
define_insn_and_split style patterns where the splitter
adds a clobber of the condition code register. Drop "cc"
attribute. Add _clobber_flags patterns to match output of
the splitters.
* config/h8300/extensions.md: Turn existing patterns into
define_insn_and_split style patterns where the splitter
adds a clobber of the condition code register. Drop "cc"
attribute. Add _clobber_flags patterns to match output of
the splitters.
* config/h8300/genmova.sh: Drop "cc" attribute from patterns.
* config/h8300/mova.md: Drop "cc" attribute from patterns.
* config/h8300/h8300-modes.def: Add CCZN and CCZNV modes.
* config/h8300/h8300-protos.h (output_plussi): Update prototype.
(compute_plussi_length): Likewise.
(h8300_select_cc_mode): Add prototype.
(compute_a_shift_cc): Remove prototype.
(compute_logical_op_cc): Likewise.
* config/h8300/h8300.c (names_big): Add "cc" register.
(names_extended, names_upper_extended): Likewise.
(h8300_emit_stack_adjustment): Be more selective about setting
RTX_FRAME_RELATED_P.
(h8300_print_operand): Handle CCZN mode.
(h8300_select_cc_mode): New function.
(notice_update_cc): if-0 out. Only kept for reference purposes.
(h8300_expand_store): Likewise.
(h8300_binary_length): Handle new insn forms.
(output_plussi): Add argument for NEED_FLAGS and handle that case.
(compute_plussi_length): Likewise.
(compute_logical_op_cc): Return integer.
(TARGET_FLAGS_REGNUM): Define.
* config/h8300/h8300.h (FIRST_PSEUDO_REGISTER): Bump for cc register.
(FIXED_REGISTERS, CALL_USED_REGISTERS): Handle cc register.
(REG_ALLOC_ORDER, REGISTER_NAMES): Likewise.
(SELECT_CC_MODE): Define.
* config/h8300/h8300.md: Add CC_REG.
Do not include peepholes.md for now.
* config/h8300/jumpcall.md (cbranchqi4): Consolidate into
cbranch<mode>4.
(cbranchhi4, cbranchsi4): Likewise.
(cbranch<mode>4): New expander.
(branch): New define_insn_and_split for use before reload.
(branch_1, branch_1_false): New patterns to match splitter output.
Remove code to manage cc_status.flags.
* config/h8300/logical.md: Turn existing patterns into
define_insn_and_split style patterns where the splitter
adds a clobber of the condition code register. Drop "cc"
attribute. Add _clobber_flags patterns to match output of
the splitters. Move various peepholes into this file.
* config/h8300/movepush.md: Turn existing patterns into
define_insn_and_split style patterns where the splitter
adds a clobber of the condition code register. Drop "cc"
attribute. Add _clobber_flags patterns to match output of
the splitters.
* config/h8300/multiply.md: Turn existing patterns into
define_insn_and_split style patterns where the splitter
adds a clobber of the condition code register. Drop "cc"
attribute. Add _clobber_flags patterns to match output of
the splitters.
* config/h8300/other.md: Turn existing patterns into
define_insn_and_split style patterns where the splitter
adds a clobber of the condition code register. Drop "cc"
attribute. Add _clobber_flags patterns to match output of
the splitters.
* config/h8300/peepholes.md: Remove peepholes that were moved
elsewhere.
* config/h8300/predicates.md (simple_memory_operand): New.
* config/h8300/proepi.md: Drop "cc" attribute setting.
* config/h8300/shiftrotate.md: Turn existing patterns into
define_insn_and_split style patterns where the splitter
adds a clobber of the condition code register. Drop "cc"
attribute. Add _clobber_flags patterns to match output of
the splitters.
* config/h8300/testcompare.md: Turn existing patterns into
define_insn_and_split style patterns where the splitter
adds a clobber of the condition code register. Drop "cc"
attribute. Add _clobber_flags patterns to match output of
the splitters. Disable various patterns for now.
Move some peepholes that were previously in peepholes.md here.
When appending a character to an array, the result of that concat
assignment was not the new value of the array. Similarly, when appending
an array to another array, side effects were evaluated in reverse to the
expected order of evaluation.
As of this change, the address of the left-hand side expression is
saved and re-used as the result. Its evaluation is now also forced to
occur before the concat operation itself is called.
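The ordering described above can be modelled outside of D. The following is only a C++ analogy, not the code in expr.cc; cat_assign, lhs_expr, rhs_expr, eval_log and buffer are hypothetical names made up for the illustration:

```cpp
#include <string>
#include <vector>

// Records the order in which the operands are evaluated.
static std::vector<std::string> eval_log;
static std::string buffer = "ab";

// Hypothetical operand expressions with visible side effects.
static std::string &lhs_expr () { eval_log.push_back ("lhs"); return buffer; }
static std::string rhs_expr () { eval_log.push_back ("rhs"); return "cd"; }

// Models `lhs ~= rhs': evaluate the left-hand side first and save its
// address, then evaluate the right-hand side, then do the concat, and
// finally yield the updated left-hand side through the saved address.
template <typename Lhs, typename Rhs>
std::string &
cat_assign (Lhs lhs, Rhs rhs)
{
  std::string *saved = &lhs ();   // evaluated first, address saved
  std::string value = rhs ();     // evaluated second
  saved->append (value);          // the concat operation itself
  return *saved;                  // result is the new value of the array
}
```

Calling cat_assign (lhs_expr, rhs_expr) leaves "lhs" before "rhs" in eval_log and returns a reference to the updated buffer.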
gcc/d/ChangeLog:
PR d/97889
* expr.cc (ExprVisitor::visit (CatAssignExp *)): Enforce LTR order of
evaluation on left and right hand side expressions.
gcc/testsuite/ChangeLog:
PR d/97889
* gdc.dg/torture/pr97889.d: New test.
The following patch recognizes some further forms of additions with overflow
checks as shown in the testcase, in particular where the unsigned addition is
performed in a wider mode just to catch overflow with a > narrower_utype_max
check.
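As a rough illustration of the kind of source form meant here (this is not the testcase from the patch, and the function names are made up), the first function performs the addition in a wider type and compares against the maximum of the narrower type, while the second expresses the same check with the built-in:

```cpp
#include <climits>

static_assert (sizeof (unsigned long long) > sizeof (unsigned),
	       "the sketch assumes a strictly wider type is available");

// Overflow check written by hand: add in the wider unsigned type and
// compare the result against the maximum of the narrower type.
bool
add_overflows_wide (unsigned a, unsigned b, unsigned *sum)
{
  unsigned long long r = (unsigned long long) a + b;
  *sum = (unsigned) r;
  return r > UINT_MAX;
}

// The same check expressed with the GCC/Clang built-in.
bool
add_overflows_builtin (unsigned a, unsigned b, unsigned *sum)
{
  return __builtin_add_overflow (a, b, sum);
}
```

Both functions report the same overflow flag and the same wrapped sum for any pair of inputs.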
2020-11-22 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/95853
* tree-ssa-math-opts.c (uaddsub_overflow_check_p): Add maxval
argument, if non-NULL, instead look for r > maxval or r <= maxval
comparisons.
(match_uaddsub_overflow): Pattern recognize even other forms of
__builtin_add_overflow, in particular when addition is performed
in a wider type and result compared to maximum of the narrower
type.
* gcc.dg/pr95853.c: New test.
So I'd forgotten an important tidbit on the H8 port. Specifically
for a branch instruction, the target label must be operand 0 for
the length computations.
This really only affects the main conditional branch pattern.
The other conditional branch patterns are split and ultimately
funnel into the main pattern. This patch fixes the issue by
partially reverting an earlier change. This issue didn't show up
until late in the optimization work on cc0 removal of the H8 port,
but was caught by the testsuite. So there's no new test.
Built and regression tested H8 with this change, with and without
the cc0 removal patches.
gcc/
* config/h8300/jumpcall.md (branch_true, branch_false): Revert
recent change. Ensure operand[0] is always the target label.
We have a similar code pattern in darwin-c.c to one in c-pragmas
(most likely a cut & paste) with a struct type used locally to the
TU. With C++ we need to rename the type to avoid an ODR violation.
gcc/ChangeLog:
* config/darwin-c.c (struct f_align_stack): Rename
type from align_stack to f_align_stack.
(push_field_alignment): Likewise.
(pop_field_alignment): Likewise.
This patch finishes the second half of -Wrange-loop-construct I promised
to implement: it warns when a loop variable in a range-based for-loop is
initialized with a value of a different type resulting in a copy. For
instance:
int arr[10];
for (const double &x : arr) { ... }
where in every iteration we have to create and destroy a temporary value
of type double, to which we bind the reference. This could negatively
impact performance.
As per Clang, this doesn't warn when the range returns a copy, hence the
glvalue_p check.
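A slightly fuller sketch of the two forms, with made-up function names: the first loop binds a reference of a different type and so goes through a temporary on every iteration, the second binds a reference of the element type and refers to the element directly.

```cpp
// Each iteration converts an int element to a temporary double and binds
// the reference to that temporary; this is the form the warning targets.
double
sum_with_conversion (const int (&arr)[4])
{
  double total = 0;
  for (const double &x : arr)
    total += x;
  return total;
}

// A reference of the element type binds to the element itself, so no
// temporary is created.
long
sum_without_conversion (const int (&arr)[4])
{
  long total = 0;
  for (const int &x : arr)
    total += x;
  return total;
}
```

Both functions compute the same sum; only the first is expected to trigger the new warning.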
gcc/ChangeLog:
PR c++/94695
* doc/invoke.texi: Update the -Wrange-loop-construct description.
gcc/cp/ChangeLog:
PR c++/94695
* parser.c (warn_for_range_copy): Warn when the loop variable is
initialized with a value of a different type resulting in a copy.
gcc/testsuite/ChangeLog:
PR c++/94695
* g++.dg/warn/Wrange-loop-construct2.C: New test.
[dcl.constexpr]/3 says that the function-body of a constexpr function
shall not contain an identifier label, but we aren't enforcing that.
This patch implements that. Of course, we can't reject artificial
labels.
gcc/cp/ChangeLog:
PR c++/97846
* constexpr.c (potential_constant_expression_1): Reject
LABEL_EXPRs that use non-artificial LABEL_DECLs.
gcc/testsuite/ChangeLog:
PR c++/97846
* g++.dg/cpp1y/constexpr-label.C: New test.
This invalid (?) code broke my assumption that if decl_specifiers->type
is null, there must be some type-specifiers. Turn the assert into an if
to fix this crash.
gcc/cp/ChangeLog:
PR c++/97881
* parser.c (warn_about_ambiguous_parse): Only assume "int" if we
actually saw any type-specifiers.
gcc/testsuite/ChangeLog:
PR c++/97881
* g++.dg/warn/Wvexing-parse9.C: New test.
Our implementation of template lambdas incorrectly requires the optional
lambda-declarator. This was probably required by an early draft of
generic lambdas, but now the production is [expr.prim.lambda.general]:
lambda-expression:
lambda-introducer lambda-declarator [opt] compound-statement
lambda-introducer < template-parameter-list > requires-clause [opt]
lambda-declarator [opt] compound-statement
Therefore, we should accept the following test.
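For illustration only (this is not the new lambda-generic8.C test, and the function names are made up), a lambda that uses the second production without a lambda-declarator could look like the following. It assumes a C++20 compiler:

```cpp
// A lambda with a template-parameter-list and no lambda-declarator:
// there is no "()" between ">" and "{".
inline int
twice_default ()
{
  auto twice = []<int N = 3> { return N * 2; };
  return twice ();                    // N takes its default value
}

inline int
twice_explicit ()
{
  auto twice = []<int N = 3> { return N * 2; };
  return twice.operator()<5> ();      // N given explicitly
}
```

Since the lambda takes no function parameters, the template argument has to come either from the default or from an explicit call to operator().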
gcc/cp/ChangeLog:
PR c++/97839
* parser.c (cp_parser_lambda_declarator_opt): Don't require ().
gcc/testsuite/ChangeLog:
PR c++/97839
* g++.dg/cpp2a/lambda-generic8.C: New test.
When I implemented the code to detect modifying const objects in
constexpr contexts, we couldn't have constexpr destructors, so I didn't
consider them. But now we can and that caused a bogus error in this
testcase: [class.dtor]p5 says that "const and volatile semantics are not
applied on an object under destruction. They stop being in effect when
the destructor for the most derived object starts." so we have to clear
the TREE_READONLY flag we set on the object after the constructors have
been called to mark it as no-longer-under-construction. In the ~Foo
call it's now an object under destruction, so don't report those errors.
gcc/cp/ChangeLog:
PR c++/97427
* constexpr.c (cxx_set_object_constness): New function.
(cxx_eval_call_expression): Set new_obj for destructors too.
Call cxx_set_object_constness to set/unset TREE_READONLY of
the object under construction/destruction.
gcc/testsuite/ChangeLog:
PR c++/97427
* g++.dg/cpp2a/constexpr-dtor10.C: New test.
This fixes some UNRESOLVED tests on (at least) Solaris and Darwin, and
disables some tests that hang forever on Solaris. A proper fix is still
needed.
libstdc++-v3/ChangeLog:
* include/bits/atomic_base.h (atomic_flag::wait): Use correct
type for __atomic_wait call.
* include/bits/atomic_timed_wait.h (__atomic_wait_until): Check
_GLIBCXX_HAVE_LINUX_FUTEX.
* include/bits/atomic_wait.h (__atomic_notify): Likewise.
* include/bits/semaphore_base.h (_GLIBCXX_HAVE_POSIX_SEMAPHORE):
Only define if SEM_VALUE_MAX or _POSIX_SEM_VALUE_MAX is defined.
* testsuite/29_atomics/atomic/wait_notify/bool.cc: Disable on
non-Linux targets.
* testsuite/29_atomics/atomic/wait_notify/generic.cc: Likewise.
* testsuite/29_atomics/atomic/wait_notify/pointers.cc: Likewise.
* testsuite/29_atomics/atomic_flag/wait_notify/1.cc: Likewise.
* testsuite/29_atomics/atomic_float/wait_notify.cc: Likewise.
We now determine dependencies across union fields correctly.
* gcc.dg/vect/vect-35-big-array.c: Expect 2 loops to be vectorized.
* gcc.dg/vect/vect-35.c: Expect 2 loops to be vectorized.
* ipa-icf.c (sem_function::equals_wpa): Do not compare ODR type with
-fno-devirtualize.
(sem_item_optimizer::update_hash_by_addr_refs): Hash anonymous ODR
types by TYPE_UID of their main variant.
After the MMA opaque mode patch goes in, we can re-enable
use of vector pair in the inline expansion of memcpy/memmove.
gcc/
* config/rs6000/rs6000.c (rs6000_option_override_internal):
Enable vector pair memcpy/memmove expansion.
This patch changes powerpc MMA builtins to use the new opaque
mode class and use modes OO (32 bytes) and XO (64 bytes)
instead of POI/PXI. Using the opaque modes prevents
optimization from trying to do anything with vector
pair/quad, which was the problem we were seeing with the
partial integer modes.
gcc/
* config/rs6000/mma.md (unspec): Add assemble/extract UNSPECs.
(movoi): Change to movoo.
(*movpoi): Change to *movoo.
(movxi): Change to movxo.
(*movpxi): Change to *movxo.
(mma_assemble_pair): Change to OO mode.
(*mma_assemble_pair): New define_insn_and_split.
(mma_disassemble_pair): New define_expand.
(*mma_disassemble_pair): New define_insn_and_split.
(mma_assemble_acc): Change to XO mode.
(*mma_assemble_acc): Change to XO mode.
(mma_disassemble_acc): New define_expand.
(*mma_disassemble_acc): New define_insn_and_split.
(mma_<acc>): Change to XO mode.
(mma_<vv>): Change to XO mode.
(mma_<avv>): Change to XO mode.
(mma_<pv>): Change to OO mode.
(mma_<apv>): Change to XO/OO mode.
(mma_<vvi4i4i8>): Change to XO mode.
(mma_<avvi4i4i8>): Change to XO mode.
(mma_<vvi4i4i2>): Change to XO mode.
(mma_<avvi4i4i2>): Change to XO mode.
(mma_<vvi4i4>): Change to XO mode.
(mma_<avvi4i4>): Change to XO mode.
(mma_<pvi4i2>): Change to XO/OO mode.
(mma_<apvi4i2>): Change to XO/OO mode.
(mma_<vvi4i4i4>): Change to XO mode.
(mma_<avvi4i4i4>): Change to XO mode.
* config/rs6000/predicates.md (input_operand): Allow opaque.
(mma_disassemble_output_operand): New predicate.
* config/rs6000/rs6000-builtin.def:
Changes to disassemble builtins.
* config/rs6000/rs6000-call.c (rs6000_return_in_memory):
Disallow __vector_pair/__vector_quad as return types.
(rs6000_promote_function_mode): Remove function return type
check because we can't test it here any more.
(rs6000_function_arg): Do not allow __vector_pair/__vector_quad
as function arguments.
(rs6000_gimple_fold_mma_builtin):
Handle mma_disassemble_* builtins.
(rs6000_init_builtins): Create types for XO/OO modes.
* config/rs6000/rs6000-modes.def: Delete OI, XI,
POI, and PXI modes, and create XO and OO modes.
* config/rs6000/rs6000-string.c (expand_block_move):
Update to OO mode.
* config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok_uncached):
Update for XO/OO modes.
(rs6000_rtx_costs): Make UNSPEC_MMA_XXSETACCZ cost 0.
(rs6000_modes_tieable_p): Update for XO/OO modes.
(rs6000_debug_reg_global): Update for XO/OO modes.
(rs6000_setup_reg_addr_masks): Update for XO/OO modes.
(rs6000_init_hard_regno_mode_ok): Update for XO/OO modes.
(reg_offset_addressing_ok_p): Update for XO/OO modes.
(rs6000_emit_move): Update for XO/OO modes.
(rs6000_preferred_reload_class): Update for XO/OO modes.
(rs6000_split_multireg_move): Update for XO/OO modes.
(rs6000_mangle_type): Update for opaque types.
(rs6000_invalid_conversion): Update for XO/OO modes.
* config/rs6000/rs6000.h (VECTOR_ALIGNMENT_P):
Update for XO/OO modes.
* config/rs6000/rs6000.md (RELOAD): Update for XO/OO modes.
gcc/testsuite/
* gcc.target/powerpc/mma-double-test.c (main): Call abort for failure.
* gcc.target/powerpc/mma-single-test.c (main): Call abort for failure.
* gcc.target/powerpc/pr96506.c: Rename to pr96506-1.c.
* gcc.target/powerpc/pr96506-2.c: New test.
After building some larger codes using opaque types and some c++ codes
using opaque types it became clear I needed to go through and look for
places where opaque types and modes needed to be handled. A whole pile
of one-liners.
gcc/
* typeclass.h: Add opaque_type_class.
* builtins.c (type_to_class): Identify opaque type class.
* dwarf2out.c (is_base_type): Handle opaque types.
(gen_type_die_with_usage): Handle opaque types.
* expr.c (count_type_elements): Opaque types should
never have initializers.
* ipa-devirt.c (odr_types_equivalent_p): No type-specific handling
for opaque types is needed as it eventually checks the underlying
mode which is what is important.
* tree-streamer.c (record_common_node): Handle opaque types.
* tree.c (type_contains_placeholder_1): Handle opaque types.
(type_cache_hasher::equal): No additional comparison needed for
opaque types.
gcc/c-family
* c-pretty-print.c (c_pretty_printer::simple_type_specifier):
Treat opaque types like other types.
(c_pretty_printer::direct_abstract_declarator): Opaque types are
supported types.
gcc/c
* c-aux-info.c (gen_type): Support opaque types.
gcc/cp
* error.c (dump_type): Handle opaque types.
(dump_type_prefix): Handle opaque types.
(dump_type_suffix): Handle opaque types.
(dump_expr): Handle opaque types.
* pt.c (tsubst): Allow opaque types in templates.
(unify): Allow opaque types in templates.
* typeck.c (structural_comptypes): Handle comparison
of opaque types.
On macOS / Darwin, the environ variable can be used directly in the
code of an executable, but cannot be used in the code of a shared
library (libgfortran.dylib, in this case).
In such cases, the function _NSGetEnviron should be called to get
the address of 'environ'.
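A minimal sketch of that approach, not the code in execute_command_line.c; current_environment is a made-up helper name:

```cpp
#ifdef __APPLE__
# include <crt_externs.h>   // declares _NSGetEnviron on Darwin
#else
extern "C" char **environ;
#endif

// Returns the environment block of the current process.  On Darwin the
// accessor is used so that the code also works inside a shared library.
char **
current_environment ()
{
#ifdef __APPLE__
  return *_NSGetEnviron ();
#else
  return environ;
#endif
}
```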
libgfortran/ChangeLog:
* intrinsics/execute_command_line.c (environ): Use
_NSGetEnviron to get the environment pointer on Darwin.
Since the test is compiled with -fno-builtin, include math.h to allow for
implementations (like the PowerPC) that have multiple versions of long double
that are selectable by switch. math.h could possibly switch what function
nextafterl points to.
gcc/testsuite/
2020-11-17 Michael Meissner <meissner@linux.ibm.com>
* gcc.dg/nextafter-2.c: Include math.h.
This patch adds support for mapping the scalar_cmp_exp_qp_* built-in functions
to handle arguments that are either TFmode or KFmode, depending on whether long
double uses the IEEE 128-bit representation (TFmode) or the IBM 128-bit
representation (KFmode). This shows up in the float128-cmp2-runnable.c test
when long double uses the IEEE 128-bit representation.
gcc/
2020-11-20 Michael Meissner <meissner@linux.ibm.com>
* config/rs6000/rs6000-call.c (rs6000_expand_builtin): Add missing
XSCMP* cases for IEEE 128-bit long double.
Here, since we only mention bar<B>, we never emit debug information for it.
But we do emit debug information for H<J>::h, so we need to refer to the
debug info for bar<B>::J even though there is no bar<B>. We deal with this
sort of thing in dwarf2out with the limbo_die_list; parentless dies like J
get attached to the CU at EOF. But here, we were flushing the limbo list,
then generating the template argument DIE for H<J> that refers to J, which
adds J to the limbo list, too late to be flushed. So let's flush a little
later.
gcc/ChangeLog:
PR c++/97918
* dwarf2out.c (dwarf2out_early_finish): flush_limbo_die_list
after gen_scheduled_generic_parms_dies.
gcc/testsuite/ChangeLog:
PR c++/97918
* g++.dg/debug/localclass2.C: New test.
Reduce memory allocation in stable_sort/inplace_merge algorithms to what is needed
by the implementation.
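To show why the shorter of the two ranges is enough, here is a simplified sketch of a buffered merge. It is not the libstdc++ implementation; merge_adjacent is a made-up name and it works on a std::vector rather than on iterators:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Merges the sorted runs [0, mid) and [mid, size) of V in place, copying
// only the shorter run into a temporary buffer.
template <typename T>
void
merge_adjacent (std::vector<T> &v, std::size_t mid)
{
  assert (mid <= v.size ());
  if (mid <= v.size () - mid)
    {
      // Left run is the shorter one: buffer it and merge forwards.
      std::vector<T> buf (v.begin (), v.begin () + mid);
      std::size_t i = 0, j = mid, out = 0;
      while (i < buf.size () && j < v.size ())
	v[out++] = (v[j] < buf[i]) ? v[j++] : buf[i++];
      while (i < buf.size ())
	v[out++] = buf[i++];
    }
  else
    {
      // Right run is the shorter one: buffer it and merge backwards.
      std::vector<T> buf (v.begin () + mid, v.end ());
      std::size_t i = mid, j = buf.size (), out = v.size ();
      while (i > 0 && j > 0)
	v[--out] = (buf[j - 1] < v[i - 1]) ? v[--i] : buf[--j];
      while (j > 0)
	v[--out] = buf[--j];
    }
}
```

In both branches equal elements from the left run stay ahead of those from the right run, so the merge remains stable while the buffer never holds more than min (len1, len2) elements.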
Co-authored-by: John Chang <john.chang@samba.tv>
libstdc++-v3/ChangeLog:
PR libstdc++/83938
* include/bits/stl_tempbuf.h (get_temporary_buffer): Change __len
computation in the loop to avoid truncation.
* include/bits/stl_algo.h:
(__inplace_merge): Take temporary buffer length from smallest range.
(__stable_sort): Limit temporary buffer length.
* testsuite/25_algorithms/inplace_merge/1.cc (test4): New.
* testsuite/performance/25_algorithms/stable_sort.cc: Test stable_sort
under different heap memory conditions.
* testsuite/performance/25_algorithms/inplace_merge.cc: New test.
Check for the presence of _SC_NPROCESSORS_ONLN rather than using a list
of OS-specific macros to decide whether to use `sysconf' like elsewhere
across GCC sources, fixing a compilation error:
adaint.c: In function '__gnat_number_of_cpus':
adaint.c:2398:26: error: '_SC_NPROCESSORS_ONLN' undeclared (first use in this function)
2398 | cores = (int) sysconf (_SC_NPROCESSORS_ONLN);
| ^~~~~~~~~~~~~~~~~~~~
adaint.c:2398:26: note: each undeclared identifier is reported only once for each function it appears in
at least with VAX/NetBSD 1.6.2.
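The feature test can be sketched as follows. This is not the code in adaint.c; number_of_cpus is a made-up name, and the fallback value of 1 is an assumption of the sketch:

```cpp
#include <unistd.h>

// Number of online CPUs, or 1 when the platform does not provide
// _SC_NPROCESSORS_ONLN or the query fails.
int
number_of_cpus ()
{
#ifdef _SC_NPROCESSORS_ONLN
  long n = sysconf (_SC_NPROCESSORS_ONLN);
  if (n > 0)
    return (int) n;
#endif
  return 1;
}
```

Testing for the macro itself means no list of operating systems has to be kept up to date.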
gcc/ada/
* adaint.c (__gnat_number_of_cpus): Check for the presence of
_SC_NPROCESSORS_ONLN rather than a list of OS-specific macros
to decide whether to use `sysconf'.
Disable USE_PT_GNU_EH_FRAME frame unwinder support for old OS versions,
fixing compilation errors:
.../libgcc/unwind-dw2-fde-dip.c:75:21: error: unknown type name 'Elf_Phdr'
75 | # define ElfW(type) Elf_##type
| ^~~~
.../libgcc/unwind-dw2-fde-dip.c:132:9: note: in expansion of macro 'ElfW'
132 | const ElfW(Phdr) *p_eh_frame_hdr;
| ^~~~
.../libgcc/unwind-dw2-fde-dip.c:75:21: error: unknown type name 'Elf_Phdr'
75 | # define ElfW(type) Elf_##type
| ^~~~
.../libgcc/unwind-dw2-fde-dip.c:133:9: note: in expansion of macro 'ElfW'
133 | const ElfW(Phdr) *p_dynamic;
| ^~~~
.../libgcc/unwind-dw2-fde-dip.c:165:37: warning: 'struct dl_phdr_info' declared inside parameter list will not be visible outside of this definition or declaration
165 | _Unwind_IteratePhdrCallback (struct dl_phdr_info *info, size_t size, void *ptr)
| ^~~~~~~~~~~~
[...]
and producing a working cross-compiler at least with VAX/NetBSD 1.6.2.
libgcc/
* unwind-dw2-fde-dip.c [__OpenBSD__ || __NetBSD__]
(USE_PT_GNU_EH_FRAME): Do not define if !TARGET_DL_ITERATE_PHDR.
Overhaul the mangling scheme to avoid ambiguities if the package path
contains a dot. Instead of using dot both to separate components and
to mangle characters, use dot only to separate components and use
underscore to mangle characters.
For golang/go#41862
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/271726
Use new template parameters to replace usage of lambdas to move or not
tree values on copy.
libstdc++-v3/ChangeLog:
* include/bits/move.h (_GLIBCXX_FWDREF): New.
* include/bits/stl_tree.h: Adapt to use latter.
(_Rb_tree<>::_M_clone_node): Add _MoveValue template parameter.
(_Rb_tree<>::_M_mbegin): New.
(_Rb_tree<>::_M_begin): Use latter.
(_Rb_tree<>::_M_copy): Add _MoveValues template parameter.
* testsuite/23_containers/map/allocator/move_cons.cc: New test.
* testsuite/23_containers/multimap/allocator/move_cons.cc: New test.
* testsuite/23_containers/multiset/allocator/move_cons.cc: New test.
* testsuite/23_containers/set/allocator/move_cons.cc: New test.
Another remaining case is that we end up comparing calls with mismatching
number of parameters or with different permutations of them.
This is because we hash decls to nothing. This patch improves that by
hashing decls by their code and parm decls by indexes that are stable.
Also for default defs in SSA_NAMEs we can add the corresponding decl (that
is usually parm decls).
Still we could improve on this by hashing SSA names by their defining parameters
and possibly making maps of other decls and assigning them stable function
local IDs.
* ipa-icf-gimple.c (func_checker::hash_operand): Improve hashing of
decls.
One of the common remaining reasons for ICF to fail after loading in the
function body is a mismatched type of an automatic variable. This is because
compatible_types_p resorts to checking TYPE_MAIN_VARIANTS for
equivalence, which prevents merging many TBAA-compatible cases. (And thus
is also not reflected by the hash extended by alias sets of accesses.)
Since in gimple automatic variables are just blocks of memory, I think we
should check only their size. All accesses are matched when comparing the
actual loads/stores.
I am not sure if we need to match types of other DECLs but I decided I can try
to be safe here: for PARM_DECL/RESULT_DECL we match them anyway to be sure
that functions are ABI compatible. For CONST_DECL and readonly global
VAR_DECLs they are matched when comparing their constructors.
* ipa-icf-gimple.c (func_checker::compare_decl): Do not compare types
of local variables.
2020-11-10 Andrea Corallo <andrea.corallo@arm.com>
PR target/97726
* gcc.target/arm/simd/bf16_vldn_1.c: Relax regexps not to fail on
big endian.
* gcc.target/arm/simd/vldn_lane_bf16_1.c: Likewise.
* gcc.target/arm/simd/vmmla_1.c: Add -mfloat-abi=hard flag.
This modifies vectorizable_slp_permutation to update the type of the children
of a perm node before trying to permute them. This allows us to be able to
permute invariant nodes.
This will be covered by test from the SLP pattern matcher.
gcc/ChangeLog:
* tree-vect-slp.c (vectorizable_slp_permutation): Update types on nodes
when needed.
Unlike the other headers that declare alias templates in namespace pmr,
<regex> includes <memory_resource>. That was done because the
pmr::string::const_iterator typedef requires pmr::string to be complete,
which requires pmr::polymorphic_allocator<char> to be complete.
By using __normal_iterator<const char*, pmr::string> instead of the
const_iterator typedef we can avoid the completeness requirement.
This makes <regex> smaller, by not requiring <memory_resource> and its
<shared_mutex> dependency, which depends on <chrono>. Backporting this
will also help with PR 97876, where <stop_token> ends up being needed by
<regex> via <memory_resource>.
libstdc++-v3/ChangeLog:
PR libstdc++/92546
* include/std/regex (pmr::smatch, pmr::wsmatch): Declare using
underlying __normal_iterator type, not nested typedef
basic_string::const_iterator.
This makes hybrid SLP discovery deal with stmts indirectly consumed
by SLP, for example via patterns. This means that all uses of a
stmt end up in SLP vectorized stmts.
This helps my prototype patches for PR97832 where I make SLP discovery
re-associate chains to make operands match. This ends up building
SLP computation nodes without 1:1 representatives in the scalar IL
and thus no scalar lane defs in SLP_TREE_SCALAR_STMTS. Nevertheless
all of the original scalar stmts are consumed so this represents
another kind of SLP pattern for the computation chain result.
2020-11-20 Richard Biener <rguenther@suse.de>
* tree-vect-slp.c (maybe_push_to_hybrid_worklist): New function.
(vect_detect_hybrid_slp): Use it. Perform a backward walk
over the IL.
It always annoyed me to see those empty SLP nodes in dumpfiles:
t.c:16:3: note: node 0x3a2a280 (max_nunits=1, refcnt=1)
t.c:16:3: note: { }
t.c:16:3: note: children 0x3a29db0 0x3a29e90
resulting from two-operator handling. The following makes
sure to also dump the operation template or VEC_PERM_EXPR.
2020-11-20 Richard Biener <rguenther@suse.de>
* tree-vect-slp.c (vect_print_slp_tree): Also dump
SLP_TREE_REPRESENTATIVE.
The following patch implements __builtin_clear_padding builtin that clears
the padding bits in object representation (but preserves value
representation). Inside of unions it clears only those padding bits that
are padding for all the union members (so that it never alters value
representation).
It handles trailing padding, padding in the middle of structs including
bitfields (PDP11 unhandled, I've never figured out how those bitfields
work), VLAs (doesn't handle variable length structures, but I think almost
nobody uses them and it isn't worth the extra complexity). For VLAs and
sufficiently large arrays it uses runtime clearing loop instead of emitting
straight-line code (unless arrays are inside of a union).
The way I think this can be used for atomics is e.g. if the structures
are power of two sized and small enough that we use the hw atomics
for say compare_exchange __builtin_clear_padding could be called first on
the address of expected and desired arguments (for desired only if we want
to ensure that most of the time the atomic memory will have padding bits
cleared), then perform the weak cmpxchg and if that fails, we got the
value from the atomic memory; we can call __builtin_clear_padding on a copy
of that and then compare it with expected, and if it is the same with the
padding bits masked off, we can use the original with whatever random
padding bits in it as the new expected for next cmpxchg.
__builtin_clear_padding itself is not atomic and therefore it shouldn't
be called on the atomic memory itself, but compare_exchange*'s expected
argument is a reference and normally the implementation may store there
the current value from memory, so padding bits can be cleared in that,
and desired is passed by value rather than reference, so clearing is fine
too.
When using libatomic, we can use it either that way, or add new libatomic
APIs that accept another argument, pointer to the padding bit bitmask,
and construct that in the template as
alignas (_T) unsigned char _mask[sizeof (_T)];
std::memset (_mask, ~0, sizeof (_mask));
__builtin_clear_padding ((_T *) _mask);
which will have bits cleared for padding bits and set for bits taking part
in the value representation. Then libatomic could internally instead
of using memcmp compare
for (i = 0; i < N; i++) if ((val1[i] & mask[i]) != (val2[i] & mask[i]))
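A self-contained sketch of such a masked comparison follows. To keep it independent of compiler support for the new builtin, the mask is built by hand for one hypothetical struct instead of with __builtin_clear_padding; Sample, build_sample_mask, store_members and masked_equal are all made-up names:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical example type; on common ABIs there are padding bytes
// between `tag' and `value'.
struct Sample
{
  char tag;
  int value;
};

// Builds a mask with all bits set for bytes that belong to a member and
// all bits clear elsewhere.  Written by hand for Sample only.
void
build_sample_mask (unsigned char *mask)
{
  std::memset (mask, 0, sizeof (Sample));
  std::memset (mask + offsetof (Sample, tag), 0xff, sizeof (char));
  std::memset (mask + offsetof (Sample, value), 0xff, sizeof (int));
}

// Writes the two members into a raw buffer, leaving other bytes untouched.
void
store_members (unsigned char *bytes, char tag, int value)
{
  std::memcpy (bytes + offsetof (Sample, tag), &tag, sizeof tag);
  std::memcpy (bytes + offsetof (Sample, value), &value, sizeof value);
}

// Compares two object representations, ignoring every bit whose mask bit
// is clear.
bool
masked_equal (const unsigned char *val1, const unsigned char *val2,
	      const unsigned char *mask, std::size_t n)
{
  for (std::size_t i = 0; i < n; i++)
    if ((val1[i] & mask[i]) != (val2[i] & mask[i]))
      return false;
  return true;
}
```

Two buffers that hold the same member values but different garbage in the padding bytes compare equal under masked_equal, whereas memcmp could report a difference.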
2020-11-20 Jakub Jelinek <jakub@redhat.com>
PR libstdc++/88101
gcc/
* builtins.def (BUILT_IN_CLEAR_PADDING): New built-in function.
* gimplify.c (gimplify_call_expr): Rewrite single argument
BUILT_IN_CLEAR_PADDING into two-argument variant.
* gimple-fold.c (clear_padding_unit, clear_padding_buf_size): New
const variables.
(struct clear_padding_struct): New type.
(clear_padding_flush, clear_padding_add_padding,
clear_padding_emit_loop, clear_padding_type,
clear_padding_union, clear_padding_real_needs_padding_p,
clear_padding_type_may_have_padding_p,
gimple_fold_builtin_clear_padding): New functions.
(gimple_fold_builtin): Handle BUILT_IN_CLEAR_PADDING.
* doc/extend.texi (__builtin_clear_padding): Document.
gcc/c-family/
* c-common.c (check_builtin_function_arguments): Handle
BUILT_IN_CLEAR_PADDING.
gcc/testsuite/
* c-c++-common/builtin-clear-padding-1.c: New test.
* c-c++-common/torture/builtin-clear-padding-1.c: New test.
* c-c++-common/torture/builtin-clear-padding-2.c: New test.
* c-c++-common/torture/builtin-clear-padding-3.c: New test.
* c-c++-common/torture/builtin-clear-padding-4.c: New test.
* c-c++-common/torture/builtin-clear-padding-5.c: New test.
* g++.dg/torture/builtin-clear-padding-1.C: New test.
* g++.dg/torture/builtin-clear-padding-2.C: New test.
* gcc.dg/builtin-clear-padding-1.c: New test.
The documentation for POST_MODIFY says:
Currently, the compiler can only handle second operands of the
form (plus (reg) (reg)) and (plus (reg) (const_int)), where
the first operand of the PLUS has to be the same register as
the first operand of the *_MODIFY.
The following testcase ICEs, because combine just attempts to simplify
things and ends up with
(post_modify (reg1) (plus (mult (reg2) (const_int 4)) (reg1))
but the target predicates accept it, because they only verify
that POST_MODIFY's second operand is PLUS and the second operand
of the PLUS is a REG.
The following patch fixes this by performing further verification that
the POST_MODIFY is in the form it should be.
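The shape of the stricter check can be shown on a toy expression tree. This is not GCC's rtx representation nor the code in neon_vector_mem_operand; Code, Rtx, reg, const_int, binary and valid_post_modify are made-up names for the sketch:

```cpp
#include <memory>
#include <vector>

enum class Code { Reg, ConstInt, Plus, Mult, PostModify };

// Toy expression node: a code, a register number or constant, operands.
struct Rtx
{
  Code code;
  int value;
  std::vector<std::shared_ptr<Rtx>> ops;
};
using RtxP = std::shared_ptr<Rtx>;

RtxP reg (int n) { return std::make_shared<Rtx> (Rtx { Code::Reg, n, {} }); }
RtxP const_int (int v) { return std::make_shared<Rtx> (Rtx { Code::ConstInt, v, {} }); }
RtxP binary (Code c, RtxP a, RtxP b) { return std::make_shared<Rtx> (Rtx { c, 0, { a, b } }); }

// Accepts only (post_modify (reg R) (plus (reg R) X)) where X is a reg or
// a const_int, i.e. the documented form.
bool
valid_post_modify (const Rtx &x)
{
  if (x.code != Code::PostModify || x.ops.size () != 2)
    return false;
  const Rtx &base = *x.ops[0];
  const Rtx &step = *x.ops[1];
  if (base.code != Code::Reg || step.code != Code::Plus || step.ops.size () != 2)
    return false;
  const Rtx &first = *step.ops[0];
  const Rtx &second = *step.ops[1];
  // The first operand of the PLUS must be the same register as the base.
  if (first.code != Code::Reg || first.value != base.value)
    return false;
  return second.code == Code::Reg || second.code == Code::ConstInt;
}
```

With this check, the form produced by combine above, whose PLUS starts with a MULT, is rejected even though its second PLUS operand is a REG.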
2020-11-20 Jakub Jelinek <jakub@redhat.com>
PR target/97528
* config/arm/arm.c (neon_vector_mem_operand): For POST_MODIFY, require
first POST_MODIFY operand is a REG and is equal to the first operand
of PLUS.
* gcc.target/arm/pr97528.c: New test.
There is a loophole in new string store merging support added recently:
it does not check that the stores are consecutive, which is obviously
required if you want to concatenate them... Simple fix attached, the
nice thing being that it can fall back to the regular processing if
any hole is detected in the series of stores, thanks to the handling
of STRING_CST by native_encode_expr.
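The missing condition amounts to checking that every store starts where the previous one ends. A small sketch, not the code in gimple-ssa-store-merging.c; Store and stores_are_consecutive are made-up names, and the stores are assumed to be sorted by position already:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Position and size of one store, in bits.
struct Store
{
  std::uint64_t bitpos;
  std::uint64_t bitsize;
};

// True when each store begins exactly where the previous one ends, so
// that the stored values can be concatenated without holes.
bool
stores_are_consecutive (const std::vector<Store> &stores)
{
  for (std::size_t i = 1; i < stores.size (); i++)
    if (stores[i].bitpos != stores[i - 1].bitpos + stores[i - 1].bitsize)
      return false;
  return true;
}
```

When the function returns false, a caller would skip concatenation and fall back to the regular processing, as described above.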
gcc/ChangeLog:
* gimple-ssa-store-merging.c (struct merged_store_group): Add
new 'consecutive' field.
(merged_store_group): Set it to true.
(do_merge): Set it to false if the store is not consecutive and
set string_concatenation to false in this case.
(merge_into): Call do_merge on entry.
(merge_overlapping): Likewise.
gcc/testsuite/ChangeLog:
* gnat.dg/opt90a.adb: New test.
* gnat.dg/opt90b.adb: Likewise.
* gnat.dg/opt90c.adb: Likewise.
* gnat.dg/opt90d.adb: Likewise.
* gnat.dg/opt90e.adb: Likewise.
* gnat.dg/opt90a_pkg.ads: New helper.
* gnat.dg/opt90b_pkg.ads: Likewise.
* gnat.dg/opt90c_pkg.ads: Likewise.
* gnat.dg/opt90d_pkg.ads: Likewise.
* gnat.dg/opt90e_pkg.ads: Likewise.