gcc/ada/
* checks.adb (Apply_Accessibility_Check): Modify condition to
avoid flawed optimization and use Get_Accessibility over
Extra_Accessibility.
* exp_attr.adb: Remove inclusion of Exp_Ch2.adb.
* exp_ch2.adb, exp_ch2.ads (Param_Entity): Moved to sem_util.
* exp_ch3.ads (Init_Proc_Level_Formal): New function.
* exp_ch3.adb (Build_Init_Procedure): Add extra accessibility
formal for init procs when the associated type is a limited
record.
(Build_Initialization_Call): Add condition to handle propagation
of the new extra accessibility paramter actual needed for init
procs.
(Init_Proc_Level_Formal): Created to fetch a the extra
accessibility parameter associated with init procs if one
exists.
* exp_ch4.adb (Build_Attribute_Reference): Modify static check
to be dynamic.
* exp_ch6.adb (Add_Cond_Expression_Extra_Actual): Move logic
used to expand conditional expressions used as actuals for
anonymous access formals.
(Expand_Call_Helper): Remove extranious accessibility
calculation logic.
* exp_util.adb: Remove inclusion of Exp_Ch2.adb.
* par-ch3.adb (P_Array_Type_Definition): Properly set
Aliased_Present on access definitions
* sem_attr.adb (Resolve_Attribute): Replace instances for
Object_Access_Level with Static_Accessibility_Level.
* sem_ch13.adb (Storage_Pool): Replace instances for
Object_Access_Level with Static_Accessibility_Level.
* sem_ch6.adb (Check_Return_Construct_Accessibility): Replace
instances for Object_Access_Level with
Static_Accessibility_Level.
* sem_ch9.adb (Analyze_Requeue): Replace instances for
Object_Access_Level with Static_Accessibility_Level.
* sem_res.adb (Check_Aliased_Parameter,
Check_Allocator_Discrim_Accessibility, Valid_Conversion):
Replace instances for Object_Access_Level with
Static_Accessibility_Level.
* sem_util.adb, sem_util.ads (Accessibility_Level_Helper):
Created to centralize calculation of accessibility levels.
(Build_Component_Subtype): Replace instances for
Object_Access_Level with Static_Accessibility_Level.
(Defining_Entity): Add extra parameter to dictate whether an
error is raised or empty is return in the case of an irrelevant
N.
(Dynamic_Accessibility_Level): Rewritten to use
Accessibility_Level_Helper.
(Is_View_Conversion): Check membership against Etype to capture
nodes like explicit dereferences which have types but are not
expanded names or identifers.
(Object_Access_LeveL): Removed.
(Param_Entity): Moved from sem_util.
(Static_Accessibility_Level): Created as a replacement to
Object_Access_Level, it also uses Accessibility_Level_Helper for
its implementation.
* snames.ads-tmpl: Added new name for extra accessibility
parameter in init procs.
gcc/ada/
* sem_warn.adb (Check_Unused_Withs): Move local variables from
to a nested procedure; Lunit is passed as a parameter to
Check_System_Aux and its type is refined from Node_Id to
Entity_Id; Cnode is now a constant.
gcc/ada/
* libgnat/s-rident.ads (Profile_Info): Use a common profile
definition for Jorvik and GNAT Extended Ravenscar, using the
GNAT Extended Ravenscar definition.
gcc/ada/
* Makefile.rtl (64-bit platforms): Add GNATRTL_128BIT_PAIRS to
the LIBGNAT_TARGET_PAIRS list and also GNATRTL_128BIT_OBJS to
the EXTRA_GNATRTL_NONTASKING_OBJS list.
gcc/ada/
* sem_ch13.adb (Make_Aitem_Pragma): Turn into function. This
removes a side-effect on the Aitem variable.
(Analyze_Aspect_Specifications): Handle Suppress and Unsuppress
aspects differently from the Linker_Section aspect.
(Ceck_Aspect_At_Freeze_Point): Don't expect Suppress/Unsuppress
to be delayed anymore.
gcc/ada/
* sem_aggr.adb: (Resolve_Container_Aggregate): For an indexed
container, verify that expressions and component associations
are not both present.
* exp_aggr.adb: Code reorganization, additional comments.
(Expand_Container_Aggregate): Use Aggregate_Size for Iterated_
Component_Associations for indexed aggregates. If present, the
default value of the formal in the constructor function is used
when the size of the aggregate cannot be determined statically.
gcc/ada/
* sem_util.ads, sem_util.adb (Check_Ambiguous_Aggregate): When a
subprogram call is found to be ambiguous, check whether
ambiguity is caused by an aggregate actual. and indicate that
it should carry a type qualification.
* sem_ch4.adb (Traverse_Hoonyms, Try_Primitive_Operation): Call
it.
* sem_res.adb (Report_Ambiguous_Argument): Call it.
gcc/ada/
* sem_warn.adb (Check_One_Unit): Avoid repeated calls by using a
local variable Lunit; remove local constant Eitem, which was
identical to Lunit.
The earlier patch that introduced the wraplf variants missed the
x86*-vxworks* ports. This fixes them.
for gcc/ada/ChangeLog
* Makefile.rtl (LIBGNAT_TARGET_PAIRS) <x86*-vxworks*>: Select
nolibm and wraplf variants like other vxworks ports.
The sincos transformation does not take place on all platforms,
because the libc_has_function target hook disables it by default.
Current mingw-w64's math library supports sincos, sincosl and sincosf,
in 32- and 64-bit modes. I suppose this has been this way for long.
This patch enables the sincos optimization on this platform.
for gcc/ChangeLog
* config/i386/mingw-w64.h (TARGET_LIBC_HAS_FUNCTION): Enable
sincos optimization.
In the testcase below, we're ICEing during constexpr evaluation of the
CONSTRUCTOR {.data={{}, [1 ... 7]={}}} of type 'vector'. The interesting
thing about this CONSTRUCTOR is that it has a RANGE_EXPR index for an
element initializer which doesn't satisfy reduced_constant_expression_p
(because the field 't' is uninitialized).
This is a problem because init_subob_ctx currently punts on setting up a
sub-aggregate initialization context when given a RANGE_EXPR index, so
we later trip over the asserts in verify_ctor_sanity when recursing into
cxx_eval_bare_aggregate on this element initializer.
Fix this by making init_subob_ctx set up an appropriate initialization
context when supplied a RANGE_EXPR index.
gcc/cp/ChangeLog:
PR c++/97328
* constexpr.c (init_subob_ctx): Don't punt on RANGE_EXPR
indexes, instead build a sub-aggregate initialization context
with no subobject.
gcc/testsuite/ChangeLog:
PR c++/97328
* g++.dg/cpp2a/constexpr-init19.C: New test.
* g++.dg/cpp2a/constexpr-init20.C: New test.
In the testcase below, folding of the initializer for 'ret' inside the
instantiated f<lambda>::lambda ends up yielding an initializer for which
potential_constant_expression returns false. This causes finish_function
to mark the lambda as non-constexpr, which ultimately causes us to reject
'f(g)' as a call to a non-constexpr function.
The initializer for 'ret' inside f<lambda>::lambda, prior to folding, is
the CALL_EXPR
<lambda(S)>::operator() (&cb, ({}, <<< Unknown tree: empty_class_expr >>>;))
where the second argument is a COMPOUND_EXPR whose second operand is an
EMPTY_CLASS_EXPR that was formed by build_class_a. cp_fully_fold_init
is able to only partially fold this initializer: it gets rid of the
side-effectless COMPOUND_EXPR to obtain
<lambda(S)>::operator() (&cb, <<< Unknown tree: empty_class_expr >>>)
as the final initializer for 'ret'. This initializer no longer satifies
potential_constant_expression due to the bare EMPTY_CLASS_EXPR which is
not wrapped in a COMPOUND_EXPR.
(cp_fully_fold_init first tries maybe_constant_value on the original
CALL_EXPR, but constexpr evaluation punts upon seeing
__builtin_is_constant_evaluated, since manifestly_const_eval is false.)
To fix this, it seems we could either make cp_fold preserve the
COMPOUND_EXPR trees produced by build_call_a, or we could improve
the constexpr machinery to treat EMPTY_CLASS_EXPR trees as first-class
citizens. Assuming it's safe to continue folding away these
COMPOUND_EXPRs, the second approach seems cleaner, so this patch
implements the second approach.
gcc/cp/ChangeLog:
PR c++/96575
* constexpr.c (cxx_eval_constant_expression)
<case EMPTY_CLASS_EXPR>: Lower it to a CONSTRUCTOR.
(potential_constant_expression_1) <case COMPOUND_EXPR>: Remove
now-redundant handling of COMPOUND_EXPR with EMPTY_CLASS_EXPR
second operand.
<case EMPTY_CLASS_EXPR>: Return true instead of false.
gcc/testsuite/ChangeLog:
PR c++/96575
* g++.dg/cpp1z/constexpr-96575.C: New test.
This makes duplicate_decls differentiate a TYPE_DECL for an alias
template from a TYPE_DECL for one of its template parameters. The
recently added assert in template_parm_to_arg revealed this latent issue
because merging of the two TYPE_DECLs cleared the DECL_TEMPLATE_PARM_P
flag.
With this patch, we now also correctly diagnose the name shadowing in
the below testcase (as required by [temp.local]/6).
gcc/cp/ChangeLog:
PR c++/97511
* decl.c (duplicate_decls): Return NULL_TREE if
DECL_TEMPLATE_PARM_P differ.
gcc/testsuite/ChangeLog:
PR c++/97511
* g++.dg/template/shadow3.C: New test.
In preparation for a larger change this refactors vect_analyze_slp_instance
so it doesn't need to know a vector type early.
2020-10-22 Richard Biener <rguenther@suse.de>
* tree-vect-slp.c (vect_analyze_slp_instance): Refactor so
computing a vector type early is not needed, for store group
splitting compute a new vector type based on the desired
group size.
This fixes expansion of VECTOR_BOOLEAN_TYPE_P VECTOR_CSTs which
when using an integer mode are not always "mask-mode" but may
be using an integer mode when there's no supported vector mode.
The patch makes sure to only go the mask-mode expansion if
the elements do not line up to cover the full integer mode
(when they do and the mode was an actual mask-mode there's
no actual difference in both expansions).
2020-10-22 Richard Biener <rguenther@suse.de>
PR middle-end/97521
* expr.c (expand_expr_real_1): Be more careful when
expanding a VECTOR_BOOLEAN_TYPE_P VECTOR_CSTs.
* gcc.target/i386/pr97521.c: New testcase.
"make selftest-valgrind" was reporting:
40 bytes in 1 blocks are definitely lost in loss record 25 of 735
at 0x483AE7D: operator new(unsigned long) (vg_replace_malloc.c:344)
by 0xFA0CEA: selftest::test_insert_search_collapse() (ipa-modref-tree.c:40)
by 0xFA2F9B: selftest::ipa_modref_tree_c_tests() (ipa-modref-tree.c:164)
by 0x256E3AB: selftest::run_tests() (selftest-run-tests.c:93)
by 0x1366A8B: toplev::run_self_tests() (toplev.c:2385)
by 0x1366C47: toplev::main(int, char**) (toplev.c:2467)
by 0x263203F: main (main.c:39)
40 bytes in 1 blocks are definitely lost in loss record 26 of 735
at 0x483AE7D: operator new(unsigned long) (vg_replace_malloc.c:344)
by 0xFA264A: selftest::test_merge() (ipa-modref-tree.c:123)
by 0xFA2FA0: selftest::ipa_modref_tree_c_tests() (ipa-modref-tree.c:165)
by 0x256E3AB: selftest::run_tests() (selftest-run-tests.c:93)
by 0x1366A8B: toplev::run_self_tests() (toplev.c:2385)
by 0x1366C47: toplev::main(int, char**) (toplev.c:2467)
by 0x263203F: main (main.c:39)
40 bytes in 1 blocks are definitely lost in loss record 27 of 735
at 0x483AE7D: operator new(unsigned long) (vg_replace_malloc.c:344)
by 0xFA279E: selftest::test_merge() (ipa-modref-tree.c:130)
by 0xFA2FA0: selftest::ipa_modref_tree_c_tests() (ipa-modref-tree.c:165)
by 0x256E3AB: selftest::run_tests() (selftest-run-tests.c:93)
by 0x1366A8B: toplev::run_self_tests() (toplev.c:2385)
by 0x1366C47: toplev::main(int, char**) (toplev.c:2467)
by 0x263203F: main (main.c:39)
With this patch, the output is clean.
gcc/ChangeLog:
* ipa-modref-tree.c (selftest::test_insert_search_collapse): Fix
leak.
(selftest::test_merge): Fix leaks.
The S/390 backend does not define vec_cmp expanders so far. We relied
solely on expanding vcond. With commit 502d63b6d various testcases
started to ICE now.
This patch just adds the missing expanders to prevent the ICE.
However, there are still a couple of performance-related testcase
regressions with the vcond lowering which have to be fixed
independently.
gcc/ChangeLog:
PR target/97502
* config/s390/vector.md ("vec_cmp<VI_HW:mode><VI_HW:mode>")
("vec_cmpu<VI_HW:mode><VI_HW:mode>"): New expanders.
gcc/testsuite/ChangeLog:
* gcc.dg/pr97502.c: New test.
decimal_real_maxval misses to set the sign flag in the REAL_VALUE_TYPE.
gcc/ChangeLog:
PR rtl-optimization/97439
* dfp.c (decimal_real_maxval): Set the sign flag in the
generated number.
gcc/testsuite/ChangeLog:
* gcc.dg/dfp/pr97439.c: New test.
gcc/analyzer/ChangeLog:
PR analyzer/97514
* engine.cc (exploded_graph::add_function_entry): Handle failure
to create an enode, rather than asserting.
gcc/testsuite/ChangeLog:
PR analyzer/97514
* gcc.dg/analyzer/pr97514.c: New test.
gcc/analyzer/ChangeLog:
PR analyzer/97489
* engine.cc (exploded_graph::add_function_entry): Assert that we
have a function body.
(exploded_graph::on_escaped_function): Reject fndecls that don't
have a function body.
gcc/testsuite/ChangeLog:
PR analyzer/97489
* g++.dg/analyzer/pr97489.C: New test.
gcc/ChangeLog:
2020-05-15 Martin Liska <mliska@suse.cz>
* cfgexpand.c: Move the enum to ...
* coretypes.h (enum stack_protector): ... here.
* function.c (assign_parm_adjust_stack_rtl): Use the stack_protector
enum.
gcc/c-family/ChangeLog:
2020-05-15 Martin Liska <mliska@suse.cz>
* c-cppbuiltin.c (c_cpp_builtins): Use the stack_protector enum.
- Support expansion operator (*) in the multilib config string.
- Motivation of this patch is reduce the complexity when we deal multilib with
sub-extension, expand the combinations by hand would be very painful and
error prone, no one deserve to experience this[1] again!
[1] f4d7facafb/Makefile (L348)
gcc/ChangeLog:
* config/riscv/multilib-generator: Add TODO, import itertools
and functools.reduce.
Handle expantion operator.
(LONG_EXT_PREFIXES): New.
(arch_canonicalize): Update comment and improve python3
debuggability/compatibility.
(add_underline_prefix): New.
(_expand_combination): Ditto.
(unique): Ditto.
(expand_combination): Ditto.
> this broke sparc-sun-solaris2.11 bootstrap
>
> /vol/gcc/src/hg/master/local/gcc/tree-ssa-phiopt.c: In function 'bool cond_removal_in_popcount_clz_ctz_pattern(basic_block, basic_block, edge, edge, gimple*, tree, tree)':
> /vol/gcc/src/hg/master/local/gcc/tree-ssa-phiopt.c:1858:27: error: variable 'mode' set but not used [-Werror=unused-but-set-variable]
> 1858 | scalar_int_mode mode = SCALAR_INT_TYPE_MODE (TREE_TYPE (arg));
> | ^~~~
>
>
> and doubtlessly several other targets that use the defaults.h definition of
>
> #define CTZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) 0
Ugh, seems many of those macros do not evaluate the first argument.
This got broken by the change to direct_internal_fn_supported_p, previously
it used mode also in the optab test.
2020-10-22 Jakub Jelinek <jakub@redhat.com>
* tree-ssa-phiopt.c (cond_removal_in_popcount_clz_ctz_pattern):
For CLZ and CTZ tests, use type temporary instead of mode.
> + {"x86-64", PROCESSOR_K8, CPU_K8, PTA_X86_64_BASELINE, 0, P_NONE},
> + {"x86-64-v2", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V2 | PTA_NO_TUNE,
> + 0, P_NONE},
> + {"x86-64-v3", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V3 | PTA_NO_TUNE,
> + 0, P_NONE},
> + {"x86-64-v4", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V4 | PTA_NO_TUNE,
> + 0, P_NONE},
> {"eden-x2", PROCESSOR_K8, CPU_K8,
> PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_FXSR,
> 0, P_NONE},
I have noticed that one can't configure gcc to default to these.
I've also found various other 32-bit or 64-bit -march= arguments for which
it wasn't possible to configure gcc to default to those.
The x86-64-v* the patch only allows in --with-arch_64=, because otherwise
it fails build miserably - as
./xgcc -B ./ -S -march=x86-64-v2 -m32 test.c
cc1: error: ‘x86-64-v2’ architecture level is only defined for the x86-64 psABI
when building 32-bit multilibs. Even if multilibs are disallowed, I think
the compiler still supports -m32 and so --with-arch_64= seems to be the only
option in which we can support that.
2020-10-22 Jakub Jelinek <jakub@redhat.com>
* config.gcc (x86_archs): Add samuel-2, nehemiah, c7 and esther.
(x86_64_archs): Add eden-x2, nano, nano-1000, nano-2000, nano-3000,
nano-x2, eden-x4, nano-x4, x86-64-v2, x86-64-v3 and x86-64-v4.
(i[34567]86-*-* | x86_64-*-*): Only allow x86-64-v* as argument
to --with-arch_64=.
> Therefore, I think until omp_get_initial_device () value is changed, we
The following so far untested patch implements that change.
OpenMP 4.5 said for omp_get_initial_device:
The value of the device number is implementation defined. If it is between 0 and one less than
omp_get_num_devices() then it is valid for use with all device constructs and routines; if it is
outside that range, then it is only valid for use with the device memory routines and not in the
device clause.
and OpenMP 5.0 similarly, but OpenMP 5.1 says:
The value of the device number is the value returned by the omp_get_num_devices routine.
As the new value is compatible with what has been required earlier, I think
we can change it already now.
2020-10-22 Jakub Jelinek <jakub@redhat.com>
* icv.c (omp_get_initial_device): Remove including corresponding
ialias.
* icv-device.c (omp_get_initial_device): New function. Return
gomp_get_num_devices (). Add ialias.
* target.c (resolve_device): Don't fail with
OMP_TARGET_OFFLOAD=mandatory if device_id is equal to
gomp_get_num_devices ().
(omp_target_alloc, omp_target_free, omp_target_is_present,
omp_target_memcpy, omp_target_memcpy_rect, omp_target_associate_ptr,
omp_target_disassociate_ptr, omp_pause_resource): Use
gomp_get_num_devices () instead of GOMP_DEVICE_HOST_FALLBACK on the
first use in the functions, in uses dominated by the
gomp_get_num_devices call use num_devices_openmp instead.
* libgomp.texi (omp_get_initial_device): Document.
* config/gcn/icv-device.c (omp_get_initial_device): New function.
Add ialias.
* config/nvptx/icv-device.c (omp_get_initial_device): Likewise.
* testsuite/libgomp.c/target-40.c: New test.
Its libc does not offer *f or *l elementary functions, so rely on the
C double ones only.
for gcc/ada/ChangeLog
* Makefile.rtl (LIBGNAT_TARGET_PAIRS) <lynxos178>: Rely on
Aux_Long_Float for all real types.
Some acats-4 tests that check the precision of Float elementary
functions fail with vxworks 7.2's implementations of single-precision
math functions.
This patch arranges for us to bypass the single-precision functions,
and use the Aux_Long_Float implementation, based on the double-typed
calls from the C library, for Float and Short_Float.
for gcc/ada/ChangeLog
* Makefile.rtl (LIBGNAT_TARGET_PAIRS): Use Long Float-based
variant of Aux_Short_Float and Aux_Float on vxworks targets.
* libgnat/a-nashfl__wraplf.ads: New.
* libgnat/a-nuaufl__wraplf.ads: New.
Like aarch64-* and ppc*-linux-gnu, sparc*-sun-solaris has
Long_Long_Float mapped to double rather than long double, so the
intrinsics in the default version of a-nallfl.ads have mismatching
types. Adopt the wraplf workaround for it as well.
for gcc/ada/ChangeLog
* Makefile.rtl (LIBGNAT_TARGET_PAIRS) <sparc*-sun-solaris>:
Use wraplf version of a-nallfl.
Some platforms have failed to build because long long float is mapped
to double rather than long double, and then the attempts to import
intrinsics for long double in Aux_Long_Long_Float raise warnings
turned into errors.
This patch is a work around for the mismatch, arranging for
Aux_Long_Long_Float to map to Aux_Long_Float.
for gcc/ada/ChangeLog
* Makefile.rtl (LIBGNAT_TARGET_PAIRS): Use
a-nallfl__wraplf.ads on aarch64-* and ppc*-linux-gnu targets.
* libgnat/a-nallfl__wraplf.ads: New.
Only compile the __go_ptrace varargs shim on Linux to avoid compilation
failures on some other platforms. The C ptrace function is not entirely
portable (e.g., NetBSD has `int data` instead of `void* data`), and so
far Linux is the only platform that needs the varargs shim.
Additionally, make the types in the ptrace and raw_ptrace function
declarations match. This makes it more clear that the only difference
between the two is that calls via the former are allowed to block while
calls via the latter are not.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/263517
this patch moves nested function information out of symbol table (to a summary).
This saves memory (especially at WPA time) and also makes nested function
support more contained.
gcc/ChangeLog:
2020-10-22 Jan Hubicka <hubicka@ucw.cz>
* cgraph.c: Include tree-nested.h
(cgraph_node::create): Call maybe_record_nested_function.
(cgraph_node::remove): Do not remove function from nested function
infos.
(cgraph_node::dump): Update.
(cgraph_node::unnest): Move to tree-nested.c
(cgraph_node::verify_node): Update.
(cgraph_c_finalize): Call nested_function_info::release.
* cgraph.h (struct symtab_node): Remove nested function info.
* cgraphclones.c (cgraph_node::create_clone): Do not clone nested
function info.
* cgraphunit.c (cgraph_node::analyze): Update.
(cgraph_node::expand): Do not worry about nested functions; they are
lowered.
(symbol_table::finalize_compilation_unit): Call
nested_function_info::release.
* gimplify.c: Include tree-nested.h
(unshare_body): Update.
(unvisit_body): Update.
* omp-offload.c (omp_discover_implicit_declare_target): Update.
* tree-nested.c: Include alloc-pool.h, tree-nested.h, symbol-summary.h
(nested_function_sum): New static variable.
(nested_function_info::get): New member function.
(nested_function_info::get_create): New member function.
(unnest_function): New function.
(nested_function_info::~nested_function_info): New member function.
(nested_function_info::release): New function.
(maybe_record_nested_function): New function.
(lookup_element_for_decl): Update.
(check_for_nested_with_variably_modified): Update.
(create_nesting_tree): Update.
(unnest_nesting_tree_1): Update.
(gimplify_all_functions): Update.
(lower_nested_functions): Update.
* tree-nested.h (class nested_function_info): New class.
(maybe_record_nested_function): Declare.
(unnest_function): Declare.
(first_nested_function): New inline function.
(next_nested_function): New inline function.
(nested_function_origin): New inline function.
gcc/ada/ChangeLog:
2020-10-22 Jan Hubicka <hubicka@ucw.cz>
* gcc-interface/trans.c: Include tree-nested.h
(walk_nesting_tree): Update for new nested function info.
gcc/c-family/ChangeLog:
2020-10-22 Jan Hubicka <hubicka@ucw.cz>
* c-gimplify.c: Include tree-nested.h
(c_genericize): Update for new nested function info.
gcc/d/ChangeLog:
2020-10-22 Jan Hubicka <hubicka@ucw.cz>
* decl.cc: Include tree-nested.h
(get_symbol_decl): Update for new nested function info.
gcc/ChangeLog
PR rtl-optimization/97249
* simplify-rtx.c (simplify_binary_operation_1): Simplify
vec_select of a subreg of X to a vec_select of X.
gcc/testsuite/ChangeLog
* gcc.target/i386/pr97249-1.c: New test.
For operand with special_memory_constraint, there could be a wrapper
for memory_operand. Extract mem for operand for conditional judgement
like MEM_P, also for record_address_regs.
gcc/ChangeLog:
PR target/87767
* ira-costs.c (record_operand_costs): Extract memory operand
from recog_data.operand[i] for record_address_regs.
(record_reg_classes): Extract memory operand from OP for
conditional judgement MEM_P.
* ira.c (ira_setup_alts): Ditto.
* lra-constraints.c (extract_mem_from_operand): New function.
(satisfies_memory_constraint_p): Extract memory operand from
OP for decompose_mem_address, return false when there's no
memory operand inside OP.
(process_alt_operands): Remove MEM_P (op) since it would be
judged in satisfies_memory_constraint_p.
* recog.c (asm_operand_ok): Extract memory operand from OP for
judgement of memory_operand (OP, VOIDmode).
(constrain_operands): Don't unwrapper unary operator when
there's memory operand inside.
* rtl.h (extract_mem_from_operand): New decl.
This patch enables MVE vmin/vmax instructions for auto-vectorization.
MVE target is included in expander smin<mode>3, umin<mode>3, smax<mode>3
and umax<mode>3 for vectorization. Related insns for vmin/vmax in mve.md
are modified to use smin, umin, smax and umax expressions instead of
unspec to support the expanders.
gcc/ChangeLog:
2020-10-22 Dennis Zhang <dennis.zhang@arm.com>
* config/arm/mve.md (mve_vmaxq_<supf><mode>): Replace with ...
(mve_vmaxq_s<mode>, mve_vmaxq_u<mode>): ... these new insns to
use smax/umax instead of VMAXQ.
(mve_vminq_<supf><mode>): Replace with ...
(mve_vminq_s<mode>, mve_vminq_u<mode>): ... these new insns to
use smin/umin instead of VMINQ.
(mve_vmaxnmq_f<mode>): Use smax instead of VMAXNMQ_F.
(mve_vminnmq_f<mode>): Use smin instead of VMINNMQ_F.
* config/arm/vec-common.md (smin<mode>3): Use the new mode macros
ARM_HAVE_<MODE>_ARITH.
(umin<mode>3, smax<mode>3, umax<mode>3): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/arm/simd/mve-vminmax_1.c: New test.
when processing assignments, we were using the type of b instead of type
of &b when computing a range. This was usually filtered out by FRE.
turning it off exposed it.
gcc/
PR tree-optimization/97520
* gimple-range.cc (range_of_non_trivial_assignment): Handle x = &a
by returning a non-zero range.
gcc/testsuite/
* gcc.dg/pr97520.c: New.
This patch enables MVE vmul instructions for auto-vectorization.
It includes MVE in expander mul<mode>3 to enable vectorization for MVE.
Related MVE vmul insns are modified to support the expander by using
expression 'mult' instead of unspec.
The mul<mode>3 for vectorization in vec-common.md uses mode iterator
VDQWH instead of VALLW to cover all supported modes.
The macros ARM_HAVE_NEON_<MODE>_ARITH are used to select supported
modes for different targets.
The redundant mul<mode>3 in neon.md is removed.
gcc/ChangeLog:
2020-10-22 Dennis Zhang <dennis.zhang@arm.com>
* config/arm/mve.md (mve_vmulq<mode>): New entry for vmul instruction
using expression 'mult'.
(mve_vmulq_f<mode>): Use mult instead of VMULQ_F.
* config/arm/neon.md (mul<mode>3): Removed.
* config/arm/vec-common.md (mul<mode>3): Use the new mode macros
ARM_HAVE_<MODE>_ARITH. Use mode iterator VDQWH instead of VALLW.
gcc/testsuite/ChangeLog:
* gcc.target/arm/simd/mve-vmul_1.c: New test.
Don't return UNDEFINED for a range in an unreachable block if the global
value evaluates to a constant. Return that constant instead.
PR tree-optimization/97515
* value-query.cc (range_query::value_of_expr): If the result is
UNDEFINED, check to see if the global value is a constant.
(range_query::value_on_edge): Ditto.