OpenE2K/gcc - gcc - Expired Mentality Git

Author	SHA1	Message	Date
Stafford Horne	1bac7d31a1	or1k: Fix clobbering of _mcount argument if fPIC is enabled Recently we changed the PROFILE_HOOK _mcount call to pass in the link register as an argument. This actually does not work when the _mcount call uses a PLT because the GOT register setup code ends up getting inserted before the PROFILE_HOOK and clobbers the link register argument. These glibc tests are failing: gmon/tst-gmon-pie-gprof gmon/tst-gmon-static-gprof This patch fixes this by saving the instruction that stores the Link Register to the _mcount argument and then inserts the GOT register setup instructions after that. For example: main.c: extern int e; int f2(int a) { return a + e; } int f1(int a) { return f2 (a + a); } int main(int argc, char ** argv) { return f1 (argc); } Compiled: or1k-smh-linux-gnu-gcc -Wall -c -O2 -fPIC -pg -S main.c Before Fix: main: l.addi r1, r1, -16 l.sw 8(r1), r2 l.sw 0(r1), r16 l.addi r2, r1, 16 # Keeping FP, but not needed l.sw 4(r1), r18 l.sw 12(r1), r9 l.jal 8 # GOT Setup clobbers r9 (Link Register) l.movhi r16, gotpchi(_GLOBAL_OFFSET_TABLE_-4) l.ori r16, r16, gotpclo(_GLOBAL_OFFSET_TABLE_+0) l.add r16, r16, r9 l.or r18, r3, r3 l.or r3, r9, r9 # This is not the original LR l.jal plt(_mcount) l.nop l.jal plt(f1) l.or r3, r18, r18 l.lwz r9, 12(r1) l.lwz r16, 0(r1) l.lwz r18, 4(r1) l.lwz r2, 8(r1) l.jr r9 l.addi r1, r1, 16 After the fix: main: l.addi r1, r1, -12 l.sw 0(r1), r16 l.sw 4(r1), r18 l.sw 8(r1), r9 l.or r18, r3, r3 l.or r3, r9, r9 # We now have r9 (LR) set early l.jal 8 # Clobbers r9 (Link Register) l.movhi r16, gotpchi(_GLOBAL_OFFSET_TABLE_-4) l.ori r16, r16, gotpclo(_GLOBAL_OFFSET_TABLE_+0) l.add r16, r16, r9 l.jal plt(_mcount) l.nop l.jal plt(f1) l.or r3, r18, r18 l.lwz r9, 8(r1) l.lwz r16, 0(r1) l.lwz r18, 4(r1) l.jr r9 l.addi r1, r1, 12 Fixes: `308531d148` ("or1k: Add return address argument to _mcount call") gcc/ChangeLog: * config/or1k/or1k-protos.h (or1k_profile_hook): New function. 
* config/or1k/or1k.h (PROFILE_HOOK): Change macro to reference new function or1k_profile_hook. * config/or1k/or1k.c (struct machine_function): Add new field set_mcount_arg_insn. (or1k_profile_hook): New function. (or1k_init_pic_reg): Update to inject pic rtx after _mcount arg when profiling. (or1k_frame_pointer_required): Frame pointer no longer needed when profiling.	2021-11-13 07:58:00 +09:00
Jan Hubicka	4d2d5565a0	Fix wrong code with pure functions I introduced a bug into find_func_aliases_for_call in handling pure functions. Pure functions were treated as writing global memory instead of reading it. This results in misoptimization of the testcase at -O1. The change to pta-callused.c updates the template for the new behaviour of the constraint generation. We copy nonlocal memory to calluse, which is correct but also not strictly necessary because later we take care to add the nonlocal_p flag manually. gcc/ChangeLog: PR tree-optimization/103209 * tree-ssa-structalias.c (find_func_aliases_for_call): Fix use of handle_rhs_call. gcc/testsuite/ChangeLog: PR tree-optimization/103209 * gcc.dg/tree-ssa/pta-callused.c: Update template. * gcc.c-torture/execute/pr103209.c: New test.	2021-11-12 23:55:50 +01:00
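To make the pure-function distinction above concrete, here is a minimal sketch. It is not the pr103209.c testcase; the names `counter`, `read_counter` and `store_then_read` are made up for illustration. A function marked `pure` may read global memory, so a store that precedes the call must stay visible to it.

```cpp
#include <cassert>

// Global memory that the pure function reads (hypothetical name).
int counter = 0;

// `pure` promises no side effects, but reads of global memory are allowed.
// `noinline` keeps the call visible to the optimizer as a real call.
__attribute__((pure, noinline)) int read_counter() { return counter; }

// The store to `counter` must not be dropped or moved after the call,
// because the pure callee may observe it.
int store_then_read(int value) {
  counter = value;
  return read_counter();
}

// Small self-check, intended to hold at any optimization level.
void self_check() {
  assert(store_then_read(7) == 7);
  assert(store_then_read(9) == 9);
}
```

If an alias analysis modelled `read_counter` as writing rather than reading global memory, it could mis-handle exactly this kind of store-then-call sequence.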
Aldy Hernandez	264f061997	path solver: Solve PHI imports first for ranges. PHIs must be resolved first while solving ranges in a block, regardless of where they appear in the import bitmap. We went through a similar exercise for the relational code, but missed these. Tested on x86-64 & ppc64le Linux. gcc/ChangeLog: PR tree-optimization/103202 * gimple-range-path.cc (path_range_query::compute_ranges_in_block): Solve PHI imports first.	2021-11-12 20:42:56 +01:00
Jan Hubicka	b301cb43a7	Fix ipa-pure-const gcc/ChangeLog: * ipa-pure-const.c (propagate_pure_const): Remove redundant check; fix call of ipa_make_function_const and ipa_make_function_pure.	2021-11-12 20:15:48 +01:00
David Malcolm	72f1c1c452	analyzer: "__analyzer_dump_state" has no side-effects gcc/analyzer/ChangeLog: * engine.cc (exploded_node::on_stmt_pre): Return when handling "__analyzer_dump_state". Signed-off-by: David Malcolm <dmalcolm@redhat.com>	2021-11-12 14:01:36 -05:00
Richard Sandiford	87fcff96db	aarch64: Remove redundant costing code Previous patches made some of the complex parts of the issue rate code redundant. gcc/ * config/aarch64/aarch64.c (aarch64_vector_op::n_advsimd_ops): Delete. (aarch64_vector_op::m_seen_loads): Likewise. (aarch64_vector_costs::aarch64_vector_costs): Don't push to m_advsimd_ops. (aarch64_vector_op::count_ops): Remove vectype and factor parameters. Remove code that tries to predict different vec_flags from the current loop's. (aarch64_vector_costs::add_stmt_cost): Update accordingly. Remove m_advsimd_ops handling.	2021-11-12 17:33:03 +00:00
Richard Sandiford	c6c5c5ebae	aarch64: Use new hooks for vector comparisons Previously we tried to account for the different issue rates of the various vector modes by guessing what the Advanced SIMD version of an SVE loop would look like and what its issue rate was likely to be. We'd then increase the cost of the SVE loop if the Advanced SIMD loop might issue more quickly. This patch moves that logic to better_main_loop_than_p, so that we can compare loops side-by-side rather than having to guess. This also means we can apply the issue rate heuristics to any vector loop comparison, rather than just weighting SVE vs. Advanced SIMD. The actual heuristics are otherwise unchanged. We're just applying them in a different place. gcc/ * config/aarch64/aarch64.c (aarch64_vector_costs::m_saw_sve_only_op) (aarch64_sve_only_stmt_p): Delete. (aarch64_vector_costs::prefer_unrolled_loop): New function, extracted from adjust_body_cost. (aarch64_vector_costs::better_main_loop_than_p): New function, using heuristics extracted from adjust_body_cost and adjust_body_cost_sve. (aarch64_vector_costs::adjust_body_cost_sve): Remove advsimd_cycles_per_iter and could_use_advsimd parameters. Update after changes above. (aarch64_vector_costs::adjust_body_cost): Update after changes above.	2021-11-12 17:33:03 +00:00
Richard Sandiford	2e1886ea06	aarch64: Add vf_factor to aarch64_vec_op_count -mtune=neoverse-512tvb sets the likely SVE vector length to 128 bits, but it also takes into account Neoverse V1, which is a 256-bit target. This patch adds this VF (VL) factor to aarch64_vec_op_count. gcc/ * config/aarch64/aarch64.c (aarch64_vec_op_count::m_vf_factor): New member variable. (aarch64_vec_op_count::aarch64_vec_op_count): Add a parameter for it. (aarch64_vec_op_count::vf_factor): New function. (aarch64_vector_costs::aarch64_vector_costs): When costing for neoverse-512tvb, pass a vf_factor of 2 for the Neoverse V1 version of an SVE loop. (aarch64_vector_costs::adjust_body_cost): Read the vf factor instead of hard-coding 2.	2021-11-12 17:33:02 +00:00
Richard Sandiford	a82ffd4361	aarch64: Move cycle estimation into aarch64_vec_op_count This patch just moves the main cycle estimation routines into aarch64_vec_op_count. gcc/ * config/aarch64/aarch64.c (aarch64_vec_op_count::rename_cycles_per_iter): New function. (aarch64_vec_op_count::min_nonpred_cycles_per_iter): Likewise. (aarch64_vec_op_count::min_pred_cycles_per_iter): Likewise. (aarch64_vec_op_count::min_cycles_per_iter): Likewise. (aarch64_vec_op_count::dump): Move earlier in file. Dump the above properties too. (aarch64_estimate_min_cycles_per_iter): Delete. (adjust_body_cost): Use aarch64_vec_op_count::min_cycles_per_iter instead of aarch64_estimate_min_cycles_per_iter. Rely on the dump routine to print CPI estimates. (adjust_body_cost_sve): Likewise. Use the other functions above instead of doing the work inline.	2021-11-12 17:33:02 +00:00
Richard Sandiford	1a5288fe3d	aarch64: Use an array of aarch64_vec_op_counts -mtune=neoverse-512tvb uses two issue rates, one for Neoverse V1 and one with more generic parameters. We use both rates when making a choice between scalar, Advanced SIMD and SVE code. Previously we calculated the Neoverse V1 issue rates from the more generic issue rates, but by removing m_scalar_ops and (later) m_advsimd_ops, it becomes easier to track multiple issue rates directly. This patch therefore converts m_ops and (temporarily) m_advsimd_ops into arrays. gcc/ * config/aarch64/aarch64.c (aarch64_vec_op_count): Allow default initialization. (aarch64_vec_op_count::base_issue_info): Remove handling of null issue_infos. (aarch64_vec_op_count::simd_issue_info): Likewise. (aarch64_vec_op_count::sve_issue_info): Likewise. (aarch64_vector_costs::m_ops): Turn into a vector. (aarch64_vector_costs::m_advsimd_ops): Likewise. (aarch64_vector_costs::aarch64_vector_costs): Add entries to the vectors based on aarch64_tune_params. (aarch64_vector_costs::analyze_loop_vinfo): Update the pred_ops of all entries in m_ops. (aarch64_vector_costs::add_stmt_cost): Call count_ops for all entries in m_ops. (aarch64_estimate_min_cycles_per_iter): Remove issue_info parameter and get the information from the ops instead. (aarch64_vector_costs::adjust_body_cost_sve): Take a aarch64_vec_issue_info instead of a aarch64_vec_op_count. (aarch64_vector_costs::adjust_body_cost): Update call accordingly. Exit earlier if m_ops is empty for either cost structure.	2021-11-12 17:33:02 +00:00
Richard Sandiford	6756706ea6	aarch64: Use real scalar op counts Now that vector finish_costs is passed the associated scalar costs, we can record the scalar issue information while computing the scalar costs, rather than trying to estimate it while computing the vector costs. This simplifies things a little, but the main motivation is to improve accuracy. gcc/ * config/aarch64/aarch64.c (aarch64_vector_costs::m_scalar_ops) (aarch64_vector_costs::m_sve_ops): Replace with... (aarch64_vector_costs::m_ops): ...this. (aarch64_vector_costs::analyze_loop_vinfo): Update accordingly. (aarch64_vector_costs::adjust_body_cost_sve): Likewise. (aarch64_vector_costs::aarch64_vector_costs): Likewise. Initialize m_vec_flags here rather than in add_stmt_cost. (aarch64_vector_costs::count_ops): Test for scalar reductions too. Allow vectype to be null. (aarch64_vector_costs::add_stmt_cost): Call count_ops for scalar code too. Don't require vectype to be nonnull. (aarch64_vector_costs::adjust_body_cost): Take the loop_vec_info and scalar costs as parameters. Use the scalar costs to determine the cycles per iteration of the scalar loop, then multiply it by the estimated VF. (aarch64_vector_costs::finish_cost): Update call accordingly.	2021-11-12 17:33:01 +00:00
Richard Sandiford	902b7c9e18	aarch64: Get floatness from stmt_info This patch gets the floatness of a memory access from the data reference rather than the vectype. This makes it more suitable for use in scalar costing code. gcc/ * config/aarch64/aarch64.c (aarch64_dr_type): New function. (aarch64_vector_costs::count_ops): Use it rather than the vectype to determine floatness.	2021-11-12 17:33:01 +00:00
Richard Sandiford	26122469df	aarch64: Remove vectype from latency tests This patch gets the scalar mode of a reduction operation from the gimple stmt rather than the vectype. This makes it more suitable for use in scalar costs. gcc/ * config/aarch64/aarch64.c (aarch64_sve_in_loop_reduction_latency): Remove vectype parameter and get floatness from the type of the stmt lhs instead. (aarch64_in_loop_reduction_latency): Likewise. (aarch64_detect_vector_stmt_subtype): Update caller. (aarch64_vector_costs::count_ops): Likewise.	2021-11-12 17:33:00 +00:00
Richard Sandiford	15aba5a67c	aarch64: Fold aarch64_sve_op_count into aarch64_vec_op_count Later patches make aarch64 use the new vector hooks. We then only need to track one set of ops for each aarch64_vector_costs structure. This in turn means that it's more convenient to merge aarch64_sve_op_count and aarch64_vec_op_count. The patch also adds issue info and vec flags to aarch64_vec_op_count, so that the structure is more self-descriptive. This simplifies some things later. gcc/ * config/aarch64/aarch64.c (aarch64_sve_op_count): Fold into... (aarch64_vec_op_count): ...this. Add a constructor. (aarch64_vec_op_count::vec_flags): New function. (aarch64_vec_op_count::base_issue_info): Likewise. (aarch64_vec_op_count::simd_issue_info): Likewise. (aarch64_vec_op_count::sve_issue_info): Likewise. (aarch64_vec_op_count::m_issue_info): New member variable. (aarch64_vec_op_count::m_vec_flags): Likewise. (aarch64_vector_costs): Add a constructor. (aarch64_vector_costs::m_sve_ops): Change type to aarch64_vec_op_count. (aarch64_vector_costs::aarch64_vector_costs): New function. Initialize m_scalar_ops, m_advsimd_ops and m_sve_ops. (aarch64_vector_costs::count_ops): Remove vec_flags and issue_info parameters, using the new aarch64_vec_op_count functions instead. (aarch64_vector_costs::add_stmt_cost): Update call accordingly. (aarch64_sve_op_count::dump): Fold into... (aarch64_vec_op_count::dump): ..here.	2021-11-12 17:33:00 +00:00
Richard Sandiford	526e1639aa	aarch64: Detect more consecutive MEMs For tests like: int res[2]; void f1 (int x, int y) { res[0] = res[1] = x + y; } we generated: add w0, w0, w1 adrp x1, .LANCHOR0 add x2, x1, :lo12:.LANCHOR0 str w0, [x1, #:lo12:.LANCHOR0] str w0, [x2, 4] ret Using [x1, #:lo12:.LANCHOR0] for the first store prevented the two stores being recognised as a pair. However, the MEM_EXPR and MEM_OFFSET information tells us that the MEMs really are consecutive. The peephole2 context then guarantees that the first address is equivalent to [x2, 0]. While there: the reg_mentioned_p tests for loads were probably correct, but seemed a bit indirect. We're matching two consecutive loads, so the thing we need to test is that the second MEM in the original sequence doesn't depend on the result of the first load in the original sequence. gcc/ * config/aarch64/aarch64.c: Include tree-dfa.h. (aarch64_check_consecutive_mems): New function that takes MEM_EXPR and MEM_OFFSET into account. (aarch64_swap_ldrstr_operands): Use it. (aarch64_operands_ok_for_ldpstp): Likewise. Check that the address of the second memory doesn't depend on the result of the first load. gcc/testsuite/ * gcc.target/aarch64/stp_1.c: New test.	2021-11-12 17:33:00 +00:00
Tobias Burnus	48c6cac9ca	Fortran/openmp: Fix '!$omp end' gcc/fortran/ChangeLog: * parse.c (decode_omp_directive): Fix permitting 'nowait' for some combined directives, add missing 'omp end ... loop'. (gfc_ascii_statement): Fix ST_OMP_END_TEAMS_LOOP result. * openmp.c (resolve_omp_clauses): Add missing combined loop constructs case values to the 'if(directive-name: ...)' check. * trans-openmp.c (gfc_split_omp_clauses): Put nowait on target if first leaf construct accepting it. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/unexpected-end.f90: Update dg-error. * gfortran.dg/gomp/clauses-1.f90: New test. * gfortran.dg/gomp/nowait-2.f90: New test. * gfortran.dg/gomp/nowait-3.f90: New test.	2021-11-12 17:58:21 +01:00
Jan Hubicka	82de09ab17	Fix exit condition in ipa_make_function_pure gcc/ChangeLog: * ipa-pure-const.c (ipa_make_function_pure): Fix exit condition.	2021-11-12 16:54:29 +01:00
Jan Hubicka	4526ec20f1	Fix ICE in tree-ssa-structalias.c PR tree-optimization/103175 * ipa-modref.c (modref_lattice::merge): Add sanity check. (callee_to_caller_flags): Make flags adjustment sane. (modref_eaf_analysis::analyze_ssa_name): Likewise.	2021-11-12 16:35:01 +01:00
Jakub Jelinek	f49c7a4fb2	libgomp: Unbreak gcn offload build My recent libgomp change apparently broke libgomp build for gcn offloading. The problem is that gcn, unlike nvptx, doesn't override teams.c source file and the patch I've committed assumed all the non-LIBGOMP_USE_PTHREADS targets do not use it. My understanding is that gcn included omp_get_num_teams and omp_get_team_num definitions in both icv-device.o and teams.o, with the definitions only in the former working correctly. This patch brings gcn into sync with how nvptx does it, that teams.c is overridden, provides a dummy GOMP_teams_reg and omp_get_{num_teams,team_num} definitions and icv-device.c doesn't provide those. 2021-11-12 Jakub Jelinek <jakub@redhat.com> PR target/103201 * config/gcn/icv-device.c (omp_get_num_teams, omp_get_team_num): Move to ... * config/gcn/teams.c: ... here. New file.	2021-11-12 16:11:02 +01:00
Martin Jambor	847f587dc4	Fortran: Use build_debug_expr_decl to create DEBUG_DECL_EXPRs This patch converts one more open coded construction of a DEBUG_EXPR_DECL to a call of build_debug_expr_decl that I missed in my previous patch because it happens to be in the Fortran front-end. gcc/fortran/ChangeLog: 2021-11-11 Martin Jambor <mjambor@suse.cz> * trans-types.c (gfc_get_array_descr_info): Use build_debug_expr_decl instead of building DEBUG_EXPR_DECL manually.	2021-11-12 15:46:05 +01:00
Martin Liska	6849c71c06	testsuite: Filter out TSVC test on Power [PR103051] PR testsuite/103051 gcc/testsuite/ChangeLog: * gcc.dg/vect/tsvc/vect-tsvc-s112.c: Skip test for old Power CPUs.	2021-11-12 15:24:01 +01:00
Martin Liska	83310a08a2	libbacktrace: fix UBSAN issues Fix issues mentioned in the PR. PR libbacktrace/103167 libbacktrace/ChangeLog: * elf.c (elf_uncompress_lzma_block): Cast to unsigned int. (elf_uncompress_lzma): Likewise. * xztest.c (test_samples): memcpy only if v > 0.	2021-11-12 15:06:12 +01:00
David Malcolm	aa1fd30df5	jit: fix -Werror=format-overflow= in testsuite [PR103199] gcc/jit/ChangeLog: PR jit/103199 * docs/examples/tut04-toyvm/toyvm.c (toyvm_function_compile): Increase size of buffer. * docs/examples/tut04-toyvm/toyvm.cc (compilation_state::create_function): Likewise. Signed-off-by: David Malcolm <dmalcolm@redhat.com>	2021-11-12 08:20:45 -05:00
Jan Hubicka	1b62cddcf0	Fix ipa-modref pure/const discovery PR ipa/103200 * ipa-modref.c (analyze_function, modref_propagate_in_scc): Do not mark pure/const function if there are side-effects.	2021-11-12 14:01:17 +01:00
Chung-Lin Tang	b7e2048063	openmp: Relax handling of implicit map vs. existing device mappings This patch implements relaxing the requirements when a map with the implicit attribute encounters an overlapping existing map. As the OpenMP 5.0 spec describes on page 320, lines 18-27 (and 5.1 spec, page 352, lines 13-22): "If a single contiguous part of the original storage of a list item with an implicit data-mapping attribute has corresponding storage in the device data environment prior to a task encountering the construct that is associated with the map clause, only that part of the original storage will have corresponding storage in the device data environment as a result of the map clause." 2021-11-12 Chung-Lin Tang <cltang@codesourcery.com> include/ChangeLog: * gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_3): Define special bit macro. (GOMP_MAP_IMPLICIT): New special map kind bits value. (GOMP_MAP_FLAG_SPECIAL_BITS): Define helper mask for whole set of special map kind bits. (GOMP_MAP_IMPLICIT_P): New predicate macro for implicit map kinds. gcc/ChangeLog: * tree.h (OMP_CLAUSE_MAP_RUNTIME_IMPLICIT_P): New access macro for 'implicit' bit, using 'base.deprecated_flag' field of tree_node. * tree-pretty-print.c (dump_omp_clause): Add support for printing implicit attribute in tree dumping. * gimplify.c (gimplify_adjust_omp_clauses_1): Set OMP_CLAUSE_MAP_RUNTIME_IMPLICIT_P to 1 if map clause is implicitly created. (gimplify_adjust_omp_clauses): Adjust place of adding implicitly created clauses, from simple append, to starting of list, after non-map clauses. * omp-low.c (lower_omp_target): Add GOMP_MAP_IMPLICIT bits into kind values passed to libgomp for implicit maps. gcc/testsuite/ChangeLog: * c-c++-common/gomp/target-implicit-map-1.c: New test. * c-c++-common/goacc/combined-reduction.c: Adjust scan test pattern. * c-c++-common/goacc/firstprivate-mappings-1.c: Likewise. * c-c++-common/goacc/mdc-1.c: Likewise. * g++.dg/goacc/firstprivate-mappings-1.C: Likewise. 
libgomp/ChangeLog: * target.c (gomp_map_vars_existing): Add 'bool implicit' parameter, add implicit map handling to allow a "superset" existing map as valid case. (get_kind): Adjust to filter out GOMP_MAP_IMPLICIT bits in return value. (get_implicit): New function to extract implicit status. (gomp_map_fields_existing): Adjust arguments in calls to gomp_map_vars_existing, and add uses of get_implicit. (gomp_map_vars_internal): Likewise. * testsuite/libgomp.c-c++-common/target-implicit-map-1.c: New test.	2021-11-12 20:29:48 +08:00
Jonathan Wakely	a54ce8865a	libstdc++: Print assertion messages to stderr [PR59675] This replaces the printf used by failed debug assertions with fprintf, so we can write to stderr. To avoid including <stdio.h> the assert function is moved into the library. To avoid programs using a vague linkage definition of the old inline function, the function is renamed. Code compiled with old versions of GCC might still call the old function, but code compiled with the newer GCC will call the new function and write to stderr. libstdc++-v3/ChangeLog: PR libstdc++/59675 * acinclude.m4 (libtool_VERSION): Bump version. * config/abi/pre/gnu.ver (GLIBCXX_3.4.30): Add version and export new symbol. * configure: Regenerate. * include/bits/c++config (__replacement_assert): Remove, declare __glibcxx_assert_fail instead. * src/c++11/debug.cc (__glibcxx_assert_fail): New function to replace __replacement_assert, writing to stderr instead of stdout. * testsuite/util/testsuite_abi.cc: Update latest version.	2021-11-12 12:23:10 +00:00
Mikael Morin	68d62cb206	fortran: Ignore unused args in scalarization [PR97896] The KIND argument of the INDEX intrinsic is a compile time constant that is used at compile time only to resolve to a kind-specific library function. That argument is otherwise completely ignored at runtime, and there is no code generated for it as the library procedure has no kind argument. This confuses the scalarizer which expects to see every argument of elemental functions used when calling a procedure. This change removes the argument from the scalarization lists at the beginning of the scalarization process, so that the argument is completely ignored. This also reverts the existing workaround (commit `d09847357b` except for its testcase). PR fortran/97896 gcc/fortran/ChangeLog: * intrinsic.c (add_sym_4ind): Remove. (add_functions): Use add_sym4 instead of add_sym4ind. Don’t special case the index intrinsic. * iresolve.c (gfc_resolve_index_func): Use the individual arguments directly instead of the full argument list. * intrinsic.h (gfc_resolve_index_func): Update the declaration accordingly. * trans-decl.c (gfc_get_extern_function_decl): Don’t modify the list of arguments in the case of the index intrinsic. * trans-array.h (gfc_get_intrinsic_for_expr, gfc_get_proc_ifc_for_expr): New. * trans-array.c (gfc_get_intrinsic_for_expr, arg_evaluated_for_scalarization): New. (gfc_walk_elemental_function_args): Add intrinsic procedure as argument. Count arguments. Check arg_evaluated_for_scalarization. * trans-intrinsic.c (gfc_walk_intrinsic_function): Update call. * trans-stmt.c (get_intrinsic_for_code): New. (gfc_trans_call): Update call. gcc/testsuite/ChangeLog: * gfortran.dg/index_5.f90: New.	2021-11-12 13:10:55 +01:00
Jakub Jelinek	7d6da11fce	openmp: Honor OpenMP 5.1 num_teams lower bound The following patch implements what I've been talking about earlier, honor that for explicit num_teams clause we create at least the lower-bound (if not specified, upper-bound) teams in the league. For host fallback, it still means we only have one thread doing all the teams, sequentially one after another. For PTX and GCN, I think the new teams-2.c test and maybe teams-4.c too will or might fail. For these offloads, I think it is ok to remove symbols no longer used from libgomp.a. If num_teams_lower is bigger than the provided num_blocks or num_workgroups, we should arrange for gomp_num_teams_var to be num_teams_lower - 1, stop using the %ctaid.x or __builtin_gcn_dim_pos (0) for omp_get_team_num () and instead use for it some .shared var that GOMP_teams4 initializes to %ctaid.x or __builtin_gcn_dim_pos (0) when first and for !first increment that by num_blocks or num_workgroups each time and only return false when we are above num_teams_lower. Any help with actually implementing this for the 2 architectures highly appreciated. 2021-11-12 Jakub Jelinek <jakub@redhat.com> gcc/ * omp-builtins.def (BUILT_IN_GOMP_TEAMS): Remove. (BUILT_IN_GOMP_TEAMS4): New. * builtin-types.def (BT_FN_VOID_UINT_UINT): Remove. (BT_FN_BOOL_UINT_UINT_UINT_BOOL): New. * omp-low.c (lower_omp_teams): Use GOMP_teams4 instead of GOMP_teams, pass to it also num_teams lower-bound expression or a dup of upper-bound if it is missing and a flag whether it is the first call or not. gcc/fortran/ * types.def (BT_FN_VOID_UINT_UINT): Remove. (BT_FN_BOOL_UINT_UINT_UINT_BOOL): New. libgomp/ * libgomp_g.h (GOMP_teams4): Declare. * libgomp.map (GOMP_5.1): Export GOMP_teams4. * target.c (GOMP_teams4): New function. * config/nvptx/target.c (GOMP_teams): Remove. (GOMP_teams4): New function. * config/gcn/target.c (GOMP_teams): Remove. (GOMP_teams4): New function. * testsuite/libgomp.c/teams-4.c (main): Expect exactly 2 teams instead of <= 2. 
* testsuite/libgomp.c-c++-common/teams-2.c: New test.	2021-11-12 12:41:22 +01:00
Martin Liska	5f516a6a5d	Remove unused function. PR tree-optimization/102497 gcc/ChangeLog: * gimple-predicate-analysis.cc (add_pred): Remove unused function.	2021-11-12 12:40:02 +01:00
Richard Biener	140346fa24	tree-optimization/103204 - fix missed valueization in VN The following fixes a missed valueization when simplifying a MEM[&...] combination during valueization. 2021-11-12 Richard Biener <rguenther@suse.de> PR tree-optimization/103204 * tree-ssa-sccvn.c (valueize_refs_1): Re-valueize the top operand after folding in an address. * gcc.dg/torture/pr103204.c: New testcase.	2021-11-12 09:11:49 +01:00
Alan Modra	c60ded6f5e	Make opcodes configure depend on bfd configure The idea is for opcodes to be able to see whether bfd is compiled for 64-bit. A lot of --enable-targets=all libopcodes is wasted space if bfd can't load 64-bit target object files. * Makefile.def (configure-opcodes): Depend on configure-bfd. * Makefile.in: Regenerate.	2021-11-12 18:34:12 +10:30
Jonathan Wakely	1ae8edf5f7	libstdc++: Implement constexpr std::vector for C++20 This implements P1004R2 ("Making std::vector constexpr") for C++20. For now, debug mode vectors are not supported in constant expressions. To make that work we might need to disable all attaching/detaching of safe iterators. That can be fixed later. Co-authored-by: Josh Marshall <joshua.r.marshall.1991@gmail.com> libstdc++-v3/ChangeLog: * include/bits/alloc_traits.h (_Destroy): Make constexpr for C++20 mode. * include/bits/allocator.h (__shrink_to_fit::_S_do_it): Likewise. * include/bits/stl_algobase.h (__fill_a1): Declare _Bit_iterator overload constexpr for C++20. * include/bits/stl_bvector.h (_Bit_type, _S_word_bit): Move out of inline namespace. (_Bit_reference, _Bit_iterator_base, _Bit_iterator) (_Bit_const_iterator, _Bvector_impl_data, _Bvector_base) (vector<bool, A>>): Add constexpr to every member function. (_Bvector_base::_M_allocate): Initialize storage during constant evaluation. (vector<bool, A>::_M_initialize_value): Use __fill_bvector_n instead of memset. (__fill_bvector_n): New helper function to replace memset during constant evaluation. * include/bits/stl_uninitialized.h (__uninitialized_copy<false>): Move logic to ... (__do_uninit_copy): New function. (__uninitialized_fill<false>): Move logic to ... (__do_uninit_fill): New function. (__uninitialized_fill_n<false>): Move logic to ... (__do_uninit_fill_n): New function. (__uninitialized_copy_a): Add constexpr. Use __do_uninit_copy. (__uninitialized_move_a, __uninitialized_move_if_noexcept_a): Add constexpr. (__uninitialized_fill_a): Add constexpr. Use __do_uninit_fill. (__uninitialized_fill_n_a): Add constexpr. Use __do_uninit_fill_n. (__uninitialized_default_n, __uninitialized_default_n_a) (__relocate_a_1, __relocate_a): Add constexpr. * include/bits/stl_vector.h (_Vector_impl_data, _Vector_impl) (_Vector_base, vector): Add constexpr to every member function. 
(_Vector_impl::_S_adjust): Disable ASan annotation during constant evaluation. (_Vector_base::_S_use_relocate): Disable bitwise-relocation during constant evaluation. (vector::_Temporary_value): Use a union for storage. * include/bits/vector.tcc (vector, vector<bool>): Add constexpr to every member function. * include/std/vector (erase_if, erase): Add constexpr. * testsuite/23_containers/headers/vector/synopsis.cc: Add constexpr for C++20 mode. * testsuite/23_containers/vector/bool/cmp_c++20.cc: Change to compile-only test using constant expressions. * testsuite/23_containers/vector/bool/capacity/29134.cc: Adjust namespace for _S_word_bit. * testsuite/23_containers/vector/bool/modifiers/insert/31370.cc: Likewise. * testsuite/23_containers/vector/cmp_c++20.cc: Likewise. * testsuite/23_containers/vector/cons/89164.cc: Adjust errors for C++20 and move C++17 test to ... * testsuite/23_containers/vector/cons/89164_c++17.cc: ... here. * testsuite/23_containers/vector/bool/capacity/constexpr.cc: New test. * testsuite/23_containers/vector/bool/cons/constexpr.cc: New test. * testsuite/23_containers/vector/bool/element_access/constexpr.cc: New test. * testsuite/23_containers/vector/bool/modifiers/assign/constexpr.cc: New test. * testsuite/23_containers/vector/bool/modifiers/constexpr.cc: New test. * testsuite/23_containers/vector/bool/modifiers/swap/constexpr.cc: New test. * testsuite/23_containers/vector/capacity/constexpr.cc: New test. * testsuite/23_containers/vector/cons/constexpr.cc: New test. * testsuite/23_containers/vector/data_access/constexpr.cc: New test. * testsuite/23_containers/vector/element_access/constexpr.cc: New test. * testsuite/23_containers/vector/modifiers/assign/constexpr.cc: New test. * testsuite/23_containers/vector/modifiers/constexpr.cc: New test. * testsuite/23_containers/vector/modifiers/swap/constexpr.cc: New test.	2021-11-12 00:42:39 +00:00
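As a rough illustration of what constexpr std::vector (P1004R2) allows, the sketch below builds and sums a vector inside a function that can be evaluated at compile time. The helper name `sum_first` and the `SKETCH_CONSTEXPR` macro are assumptions made for this example. The compile-time check is guarded by the standard `__cpp_lib_constexpr_vector` feature-test macro, so the code still builds as an ordinary run-time function with older compilers or language modes.

```cpp
#include <cassert>
#include <vector>

// Standard feature-test macro: defined once std::vector is usable in
// constant expressions (C++20 mode with a library that supports it).
#if defined(__cpp_lib_constexpr_vector)
#define SKETCH_CONSTEXPR constexpr
#else
#define SKETCH_CONSTEXPR inline
#endif

// Builds {1, 2, ..., n} in a vector and returns the sum. The vector is
// created and destroyed within the same evaluation, which is required
// for allocations made during constant evaluation.
SKETCH_CONSTEXPR int sum_first(int n) {
  std::vector<int> values;
  for (int i = 1; i <= n; ++i) values.push_back(i);
  int total = 0;
  for (int v : values) total += v;
  return total;
}

#if defined(__cpp_lib_constexpr_vector)
// Only checked at compile time when the library supports it.
static_assert(sum_first(4) == 10, "evaluated during constant evaluation");
#endif

// Run-time check that works in every language mode.
void self_check() { assert(sum_first(4) == 10); }
```

As the commit message notes, debug mode vectors are not supported in constant expressions for now, so this sketch assumes the normal (non-debug) container.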
GCC Administrator	b39265d4fe	Daily bump.	2021-11-12 00:16:32 +00:00
Jonathan Wakely	4a407d358e	libstdc++: Fix debug containers for C++98 mode Since r12-5072 made _Safe_container::operator=(const _Safe_container&) protected, the debug containers no longer compile in C++98 mode. They have user-provided copy assignment operators in C++98 mode, and they assign each base class in turn. The 'this->_M_safe() = __x' expressions fail, because calling a protected member function is only allowed via 'this'. They could be fixed by using this->_Safe::operator=(__x) but a simpler solution is to just remove the user-provided assignment operators and let the compiler define them (as we do for C++11 and later, by defining them as defaulted). The only change needed for that to work is to define the _Safe_vector copy assignment operator in C++98 mode, so that the implicit __gnu_debug::vector::operator= definition will call it, instead of needing to call _M_update_guaranteed_capacity() manually. libstdc++-v3/ChangeLog: * include/debug/deque (deque::operator=(const deque&)): Remove definition. * include/debug/list (list::operator=(const list&)): Likewise. * include/debug/map.h (map::operator=(const map&)): Likewise. * include/debug/multimap.h (multimap::operator=(const multimap&)): Likewise. * include/debug/multiset.h (multiset::operator=(const multiset&)): Likewise. * include/debug/set.h (set::operator=(const set&)): Likewise. * include/debug/string (basic_string::operator=(const basic_string&)): Likewise. * include/debug/vector (vector::operator=(const vector&)): Likewise. (_Safe_vector::operator=(const _Safe_vector&)): Define for C++98 as well.	2021-11-11 21:55:11 +00:00
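The protected-access rule behind this failure can be reproduced outside libstdc++. The sketch below uses made-up class names (`SafeBase`, `Container`) rather than the real debug-mode classes: a protected member of a base can only be reached from a derived class through an object expression of the derived type, so assigning through a plain base reference is rejected while the qualified call through `this` is accepted.

```cpp
#include <cassert>

// Hypothetical stand-in for a base class whose copy assignment is protected.
class SafeBase {
public:
  explicit SafeBase(int id) : id_(id) {}
  int id() const { return id_; }

protected:
  SafeBase& operator=(const SafeBase& other) {
    id_ = other.id_;
    return *this;
  }

private:
  int id_;
};

class Container : public SafeBase {
public:
  explicit Container(int id) : SafeBase(id) {}

  Container& assign_explicit(const Container& other) {
    // as_base() = other;  // rejected: the object expression has type
    //                     // SafeBase, so the protected member is inaccessible
    this->SafeBase::operator=(other);  // accepted: accessed via a Container
    return *this;
  }

  // No user-provided operator=: the compiler-generated one assigns the
  // base subobject for us, which is the simpler fix the commit chose.

private:
  SafeBase& as_base() { return *this; }
};

void self_check() {
  Container a(1), b(2);
  a.assign_explicit(b);
  assert(a.id() == 2);
}
```

Leaving the assignment operator to the compiler avoids both the access problem and the need to assign each base class by hand.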
Aldy Hernandez	53b3edceab	Make ranger optional in path_range_query. All users of path_range_query are currently allocating a gimple_ranger only to pass it to the query object. It's tidier to just do it from path_range_query if no ranger was passed. Tested on x86-64 Linux. gcc/ChangeLog: * gimple-range-path.cc (path_range_query::path_range_query): New ctor without a ranger. (path_range_query::~path_range_query): Free ranger if necessary. (path_range_query::range_on_path_entry): Adjust m_ranger for pointer. (path_range_query::ssa_range_in_phi): Same. (path_range_query::compute_ranges_in_block): Same. (path_range_query::compute_imports): Same. (path_range_query::compute_ranges): Same. (path_range_query::range_of_stmt): Same. (path_range_query::compute_outgoing_relations): Same. * gimple-range-path.h (class path_range_query): New ctor. * tree-ssa-loop-ch.c (ch_base::copy_headers): Remove gimple_ranger as path_range_query allocates one. * tree-ssa-threadbackward.c (class back_threader): Remove m_ranger. (back_threader::~back_threader): Same.	2021-11-11 22:13:17 +01:00
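The ownership pattern described in this entry (use the caller's ranger if one is passed, otherwise allocate one and free it in the destructor) can be sketched generically. The types `Ranger` and `PathQuery` below are simplified, hypothetical stand-ins and not the GCC classes.

```cpp
#include <cassert>

// Hypothetical stand-in for the helper object a query depends on.
struct Ranger {
  int queries = 0;
};

class PathQuery {
public:
  // Caller supplies the ranger and keeps ownership of it.
  explicit PathQuery(Ranger& ranger) : ranger_(&ranger), owns_(false) {}

  // No ranger supplied: allocate one and remember to free it.
  PathQuery() : ranger_(new Ranger), owns_(true) {}

  ~PathQuery() {
    if (owns_) delete ranger_;
  }

  // Copying would double-free an owned ranger, so forbid it.
  PathQuery(const PathQuery&) = delete;
  PathQuery& operator=(const PathQuery&) = delete;

  // Every use goes through the pointer, whichever way it was obtained.
  int query() { return ++ranger_->queries; }
  bool owns_ranger() const { return owns_; }

private:
  Ranger* ranger_;
  bool owns_;
};

void self_check() {
  Ranger shared;
  {
    PathQuery q(shared);
    q.query();
    assert(!q.owns_ranger());
  }
  assert(shared.queries == 1);  // the shared ranger outlives the query
}
```

Storing a pointer rather than a reference is what lets one class cover both cases, which matches the "Adjust m_ranger for pointer" notes in the ChangeLog.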
Aldy Hernandez	a7753db4a7	Remove loop crossing restriction from the backward threader. We have much more thorough restrictions, that are shared between both threader implementations, in the registry. I've been meaning to remove the backward threader one, since its only purpose was reducing the search space. Previously there was a small time penalty for its removal, but with the various patches in the past month, it looks like the removal is a wash performance-wise. This catches 8 more jump threads in the backward threader in my suite. Presumably, because we disallowed all loop crossing, whereas the registry restrictions allow some crossing (if we exit the loop, etc). Tested on x86-64 Linux. gcc/ChangeLog: * tree-ssa-threadbackward.c (back_threader_profitability::profitable_path_p): Remove loop crossing restriction.	2021-11-11 22:13:17 +01:00
Bill Schmidt	8a8458ac6b	rs6000: Fix test_mffsl.c to require Power9 support 2021-11-11 Bill Schmidt <wschmidt@linux.ibm.com> gcc/testsuite/ * gcc.target/powerpc/test_mffsl.c: Require Power9.	2021-11-11 14:36:04 -06:00
Ian Lance Taylor	7846156274	compiler: traverse func subexprs when creating func descriptors Fix the Create_func_descriptors pass to traverse the subexpressions of the function in a Call_expression. There are no subexpressions in the normal case of calling a function or a method directly, but there are subexpressions in code like F().M() when F returns an interface type. Forgetting to traverse the function subexpressions was almost entirely hidden by the fact that we also created the necessary thunks in Bound_method_expression::do_flatten and Interface_field_reference_expression::do_get_backend. However, when the thunks were created there, they did not go through the order_evaluations pass. This almost always worked, but failed in the case in which the function being thunked returned multiple results, as order_evaluations takes the necessary step of moving the Call_expression into its own statement, and that would not happen when order_evaluations was not called. Avoid hiding errors like this by changing those methods to only look up the previously created thunk, rather than creating it if it was not already created. The test case for this is https://golang.org/cl/363156. Fixes https://golang.org/issue/49512 Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/363274	2021-11-11 12:21:56 -08:00
Jonathan Wakely	083fd73202	libstdc++: Make pmr::memory_resource::allocate implicitly create objects Calling the placement version of ::operator new "implicitly creates objects in the returned region of storage" as per [intro.object]. This allows the returned memory to be used as storage for implicit-lifetime types (including arrays) without additional action by the caller. This is required by the proposed resolution of LWG 3147. libstdc++-v3/ChangeLog: * include/std/memory_resource (memory_resource::allocate): Implicitly create objects in the returned storage.	2021-11-11 18:16:17 +00:00
Jonathan Wakely	ef0e100f58	libstdc++: Remove public std::vector<bool>::data() member This function only exists to avoid an error in the debug mode vector, so doesn't need to be public. libstdc++-v3/ChangeLog: * include/bits/stl_bvector.h (vector<bool>::data()): Give protected access, and delete for C++11 and later.	2021-11-11 18:16:17 +00:00
Jan Hubicka	dc002e31fb	Fix gfortran.dg/inline_matmul_17.f90 template. As discussed on the mailing list, the template actually tests for a missed optimization where we fail to propagate the size of an array. We no longer miss this after modref improvements. gcc/testsuite/ChangeLog: 2021-11-11 Jan Hubicka <hubicka@ucw.cz> * gfortran.dg/inline_matmul_17.f90: Fix template.	2021-11-11 18:51:35 +01:00
Jan Hubicka	494bdadf28	Enable pure-const discovery in modref. We newly can handle some extra cases, for example: struct a {int a,b,c;}; __attribute__ ((noinline)) void init (struct a *a) { a->a=1; a->b=2; a->c=3; } int const_fn () { struct a a; init (&a); return a.a + a.b + a.c; } Here pure/const stops on the fact that const_fn calls non-const init, while modref knows that the memory it initializes is local to const_fn. I ended up reordering passes so early modref is done after early pure-const, mostly to avoid the need to change the testsuite, which greps for const functions being detected in pure-const. Still, some testsuite compensation is needed. gcc/ChangeLog: 2021-11-11 Jan Hubicka <hubicka@ucw.cz> * ipa-modref.c (analyze_function): Do pure/const discovery, return true on success. (pass_modref::execute): If pure/const is discovered fixup cfg. (ignore_edge): Do not ignore pure/const edges. (modref_propagate_in_scc): Do pure/const discovery, return true if cdtor was promoted pure/const. (pass_ipa_modref::execute): If needed remove unreachable functions. * ipa-pure-const.c (warn_function_noreturn): Fix whitespace. (warn_function_cold): Likewise. (skip_function_for_local_pure_const): Move earlier. (ipa_make_function_const): Break out from ... (ipa_make_function_pure): Break out from ... (propagate_pure_const): ... here. (pass_local_pure_const::execute): Use it. * ipa-utils.h (ipa_make_function_const): Declare. (ipa_make_function_pure): Declare. * passes.def: Move early modref after pure-const. gcc/testsuite/ChangeLog: 2021-11-11 Jan Hubicka <hubicka@ucw.cz> * c-c++-common/tm/inline-asm.c: Disable pure-const. * g++.dg/ipa/modref-1.C: Update template. * gcc.dg/tree-ssa/modref-11.c: Disable pure-const. * gcc.dg/tree-ssa/modref-14.c: New test. * gcc.dg/tree-ssa/modref-8.c: Do not optimize sibling calls. * gfortran.dg/do_subscript_3.f90: Add -O0.	2021-11-11 18:14:45 +01:00
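A compilable variant of the example in the commit message above, written as a sketch. It assumes `init` was meant to take a pointer, renames the struct to the hypothetical `triple` for readability, and relies on the GNU `__attribute__((noinline))` extension (GCC and Clang). Whether a given compiler actually marks `const_fn` as const depends on its version and options; the code only shows the shape of the case.

```cpp
// The struct name 'triple' is an assumption made for readability.
struct triple {
    int a, b, c;
};

// Not const or pure by itself: it writes through its pointer argument.
// noinline keeps the call visible to interprocedural analysis.
__attribute__((noinline)) void init(triple* p) {
    p->a = 1;
    p->b = 2;
    p->c = 3;
}

// Calls a non-const function, but the only memory init() writes is the
// local variable 't', so const_fn has no externally visible side effects
// and its result does not depend on global memory.
int const_fn() {
    triple t;
    init(&t);
    return t.a + t.b + t.c;
}
```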
David Malcolm	abdff441a0	diagnostic: fix unused variable 'def_tabstop' [PR103129] gcc/ChangeLog: PR other/103129 * diagnostic-show-locus.c (def_policy): Use def_tabstop. Signed-off-by: David Malcolm <dmalcolm@redhat.com>	2021-11-11 12:12:53 -05:00
Tobias Burnus	407eaad25f	Fortran/openmp: Add support for 2 argument num_teams clause Fortran part to commit r12-5146-g48d7327f2aaf65 gcc/fortran/ChangeLog: * gfortran.h (struct gfc_omp_clauses): Rename num_teams to num_teams_upper, add num_teams_lower. * dump-parse-tree.c (show_omp_clauses): Update to handle lower-bound num_teams clause. * frontend-passes.c (gfc_code_walker): Likewise. * openmp.c (gfc_free_omp_clauses, gfc_match_omp_clauses, resolve_omp_clauses): Likewise. * trans-openmp.c (gfc_trans_omp_clauses, gfc_split_omp_clauses, gfc_trans_omp_target): Likewise. libgomp/ChangeLog: * testsuite/libgomp.fortran/teams-1.f90: New test.	2021-11-11 17:27:00 +01:00
Jonathan Wright	e1b218d174	aarch64: Use type-qualified builtins for vcombine_* Neon intrinsics Declare unsigned and polynomial type-qualified builtins for vcombine_* Neon intrinsics. Using these builtins removes the need for many casts in arm_neon.h. gcc/ChangeLog: 2021-11-10 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/aarch64-builtins.c (TYPES_COMBINE): Delete. (TYPES_COMBINEP): Delete. * config/aarch64/aarch64-simd-builtins.def: Declare type- qualified builtins for vcombine_* intrinsics. * config/aarch64/arm_neon.h (vcombine_s8): Remove unnecessary cast. (vcombine_s16): Likewise. (vcombine_s32): Likewise. (vcombine_f32): Likewise. (vcombine_u8): Use type-qualified builtin and remove casts. (vcombine_u16): Likewise. (vcombine_u32): Likewise. (vcombine_u64): Likewise. (vcombine_p8): Likewise. (vcombine_p16): Likewise. (vcombine_p64): Likewise. (vcombine_bf16): Remove unnecessary cast. * config/aarch64/iterators.md (VD_I): New mode iterator. (VDC_P): New mode iterator.	2021-11-11 15:34:52 +00:00
Jonathan Wright	1716ddd1e9	aarch64: Use type-qualified builtins for LD1/ST1 Neon intrinsics Declare unsigned and polynomial type-qualified builtins for LD1/ST1 Neon intrinsics. Using these builtins removes the need for many casts in arm_neon.h. The new type-qualified builtins are also lowered to gimple - as the unqualified builtins are already. gcc/ChangeLog: 2021-11-10 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/aarch64-builtins.c (TYPES_LOAD1_U): Define. (TYPES_LOAD1_P): Define. (TYPES_STORE1_U): Define. (TYPES_STORE1P): Rename to... (TYPES_STORE1_P): This. (get_mem_type_for_load_store): Add unsigned and poly types. (aarch64_general_gimple_fold_builtin): Add unsigned and poly type-qualified builtin declarations. * config/aarch64/aarch64-simd-builtins.def: Declare type- qualified builtins for LD1/ST1. * config/aarch64/arm_neon.h (vld1_p8): Use type-qualified builtin and remove cast. (vld1_p16): Likewise. (vld1_u8): Likewise. (vld1_u16): Likewise. (vld1_u32): Likewise. (vld1q_p8): Likewise. (vld1q_p16): Likewise. (vld1q_p64): Likewise. (vld1q_u8): Likewise. (vld1q_u16): Likewise. (vld1q_u32): Likewise. (vld1q_u64): Likewise. (vst1_p8): Likewise. (vst1_p16): Likewise. (vst1_u8): Likewise. (vst1_u16): Likewise. (vst1_u32): Likewise. (vst1q_p8): Likewise. (vst1q_p16): Likewise. (vst1q_p64): Likewise. (vst1q_u8): Likewise. (vst1q_u16): Likewise. (vst1q_u32): Likewise. (vst1q_u64): Likewise. * config/aarch64/iterators.md (VALLP_NO_DI): New iterator.	2021-11-11 15:34:51 +00:00
Jonathan Wright	6eca10aa76	aarch64: Use type-qualified builtins for ADDV Neon intrinsics Declare unsigned type-qualified builtins and use them to implement the vector reduction Neon intrinsics. This removes the need for many casts in arm_neon.h. gcc/ChangeLog: 2021-11-09 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/aarch64-simd-builtins.def: Declare unsigned builtins for vector reduction. * config/aarch64/arm_neon.h (vaddv_u8): Use type-qualified builtin and remove casts. (vaddv_u16): Likewise. (vaddv_u32): Likewise. (vaddvq_u8): Likewise. (vaddvq_u16): Likewise. (vaddvq_u32): Likewise. (vaddvq_u64): Likewise.	2021-11-11 15:34:51 +00:00
Jonathan Wright	f341c03203	aarch64: Use type-qualified builtins for ADDP Neon intrinsics Declare unsigned type-qualified builtins and use them to implement the pairwise addition Neon intrinsics. This removes the need for many casts in arm_neon.h. gcc/ChangeLog: 2021-11-09 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/aarch64-simd-builtins.def: * config/aarch64/arm_neon.h (vpaddq_u8): Use type-qualified builtin and remove casts. (vpaddq_u16): Likewise. (vpaddq_u32): Likewise. (vpaddq_u64): Likewise. (vpadd_u8): Likewise. (vpadd_u16): Likewise. (vpadd_u32): Likewise. (vpaddd_u64): Likewise.	2021-11-11 15:34:51 +00:00
Jonathan Wright	80ee260d5b	aarch64: Use type-qualified builtins for [R]SUBHN[2] Neon intrinsics Declare unsigned type-qualified builtins and use them to implement (rounding) halving-narrowing-subtract Neon intrinsics. This removes the need for many casts in arm_neon.h. gcc/ChangeLog: 2021-11-09 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/aarch64-simd-builtins.def: Declare unsigned builtins for [r]subhn[2]. * config/aarch64/arm_neon.h (vsubhn_s16): Remove unnecessary cast. (vsubhn_s32): Likewise. (vsubhn_s64): Likewise. (vsubhn_u16): Use type-qualified builtin and remove casts. (vsubhn_u32): Likewise. (vsubhn_u64): Likewise. (vrsubhn_s16): Remove unnecessary cast. (vrsubhn_s32): Likewise. (vrsubhn_s64): Likewise. (vrsubhn_u16): Use type-qualified builtin and remove casts. (vrsubhn_u32): Likewise. (vrsubhn_u64): Likewise. (vrsubhn_high_s16): Remove unnecessary cast. (vrsubhn_high_s32): Likewise. (vrsubhn_high_s64): Likewise. (vrsubhn_high_u16): Use type-qualified builtin and remove casts. (vrsubhn_high_u32): Likewise. (vrsubhn_high_u64): Likewise. (vsubhn_high_s16): Remove unnecessary cast. (vsubhn_high_s32): Likewise. (vsubhn_high_s64): Likewise. (vsubhn_high_u16): Use type-qualified builtin and remove casts. (vsubhn_high_u32): Likewise. (vsubhn_high_u64): Likewise.	2021-11-11 15:34:51 +00:00
Jonathan Wright	7bde2a6ecd	aarch64: Use type-qualified builtins for [R]ADDHN[2] Neon intrinsics Declare unsigned type-qualified builtins and use them to implement (rounding) halving-narrowing-add Neon intrinsics. This removes the need for many casts in arm_neon.h. gcc/ChangeLog: 2021-11-09 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/aarch64-simd-builtins.def: Declare unsigned builtins for [r]addhn[2]. * config/aarch64/arm_neon.h (vaddhn_s16): Remove unnecessary cast. (vaddhn_s32): Likewise. (vaddhn_s64): Likewise. (vaddhn_u16): Use type-qualified builtin and remove casts. (vaddhn_u32): Likewise. (vaddhn_u64): Likewise. (vraddhn_s16): Remove unnecessary cast. (vraddhn_s32): Likewise. (vraddhn_s64): Likewise. (vraddhn_u16): Use type-qualified builtin and remove casts. (vraddhn_u32): Likewise. (vraddhn_u64): Likewise. (vaddhn_high_s16): Remove unnecessary cast. (vaddhn_high_s32): Likewise. (vaddhn_high_s64): Likewise. (vaddhn_high_u16): Use type-qualified builtin and remove casts. (vaddhn_high_u32): Likewise. (vaddhn_high_u64): Likewise. (vraddhn_high_s16): Remove unnecessary cast. (vraddhn_high_s32): Likewise. (vraddhn_high_s64): Likewise. (vraddhn_high_u16): Use type-qualified builtin and remove casts. (vraddhn_high_u32): Likewise. (vraddhn_high_u64): Likewise.	2021-11-11 15:34:51 +00:00

189627 Commits