OpenE2K/gcc - gcc - Expired Mentality Git

Commit Graph

Author	SHA1	Message	Date
liuhongt	84bcefd555	Enable vectorization for _Float16 floor/ceil/trunc/nearbyint/rint operations. gcc/ChangeLog: PR target/102464 * config/i386/i386-builtin-types.def (V8HF_FTYPE_V8HF): New function type. (V16HF_FTYPE_V16HF): Ditto. (V32HF_FTYPE_V32HF): Ditto. (V8HF_FTYPE_V8HF_ROUND): Ditto. (V16HF_FTYPE_V16HF_ROUND): Ditto. (V32HF_FTYPE_V32HF_ROUND): Ditto. * config/i386/i386-builtin.def ( IX86_BUILTIN_FLOORPH, IX86_BUILTIN_CEILPH, IX86_BUILTIN_TRUNCPH, IX86_BUILTIN_FLOORPH256, IX86_BUILTIN_CEILPH256, IX86_BUILTIN_TRUNCPH256, IX86_BUILTIN_FLOORPH512, IX86_BUILTIN_CEILPH512, IX86_BUILTIN_TRUNCPH512): New builtin. * config/i386/i386-builtins.c (ix86_builtin_vectorized_function): Enable vectorization for HFmode FLOOR/CEIL/TRUNC operation. * config/i386/i386-expand.c (ix86_expand_args_builtin): Handle new builtins. * config/i386/sse.md (rint<mode>2, nearbyint<mode>2): Extend to vector HFmodes. gcc/testsuite/ChangeLog: * gcc.target/i386/pr102464-vrndscaleph.c: New test.	2021-10-29 09:45:29 +08:00
GCC Administrator	2322c8b1b4	Daily bump.	2021-10-29 00:16:37 +00:00
Aldy Hernandez	6ef9ad9309	path relation oracle: Remove SSA's being killed from the equivalence list. Same thing as the relational change. Walk any equivalences that have been registered on the path, and remove the name being killed. The only reason we had added the equivalence with itself earlier is so we wouldn't search any further in the equivalency list. So if we are removing all references to it, then we no longer need to add a "kill" record. Will push pending tests on x86-64 Linux. Co-authored-by: Andrew MacLeod <amacleod@redhat.com> gcc/ChangeLog: * value-relation.cc (path_oracle::killing_def): Walk the equivalency list and remove SSA from any equivalencies.	2021-10-28 23:12:03 +02:00
Stafford Horne	308531d148	or1k: Add return address argument to _mcount call This fixes an issue in the glibc port I am working on where the build fails due to the warning: error: calling ‘__builtin_return_address’ with a nonzero argument is unsafe [-Werror=frame-address] This is due to how the current implementation of _mcount in glibc uses __builtin_return_address with a count argument of 1. Fix that by passing the value of LR_REGNUM to the _mcount function, effectively providing the value _mcount is after. This is an ABI change, but I think it's OK because the glibc port for or1k is not yet upstreamed. Also, I think just adding an argument should not break anything anyway. gcc/ChangeLog: * config/or1k/or1k.h (PROFILE_HOOK): Add return address argument to _mcount.	2021-10-29 05:31:38 +09:00
Jakub Jelinek	6123b998b1	match.pd: Optimize MIN_EXPR <addr1, addr2> etc. addr1 < addr2 would be simplified [PR102951] This patch outlines the decision whether address comparison can be folded or not from the match.pd simple comparison simplification and uses it both there and in a new minmax simplification, such that we fold e.g. MAX (&a[2], &a[1]) etc. Some of the Wstringop-overflow-62.c changes might look weird, but that seems to be mainly due to gimple_fold_builtin_memset not bothering to copy over location, will fix that incrementally. 2021-10-28 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/102951 * fold-const.h (address_compare): Declare. * fold-const.c (address_compare): New function. * match.pd (cmp (convert1?@2 addr@0) (convert2? addr@1)): Use address_compare helper. (minmax cmp (convert1?@2 addr@0) (convert2?@3 addr@1)): New simplification. * gcc.dg/tree-ssa/pr102951.c: New test. * gcc.dg/Wstringop-overflow-62.c: Adjust expected diagnostics.	2021-10-28 20:10:15 +02:00
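The folding described in the entry above can be pictured with a small C++ sketch. The array `a` and the helper names are hypothetical, chosen only for illustration; whether a particular compiler version reduces these functions to a constant address is not something the sketch checks. It only shows why the result is decidable without running the program.

```cpp
#include <cassert>

// Hypothetical array, used only for illustration.
static int a[4];

// Both operands point into the same array object, so their order is known
// statically: &a[1] always precedes &a[2].
int *max_addr() {
  int *p = &a[2];
  int *q = &a[1];
  return p > q ? p : q;  // a candidate for MAX_EXPR <&a[2], &a[1]>
}

int *min_addr() {
  int *p = &a[2];
  int *q = &a[1];
  return p < q ? p : q;  // a candidate for MIN_EXPR <&a[2], &a[1]>
}

// Usage example: the larger address is &a[2], the smaller is &a[1].
void demo() {
  assert(max_addr() == &a[2]);
  assert(min_addr() == &a[1]);
}
```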
Andrew MacLeod	d123daec0c	Fix ifcvt-4.c to not depend on VRP2 asserts. The testcase fails if VRP2 is replaced with a non-assert based VRP because it accidentally depends on specific IL changes when the asserts are removed. This removes that dependency. gcc/testsuite/ * gcc.dg/ifcvt-4.c: Adjust.	2021-10-28 10:48:39 -04:00
Andrew MacLeod	a6bbf1cc9f	Unify EVRP and VRP folding predicate message. EVRP issues a message for folding predicates in a different format than VRP does; this patch unifies the messaging. gcc/ * vr-values.c (simplify_using_ranges::fold_cond): Change fold message. gcc/testsuite/ * gcc.dg/tree-ssa/evrp9.c: Adjust message scanned for. * gcc.dg/tree-ssa/pr21458-2.c: Ditto.	2021-10-28 10:48:39 -04:00
Andrew MacLeod	d46aeb5906	Reset scev before invoking array_checker. Before invoking the array_checker, we need to reset scev so it will not try to access any ssa_names that the substitute and fold engine has freed. PR tree-optimization/102940 * tree-vrp.c (execute_ranger_vrp): Reset scev.	2021-10-28 10:48:38 -04:00
Patrick Palka	f70f17d036	c++: CTAD within template argument [PR102933] Here when checking for erroneous occurrences of 'auto' inside a template argument (which is allowed by the concepts TS for class templates), extract_autos_r picks up the CTAD placeholder for X{T{0}} which causes check_auto_in_tmpl_args to reject this valid template argument. This patch fixes this by making extract_autos_r ignore CTAD placeholders. However, it seems we don't need to call check_auto_in_tmpl_args at all outside of the concepts TS since using 'auto' as a type-id is otherwise rejected more generally at parse time. So this patch makes the function just exit early if !flag_concepts_ts. Similarly, I think the concepts code paths in do_auto_deduction and type_uses_auto are only necessary for the concepts TS, so this patch also restricts these code paths accordingly. PR c++/102933 gcc/cp/ChangeLog: * parser.c (cp_parser_simple_type_specifier): Adjust diagnostic for using auto in parameter declaration. * pt.c (extract_autos_r): Ignore CTAD placeholders. (extract_autos): Use range-based for. (do_auto_deduction): Use extract_autos only for the concepts TS and not also for standard concepts. (type_uses_auto): Likewise with for_each_template_parm. (check_auto_in_tmpl_args): Just return false outside of the concepts TS. Simplify. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/nontype-class50.C: New test. * g++.dg/cpp2a/nontype-class50a.C: New test.	2021-10-28 10:46:46 -04:00
Richard Purdie	e5ddbbf992	[PATCH 4/5] gcc/nios2: Define the musl linker Add a definition of the musl linker used on the nios2 platform. 2021-10-26 Richard Purdie <richard.purdie@linuxfoundation.org> gcc/ChangeLog: * config/nios2/linux.h (MUSL_DYNAMIC_LINKER): Add musl linker Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>	2021-10-28 10:45:10 -04:00
Richard Purdie	84401ce5fb	[PATCH 1/5] Makefile.in: Ensure build CPP/CPPFLAGS is used for build targets During cross compiling, CPP is being set to the target compiler even for build targets. As an example, when building a cross compiler targeting mingw, the config.log for libiberty in build.x86_64-pokysdk-mingw32.i586-poky-linux/build-x86_64-linux/libiberty/config.log shows: configure:3786: checking how to run the C preprocessor configure:3856: result: x86_64-pokysdk-mingw32-gcc -E --sysroot=[sysroot]/x86_64-nativesdk-mingw32-pokysdk-mingw32 configure:3876: x86_64-pokysdk-mingw32-gcc -E --sysroot=[sysroot]/x86_64-nativesdk-mingw32-pokysdk-mingw32 conftest.c configure:3876: $? = 0 This is libiberty being built for the build environment, not the target one (i.e. in build-x86_64-linux). As such it should be using the build environment's gcc and not the target one. In the mingw case the system headers are quite different, leading to build failures related to not being able to include a process.h file for pem-unix.c. Further analysis shows the same issue occurring for CPPFLAGS too. Fix this by adding support for CPP_FOR_BUILD and CPPFLAGS_FOR_BUILD which, for example, avoids mixing the mingw headers for host binaries on linux systems. 2021-10-27 Richard Purdie <richard.purdie@linuxfoundation.org> ChangeLog: * Makefile.tpl: Add CPP_FOR_BUILD and CPPFLAGS_FOR_BUILD support * Makefile.in: Regenerate. * configure: Regenerate. * configure.ac: Add CPP_FOR_BUILD and CPPFLAGS_FOR_BUILD support gcc/ChangeLog: * configure: Regenerate. * configure.ac: Use CPPFLAGS_FOR_BUILD for GMPINC Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>	2021-10-28 10:42:49 -04:00
Patrick Palka	9927ecbb42	c++: quadratic constexpr behavior for left-assoc logical exprs [PR102780] In the testcase below the two left fold expressions each expand into a constant logical expression with 1024 terms, for which potential_const_expr takes more than a minute to return true. This happens because p_c_e_1 performs trial evaluation of the first operand of a &&/|| in order to determine whether to consider the potentiality of the second operand. And because the expanded expression is left-associated, this trial evaluation causes p_c_e_1 to be quadratic in the number of terms of the expression. This patch fixes this quadratic behavior by making p_c_e_1 preemptively compute potentiality of the second operand of a &&/||, and perform trial evaluation of the first operand only if the second operand isn't potentially constant. We must be careful to avoid emitting bogus diagnostics during the preemptive computation; to that end, we perform this shortcut only when tf_error is cleared, and when tf_error is set we now first check potentiality of the whole expression quietly and replay the check noisily for diagnostics. Apart from fixing the quadraticness for left-associated logical exprs, this change also reduces compile time for the libstdc++ testcase 20_util/variant/87619.cc by about 15% even though our <variant> uses right folds instead of left folds. Likewise for the testcase in the PR, for which compile time is reduced by 30%. The reason for these speedups is that p_c_e_1 no longer performs expensive trial evaluation of each term of large constant logical expressions when determining their potentiality. PR c++/102780 gcc/cp/ChangeLog: * constexpr.c (potential_constant_expression_1) <case TRUTH__EXPR>: When tf_error isn't set, preemptively check potentiality of the second operand before performing trial evaluation of the first operand. 
(potential_constant_expression_1): When tf_error is set, first check potentiality quietly and return true if successful, otherwise proceed noisily to give errors. gcc/testsuite/ChangeLog: g++.dg/cpp1z/fold13.C: New test.	2021-10-28 10:05:14 -04:00
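To make "left-associated logical expression" in the entry above concrete, here is a minimal C++17 sketch of left and right folds over `&&`. The helper names are hypothetical and the term count (64) is much smaller than the 1024 terms mentioned in the commit message; this is not the testcase from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Unary left fold: expands to ((B0 && B1) && B2) && ... (left-associated).
template <bool... Bs>
constexpr bool all_left() { return (... && Bs); }

// Unary right fold: expands to B0 && (B1 && (B2 && ...)).
template <bool... Bs>
constexpr bool all_right() { return (Bs && ...); }

// Builds an N-term left fold whose terms are all true.
template <std::size_t... Is>
constexpr bool n_true_terms(std::index_sequence<Is...>) {
  return all_left<(Is < sizeof...(Is))...>();
}

// Evaluated at compile time, so the front end has to decide that the
// expanded expression is a constant expression.
static_assert(n_true_terms(std::make_index_sequence<64>{}),
              "all 64 terms are true");

// Usage example.
void demo() {
  assert((!all_left<true, false, true>()));
  assert((all_right<true, true>()));
}
```

In the left fold the first operand of the outermost `&&` is itself the whole fold over the preceding terms, which is why repeated trial evaluation of the first operand costs quadratic time in the number of terms.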
Eric Botcazou	60861d8794	Update documentation of %X spec %X Output the accumulated linker options specified by -Wl or a ‘%x’ spec string The part about -Wl has been obsolete for 27 years, since this change: Author: Torbjorn Granlund <tege@gnu.org> Date: Thu Oct 27 18:04:25 1994 +0000 (process_command): Handle -Wl, and -Xlinker similar to -l, i.e., preserve their order with respect to linker input files. Technically speaking, the arguments of -l, -Wl and -Xlinker are input files. gcc/ * doc/invoke.texi (%X): Remove obsolete reference to -Wl.	2021-10-28 15:55:05 +02:00
Richard Biener	81342e9582	middle-end/84407 - honor -frounding-math for int to float conversion This makes us honor -frounding-math for integer to float conversions and avoid constant folding when such conversion is not exact. 2021-10-28 Richard Biener <rguenther@suse.de> PR middle-end/84407 * fold-const.c (fold_convert_const): Avoid int to float constant folding with -frounding-math and inexact result. * simplify-rtx.c (simplify_const_unary_operation): Likewise for both float and unsigned_float. * gcc.dg/torture/fp-uint64-convert-double-1.c: New testcase. * gcc.dg/torture/fp-uint64-convert-double-2.c: Likewise.	2021-10-28 15:45:22 +02:00
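An integer to float conversion is "not exact" when the integer needs more significant bits than the floating-point type provides; the result then depends on the rounding mode, which is why folding it at compile time is unsafe under -frounding-math. The helper below is a hypothetical run-time check, assuming an IEEE 754 `double` with a 53-bit significand; it is not code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if v survives a round trip through double unchanged,
// i.e. the conversion to double is exact in every rounding mode.
bool converts_exactly(std::uint64_t v) {
  // volatile keeps the conversion at run time instead of letting the
  // compiler fold it.
  volatile double d = static_cast<double>(v);
  double r = d;
  // 2^64 is not representable in uint64_t, so converting it back would be
  // undefined; a value that rounded up to 2^64 was inexact anyway.
  if (r >= 18446744073709551616.0)
    return false;
  return static_cast<std::uint64_t>(r) == v;
}

// Usage example: 2^53 is exact, 2^53 + 1 needs 54 bits and is not.
void demo() {
  assert(converts_exactly(std::uint64_t{1} << 53));
  assert(!converts_exactly((std::uint64_t{1} << 53) + 1));
}
```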
Aldy Hernandez	113dab2b9d	Improve backward threading with switches. We've been essentially using find_taken_edge_switch_expr() in the backward threader, but this is suboptimal because said function only works with singletons. VRP has a much smarter find_case_label_range that works with ranges. Tested on x86-64 Linux with: a) Bootstrap & regtests. b) Verifying we get more threads than before. c) Asserting that the new code catches everything the old code caught (over a set of bootstrap .ii files). gcc/ChangeLog: * tree-ssa-threadbackward.c (back_threader::find_taken_edge_switch): Use find_case_label_range instead of find_taken_edge. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/vrp106.c: Adjust for threading. * gcc.dg/tree-ssa/vrp113.c: Same.	2021-10-28 14:30:06 +02:00
Aldy Hernandez	7f6c225852	Make back_threader_registry inherit from back_jt_path_registry. When a class's only purpose is to expose the methods of its only member, it's really a derived class ;-). Tested on x86-64 Linux. gcc/ChangeLog: * tree-ssa-threadbackward.c (class back_threader_registry): Inherit from back_jt_path_registry. (back_threader_registry::thread_through_all_blocks): Remove. (back_threader_registry::register_path): Remove m_lowlevel_registry prefix.	2021-10-28 14:30:06 +02:00
Richard Biener	a84b9d5373	middle-end/57245 - honor -frounding-math in real truncation The following honors -frounding-math when converting a FP constant to another FP type. 2021-10-27 Richard Biener <rguenther@suse.de> PR middle-end/57245 * fold-const.c (fold_convert_const_real_from_real): Honor -frounding-math if the conversion is not exact. * simplify-rtx.c (simplify_const_unary_operation): Do not simplify FLOAT_TRUNCATE with sign dependent rounding. * gcc.dg/torture/fp-double-convert-float-1.c: New testcase.	2021-10-28 11:28:42 +02:00
Richard Biener	eed248bb8c	tree-optimization/102949 - fix base object alignment This fixes fallout of g:4703182a06b831a9 where we now silently fail to force alignment of a base object. The fix is to look at the dr_info of the group leader to be consistent with alignment analysis. 2021-10-28 Richard Biener <rguenther@suse.de> PR tree-optimization/102949 * tree-vect-stmts.c (ensure_base_align): Look at the dr_info of a group leader and assert we are looking at one with analyzed alignment.	2021-10-28 11:02:38 +02:00
Kewen Lin	b343a29dbc	rs6000: Fix ICE of vect cost related to V1TI [PR102767] As PR102767 shows, the commit r12-3482 exposed one ICE in function rs6000_builtin_vectorization_cost. We claim V1TI supports movmisalign on rs6000 (see define_expand "movmisalign<mode>"), so rs6000_builtin_support_vector_misalignment returns true for misalign 8. Later, in the cost querying function rs6000_builtin_vectorization_cost, we don't have the arms to handle the V1TI input under (TARGET_VSX && TARGET_ALLOW_MOVMISALIGN). The proposed fix is to add handling for V1TI, simply using the cost for doubleword, which is apparently bigger than the cost of scalar; vectorization won't happen, and this just keeps consistency and avoids the ICE. Another thought is to not support movmisalign for V1TI, but it sounds like a bad idea since it doesn't match the reality. Note that this patch also fixes up the wrong indentation nearby. gcc/ChangeLog: PR target/102767 * config/rs6000/rs6000.c (rs6000_builtin_vectorization_cost): Consider V1TI mode for unaligned load and store. gcc/testsuite/ChangeLog: PR target/102767 * gcc.target/powerpc/ppc-fortran/pr102767.f90: New file.	2021-10-28 02:45:55 -05:00
Kito Cheng	2dc835cd0b	RISC-V: Fix wrong predicate for zero_extendsidi2_internal pattern We wrongly guarded the zero_extendsidi2_internal pattern with both ZBA and ZBB; only ZBA provides the zero_extendsidi2 instruction. gcc/ChangeLog * config/riscv/riscv.md (zero_extendsidi2_internal): Allow ZBB to use this pattern.	2021-10-28 14:53:50 +08:00
Kito Cheng	e399cde6f9	RISC-V: Handle zi* extension correctly for arch-canonicalize script The canonical order for z-prefixed extensions relies on the canonical order of the single-letter extensions; however, i was not in the list before, so passing zicsr or zifencei raised an exception. gcc/ChangeLog: * config/riscv/arch-canonicalize (CANONICAL_ORDER): Add `i` to CANONICAL_ORDER.	2021-10-28 14:49:21 +08:00
Alexandre Oliva	95bb87b245	hardened conditionals This patch introduces optional passes to harden conditionals used in branches, and in computing boolean expressions, by adding redundant tests of the reversed conditions, and trapping in case of unexpected results. Though in abstract machines the redundant tests should never fail, CPUs may be led to misbehave under certain kinds of attacks, such as of power deprivation, and these tests reduce the likelihood of going too far down an unexpected execution path. for gcc/ChangeLog * common.opt (fharden-compares): New. (fharden-conditional-branches): New. * doc/invoke.texi: Document new options. * gimple-harden-conditionals.cc: New. * Makefile.in (OBJS): Build it. * passes.def: Add new passes. * tree-pass.h (make_pass_harden_compares): Declare. (make_pass_harden_conditional_branches): Declare. for gcc/ada/ChangeLog * doc/gnat_rm/security_hardening_features.rst (Hardened Conditionals): New. for gcc/testsuite/ChangeLog * c-c++-common/torture/harden-comp.c: New. * c-c++-common/torture/harden-cond.c: New.	2021-10-28 00:51:02 -03:00
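The idea behind the hardening passes in the entry above (a redundant test of the reversed condition, trapping on an impossible outcome) can be written out by hand as a rough C++ sketch. This is not the transformation the compiler performs internally; the function name is hypothetical, and the `volatile` locals are an assumption made here so that an optimizer does not simply merge the two tests.

```cpp
#include <cassert>
#include <cstdlib>

// Computes a < b, then recomputes the reversed condition a >= b.
// On a correctly working machine exactly one of the two is true;
// if both agree, something went wrong and we trap.
bool hardened_less(int a, int b) {
  volatile int va = a, vb = b;  // force two independent reads and compares
  bool result = va < vb;
  bool reversed = va >= vb;     // redundant test of the reversed condition
  if (result == reversed)
    std::abort();               // unexpected result: trap
  return result;
}

// Usage example.
void demo() {
  assert(hardened_less(1, 2));
  assert(!hardened_less(2, 1));
  assert(!hardened_less(3, 3));
}
```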
Xionghu Luo	5f9ef1339e	rs6000: Fold xxsel to vsel since they have same semantics Fold xxsel to vsel like xxperm/vperm to avoid duplicate code. gcc/ChangeLog: 2021-10-28 Xionghu Luo <luoxhu@linux.ibm.com> PR target/94613 * config/rs6000/altivec.md: Add vsx register constraints. * config/rs6000/vsx.md (vsx_xxsel<mode>): Delete. (vsx_xxsel<mode>2): Likewise. (vsx_xxsel<mode>3): Likewise. (vsx_xxsel<mode>4): Likewise. gcc/testsuite/ChangeLog: 2021-10-28 Xionghu Luo <luoxhu@linux.ibm.com> * gcc.target/powerpc/builtins-1.c: Adjust.	2021-10-27 22:17:33 -05:00
Xionghu Luo	9222481ffc	rs6000: Fix wrong code generation for vec_sel [PR94613] The vsel instruction is a bit-wise select instruction. Using an IF_THEN_ELSE to express it in RTL is wrong and leads to wrong code being generated in the combine pass. Per-element selection is a subset of bit-wise selection, so with this patch the pattern is written using bit operations. But there are 8 different patterns to define "op0 := (op1 & ~op3) | (op2 & op3)": (~op3&op1) | (op3&op2), (~op3&op1) | (op2&op3), (op3&op2) | (~op3&op1), (op2&op3) | (~op3&op1), (op1&~op3) | (op3&op2), (op1&~op3) | (op2&op3), (op3&op2) | (op1&~op3), (op2&op3) | (op1&~op3). The latter 4 cases do not follow the canonicalisation rules, and non-canonical RTL is invalid RTL in the vregs pass. Secondly, the combine pass will swap (op1&~op3) to (~op3&op1) by commutative canonicalisation, which reduces them to the first 4 patterns, but it won't swap (op2&op3) | (~op3&op1) to (~op3&op1) | (op2&op3), so this patch handles it with 4 patterns with different NOT op3 positions and checks equality inside them. Tested on P7, P8 and P9. gcc/ChangeLog: 2021-10-28 Xionghu Luo <luoxhu@linux.ibm.com> PR target/94613 * config/rs6000/altivec.md (altivec_vsel<mode>): Change to ... (altivec_vsel<mode>): ... this and update define. (altivec_vsel<mode>_uns): Delete. (altivec_vsel<mode>2): New define_insn. (altivec_vsel<mode>3): Likewise. (altivec_vsel<mode>4): Likewise. * config/rs6000/rs6000-call.c (altivec_expand_vec_sel_builtin): New. (altivec_expand_builtin): Call altivec_expand_vec_sel_builtin to expand vec_sel. * config/rs6000/rs6000.c (rs6000_emit_vector_cond_expr): Use bit-wise selection instead of per element. * config/rs6000/vector.md: * config/rs6000/vsx.md (vsx_xxsel<mode>): Change to ... (vsx_xxsel<mode>): ... this and update define. (vsx_xxsel<mode>_uns): Delete. (vsx_xxsel<mode>2): New define_insn. (vsx_xxsel<mode>3): Likewise. (vsx_xxsel<mode>4): Likewise. 
gcc/testsuite/ChangeLog: 2021-10-28 Xionghu Luo <luoxhu@linux.ibm.com> PR target/94613 * gcc.target/powerpc/pr94613.c: New test.	2021-10-27 21:21:20 -05:00
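The bit-wise select formula quoted in the entry above, op0 := (op1 & ~op3) | (op2 & op3), can be tried out on a scalar. The sketch below uses a 32-bit integer in place of a vector register, and the function name is hypothetical. It shows that a mask of all zeros or all ones behaves like a per-element select, while a mixed mask combines bits from both inputs, which an element-wise IF_THEN_ELSE cannot express.

```cpp
#include <cassert>
#include <cstdint>

// Each result bit comes from op2 where the mask bit in op3 is 1,
// and from op1 where it is 0.
constexpr std::uint32_t bit_select(std::uint32_t op1, std::uint32_t op2,
                                   std::uint32_t op3) {
  return (op1 & ~op3) | (op2 & op3);
}

// All-zeros and all-ones masks select a whole operand.
static_assert(bit_select(0xAAAAAAAAu, 0x55555555u, 0x00000000u) == 0xAAAAAAAAu,
              "mask 0 selects op1");
static_assert(bit_select(0xAAAAAAAAu, 0x55555555u, 0xFFFFFFFFu) == 0x55555555u,
              "mask ~0 selects op2");

// Usage example: a mixed mask takes the high half from op2 and the low
// half from op1.
void demo() {
  assert(bit_select(0xAAAAAAAAu, 0x55555555u, 0xFFFF0000u) == 0x5555AAAAu);
}
```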
Hongyu Wang	5720c450fa	AVX512FP16: Optimize _Float16 reciprocal for div and sqrt For _Float16 type, add insn and expanders to optimize x / y to x * rcp (y), and x / sqrt (y) to x * rsqrt (y). As half-precision floats have only a minor precision difference between div and mul * rcp, there is no need for Newton-Raphson approximation. gcc/ChangeLog: * config/i386/i386.c (use_rsqrt_p): Add mode parameter, enable HFmode rsqrt without TARGET_SSE_MATH. (ix86_optab_supported_p): Refactor rint, adjust floor, ceil, btrunc condition to be restricted by -ftrapping-math, adjust use_rsqrt_p function call. * config/i386/i386.md (rcphf2): New define_insn. (rsqrthf2): Likewise. * config/i386/sse.md (div<mode>3): Change VF2H to VF2. (div<mode>3): New expander for HF mode. (rsqrt<mode>2): Likewise. (avx512fp16_vmrcpv8hf2): New define_insn for rpad pass. (avx512fp16_vmrsqrtv8hf2): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512fp16-recip-1.c: New test. * gcc.target/i386/avx512fp16-recip-2.c: Ditto. * gcc.target/i386/pr102464.c: Add -fno-trapping-math.	2021-10-28 09:51:00 +08:00
GCC Administrator	04a2cf3fd6	Daily bump.	2021-10-28 00:16:39 +00:00
Bernhard Reutner-Fischer	b0b1d8d5d9	Fortran: Delete unused decl in intrinsic.h gcc/fortran/ChangeLog: * intrinsic.h (gfc_check_sum, gfc_resolve_atan2d, gfc_resolve_kill, gfc_resolve_kill_sub): Delete declaration.	2021-10-27 21:24:02 +02:00
Bernhard Reutner-Fischer	8bccf82905	Fortran: Delete unused decl in trans-types.h gcc/fortran/ChangeLog: * trans-types.h (gfc_convert_function_code): Delete.	2021-10-27 21:24:02 +02:00
Bernhard Reutner-Fischer	51227c5991	Fortran: Delete unused decl in trans-stmt.h gcc/fortran/ChangeLog: * trans-stmt.h (gfc_trans_deallocate_array): Delete.	2021-10-27 21:24:02 +02:00
Bernhard Reutner-Fischer	a470bfccf1	Fortran: make some trans-array functions static gcc/fortran/ChangeLog: * trans-array.c (gfc_trans_scalarized_loop_end): Make static. * trans-array.h (gfc_trans_scalarized_loop_end, gfc_conv_tmp_ref, gfc_conv_array_transpose): Delete declaration.	2021-10-27 21:24:02 +02:00
Bernhard Reutner-Fischer	e90e0301d5	Fortran: make some constructor* functions static gfc_constructor_expr_foreach and gfc_constructor_swap were just stubs. gcc/fortran/ChangeLog: * constructor.c (gfc_constructor_get_base): Make static. (gfc_constructor_expr_foreach, gfc_constructor_swap): Delete. * constructor.h (gfc_constructor_get_base): Remove declaration. (gfc_constructor_expr_foreach, gfc_constructor_swap): Delete.	2021-10-27 21:23:44 +02:00
Bernhard Reutner-Fischer	28b3a7788e	Fortran: make some match* functions static gfc_match_small_int_expr was unused, delete it. gfc_match_gcc_unroll should use gfc_match_small_literal_int and then gfc_match_small_int can be deleted since it will be unused. gcc/fortran/ChangeLog: * decl.c (gfc_match_old_kind_spec, set_com_block_bind_c, set_verify_bind_c_sym, set_verify_bind_c_com_block, get_bind_c_idents, gfc_match_suffix, gfc_get_type_attr_spec, check_extended_derived_type): Make static. (gfc_match_gcc_unroll): Add comment. * match.c (gfc_match_small_int_expr): Delete definition. * match.h (gfc_match_small_int_expr): Delete declaration. (gfc_match_name_C, gfc_match_old_kind_spec, set_com_block_bind_c, set_verify_bind_c_sym, set_verify_bind_c_com_block, get_bind_c_idents, gfc_match_suffix, gfc_get_type_attr_spec): Delete declaration.	2021-10-27 21:21:58 +02:00
Bernhard Reutner-Fischer	fd39c4bf55	Fortran: make some trans* functions static This makes some trans* functions static and deletes declarations of functions that either do not exist anymore like gfc_get_function_decl or that are unused like gfc_check_any_c_kind. gcc/fortran/ChangeLog: * expr.c (is_non_empty_structure_constructor): Make static. * gfortran.h (gfc_check_any_c_kind): Delete. * match.c (gfc_match_label): Make static. * match.h (gfc_match_label): Delete declaration. * scanner.c (file_changes_cur, file_changes_count, file_changes_allocated): Make static. * trans-expr.c (gfc_get_character_len): Make static. (gfc_class_len_or_zero_get): Make static. (VTAB_GET_FIELD_GEN): Undefine. (gfc_get_class_array_ref): Make static. (gfc_finish_interface_mapping): Make static. * trans-types.c (gfc_check_any_c_kind): Delete. (pfunc_type_node, dtype_type_node, gfc_get_ppc_type): Make static. * trans-types.h (gfc_get_ppc_type): Delete declaration. * trans.c (gfc_msg_wrong_return): Delete. * trans.h (gfc_class_len_or_zero_get, gfc_class_vtab_extends_get, gfc_vptr_extends_get, gfc_get_class_array_ref, gfc_get_character_len, gfc_finish_interface_mapping, gfc_msg_wrong_return, gfc_get_function_decl): Delete declaration.	2021-10-27 21:17:44 +02:00
H.J. Lu	1f98c4e0c5	libffi: Update LOCAL_PATCHES Add commit `90205f67e4` Author: Segher Boessenkool <segher@kernel.crashing.org> Date: Mon Oct 25 23:29:26 2021 +0000 rs6000: Fix bootstrap (libffi) This fixes bootstrap for the current problems building libffi. to LOCAL_PATCHES. * LOCAL_PATCHES: Add commit `90454a9008`.	2021-10-27 11:40:50 -07:00
Saagar Jha	11b9675774	Darwin, config: Amend for Darwin 21 / macOS 12. It seems that the OS major version is now tracking the kernel major version - 9. Minor version has been set to kernel minor - 1. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> Signed-off-by: Saagar Jha <saagar@saagarjha.com> gcc/ChangeLog: * config.gcc: Adjust for Darwin21. * config/darwin-c.c (macosx_version_as_macro): Likewise. * config/darwin-driver.c (validate_macosx_version_min): Likewise. (darwin_find_version_from_kernel): Likewise.	2021-10-27 19:27:25 +01:00
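The version arithmetic stated in the entry above can be written as a short C++ sketch. The struct and function names are hypothetical and this is not the code in darwin-driver.c. Clamping the minor version at zero is an assumption added here so that a kernel minor of 0 does not produce a negative number; the commit message does not say how that case is handled.

```cpp
#include <cassert>

// Hypothetical holder for an OS version pair.
struct OsVersion {
  int major_version;
  int minor_version;
};

// OS major = kernel major - 9; OS minor = kernel minor - 1.
// Assumption: the minor version is clamped at 0.
constexpr OsVersion macos_from_darwin(int kernel_major, int kernel_minor) {
  return OsVersion{kernel_major - 9,
                   kernel_minor > 0 ? kernel_minor - 1 : 0};
}

// Darwin 21 corresponds to macOS 12, as in the commit title.
static_assert(macos_from_darwin(21, 1).major_version == 12, "Darwin 21");

// Usage example.
void demo() {
  assert(macos_from_darwin(21, 1).minor_version == 0);
  assert(macos_from_darwin(21, 3).minor_version == 2);
}
```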
Aldy Hernandez	aeb10f8d2a	Kill known equivalences before a new assignment in the path solver. Every time we have a killing statement, we must also kill the relations seen so far. This is similar to what we did for the equivs inherent in PHIs along a path. Tested on x86-64 and ppc64le Linux. gcc/ChangeLog: * gimple-range-path.cc (path_range_query::range_defined_in_block): Call killing_def.	2021-10-27 20:15:00 +02:00
Aldy Hernandez	2f0b6a971a	Reorder relation calculating code in the path solver. Enabling the fully resolving threader triggers various relation ordering issues that have previously been dormant because the VRP hybrid threader (forward threader based) never gives us long enough paths for this to matter. The new threader spares no punches in finding non-obvious paths, so getting the relations right is paramount. This patch fixes a couple oversights that have gone undetected. First, some background. There are 3 types of relations along a path: a) Relations inherent in a PHI. b) Relations as a side-effect of evaluating a statement. c) Outgoing relations between blocks in a path. We must calculate these in their proper order, otherwise we can run into ordering issues. The current ordering is wrong, as we precalculate PHIs for _all_ blocks before anything else, and then proceed to register the relations throughout the path. Also, we fail to realize that a PHI whose argument is also defined in the PHIs block cannot be registered as an equivalence without causing more ordering issues. This patch fixes all the problems described above. With it we get a handful more net threads, but most importantly, we disallow some threads that were wrong. Tested on x86-64 and ppc64le Linux on the usual regstrap, plus by comparing the different thread counts before and after this patch. gcc/ChangeLog: * gimple-range-fold.cc (fold_using_range::range_of_range_op): Dump operands as well as relation. * gimple-range-path.cc (path_range_query::compute_ranges_in_block): Compute PHI relations first. Compute outgoing relations at the end. (path_range_query::compute_ranges): Remove call to compute_relations. (path_range_query::compute_relations): Remove. (path_range_query::maybe_register_phi_relation): New. (path_range_query::compute_phi_relations): Abstract out registering one PHI relation to... (path_range_query::compute_outgoing_relations): ...here. 
* gimple-range-path.h (class path_range_query): Remove compute_relations. Add maybe_register_phi_relation.	2021-10-27 20:14:15 +02:00
Aldy Hernandez	9f4edfc1fb	Kill second order relations in the path solver. My upcoming work replacing the VRP threaders with a fully resolving backward threader has tripped over various corner cases in the path sensitive relation oracle. This patch kills second order relations when we kill a relation. Tested on x86-64 and ppc64le Linux. Co-authored-by: Andrew MacLeod <amacleod@redhat.com> gcc/ChangeLog: * value-relation.cc (path_oracle::killing_def): Kill second order relations.	2021-10-27 20:14:15 +02:00
John David Anglin	a1957c9755	Fix warnings building linux-atomic.c and fptr.c on hppa64-linux The file fptr.c is specific to 32-bit hppa-linux and should not be included in LIB2ADD on hppa64-linux. There is a builtin type mismatch in linux-atomic.c using the type long long unsigned int for 64-bit atomic operations on hppa64-linux. 2021-10-27 John David Anglin <danglin@gcc.gnu.org> libgcc/ChangeLog: * config.host (hppa64--linux): Don't add pa/t-linux to tmake_file. * config/pa/linux-atomic.c: Define u8, u16 and u64 types. Use them in FETCH_AND_OP_2, OP_AND_FETCH_2, COMPARE_AND_SWAP_2, SYNC_LOCK_TEST_AND_SET_2 and SYNC_LOCK_RELEASE_1 macros. * config/pa/t-linux64 (LIB1ASMSRC): New define. (LIB1ASMFUNCS): Revise. (HOST_LIBGCC2_CFLAGS): Add "-DLINUX=1".	2021-10-27 18:00:36 +00:00
Martin Sebor	99b1021d21	Fix a typo. gcc/testsuite/ChangeLog: * gcc.dg/Warray-bounds-90.c: Fix a typo.	2021-10-27 09:40:11 -06:00
Martin Jambor	ab810952eb	ipa-cp: Use profile counters (or not) based on local availability This is a follow-up small patch to address Honza's review of my previous patch to select saner profile count to base heuristics on. Currently the IPA-CP heuristics switch to PGO-mode only if there are PGO counters available for any part of the call graph. This change makes it to switch to the PGO mode only if any of the incoming edges bringing in the constant in question had any ipa-quality counts on them. Consequently, if a part of the program is built with -fprofile-use and another part without, IPA-CP will use estimated-frequency-based heuristics for the latter. I still wonder whether this should only happen with flag_profile_partial_training on. It seems like we're behaving as if it was always on. gcc/ChangeLog: 2021-10-18 Martin Jambor <mjambor@suse.cz> * ipa-cp.c (good_cloning_opportunity_p): Decide whether to use profile feedback depending on their local availability.	2021-10-27 15:12:05 +02:00
Martin Jambor	ab1008255e	ipa-cp: Select saner profile count to base heuristics on When profile feedback is available, IPA-CP takes the count of the hottest node and then evaluates all call contexts relative to it. This means that typically almost no clones for specialized contexts are ever created because the maximum is some special function, called from everywhere (that is likely to get inlined anyway) and all the examined edges look cold compared to it. This patch changes the selection. It simply sorts counts of all edges eligible for cloning in a vector and then picks the count in the 90th percentile (the actual number is configurable via a parameter). I also tried more complex approaches which were summing the counts and picking the edge which together with all hotter edges accounted for a given portion of the total sum of all edge counts. But first it was not clear to me that they make more logical sense than the simple method, and practically I always also had to ignore a few percent of the hottest edges with really extreme counts (looking at bash and python). And when I had to do that anyway, it seemed simpler to just "ignore" more and take the first non-ignored count as the base. Nevertheless, if people think some more sophisticated method should be used anyway, I am willing to be persuaded. But this patch is a clear improvement over the current situation. gcc/ChangeLog: 2021-10-26 Martin Jambor <mjambor@suse.cz> * params.opt (param_ipa_cp_profile_count_base): New parameter. * doc/invoke.texi (Optimize Options): Add entry for ipa-cp-profile-count-base. * ipa-cp.c (max_count): Replace with base_count, replace all occurrences too, unless otherwise stated. (ipcp_cloning_candidate_p): identify mostly-directly called functions based on their counts, not max_count. (compare_edge_profile_counts): New function. 
(ipcp_propagate_stage): Instead of setting max_count, find the appropriate edge count in a sorted vector of counts of eligible edges and make it the base_count.	2021-10-27 15:11:47 +02:00
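The selection described in the entry above (sort the eligible edge counts, then take the count at a configurable percentile instead of the maximum) can be sketched as follows. The function name, the plain integer counts, and the exact index formula are assumptions made for illustration; the code in ipa-cp.c works on GCC's own profile count type and may sort and index differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the count at the given percentile of the sorted edge counts.
// Takes the vector by value so the caller's data is left untouched.
std::uint64_t pick_base_count(std::vector<std::uint64_t> counts,
                              unsigned percentile) {
  assert(!counts.empty() && percentile <= 100);
  std::sort(counts.begin(), counts.end());  // ascending order
  // Assumed index formula: position within the sorted vector, rounded down.
  std::size_t idx = (counts.size() - 1) * percentile / 100;
  return counts[idx];
}

// Usage example: one extremely hot edge no longer dominates the base.
void demo() {
  std::vector<std::uint64_t> counts{5, 3, 1000000, 1, 9, 7, 2, 8, 4, 6};
  assert(pick_base_count(counts, 90) == 9);         // 90th percentile
  assert(pick_base_count(counts, 100) == 1000000);  // the old "maximum"
}
```

With the maximum as the base, every other edge in this example would look cold by comparison; with the 90th percentile as the base, the ordinary edges are judged against a count of 9.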
Martin Jambor	d1e2e4f9ce	ipa-cp: Fix updating of profile counts and self-gen value evaluation IPA-CP does not do a reasonable job when it is updating profile counts after it has created clones of recursive functions. This patch addresses that by: 1. Only updating counts for special-context clones. When a clone is created for all contexts, the original is going to be dead and the cgraph machinery has copied counts to the new node which is the right thing to do. Therefore updating counts has been moved from create_specialized_node to decide_about_value and decide_whether_version_node. 2. The current profile updating code artificially increased the assumed old count when the sum of counts of incoming edges to both the original and new node were bigger than the count of the original node. This always happened when a self-recursive edge from the clone was also redirected to the clone because both the original edge and its clone had original high counts. This crutch was removed and replaced by the next point. 3. When cloning also redirects a self-recursive clone to the clone itself, new logic has been added to divide the counts brought by such recursive edges between the original node and the clone. This is impossible to do well without special knowledge about the function and which non-recursive entry calls are responsible for what portion of recursion depth, so the approach taken is rather crude. For local nodes, we detect the case when the original node is never called (in the training run at least) with another value and if so, steal all its counts as if it were dead. If that is not the case, we try to divide the count brought by recursive edges (or rather not brought by direct edges) proportionally to the counts brought by non-recursive edges - but with artificial limits in place so that we do not take too many or too few, because that was happening with detrimental effect in mcf_r. 4. 
When cloning creates extra clones for values brought by a formerly self-recursive edge with an arithmetic pass-through jump function on it, such as it does in exchange2_r, all such clones are processed at once rather than one after another. The counts of all such nodes are distributed evenly (modulo even-formerly-non-recursive-edges) and the whole situation is then fixed up so that the edge counts fit. This is what new function update_counts_for_self_gen_clones does. 5. When values brought by a formerly self-recursive edge with an arithmetic pass-through jump function on it are evaluated, the heuristics assume that the vast majority of node counts are the result of recursive calls, so we simply divide those by the number of clones there would be if we created another one. 6. The mechanisms in init_caller_stats, gather_caller_stats and get_info_about_necessary_edges were enhanced to gather data required for the above and a missing check not to count dead incoming edges was also added. gcc/ChangeLog: 2021-10-15 Martin Jambor <mjambor@suse.cz> * ipa-cp.c (struct caller_statistics): New fields rec_count_sum, n_nonrec_calls and itself, document all fields. (init_caller_stats): Initialize the above new fields. (gather_caller_stats): Gather self-recursive counts and calls number. (get_info_about_necessary_edges): Gather counts of self-recursive and other edges bringing in the requested value separately. (dump_profile_updates): Rework to dump info about a single node only. (lenient_count_portion_handling): New function. (struct gather_other_count_struct): New type. (gather_count_of_non_rec_edges): New function. (struct desc_incoming_count_struct): New type. (analyze_clone_icoming_counts): New function. (adjust_clone_incoming_counts): Likewise. (update_counts_for_self_gen_clones): Likewise. (update_profiling_info): Rewritten. (update_specialized_profile): Adjust call to dump_profile_updates. (create_specialized_node): Do not update profiling info. 
(decide_about_value): New parameter self_gen_clones, either push new clones into it or update their profile counts. For self-recursively generated values, use a portion of the node count instead of count from self-recursive edges to estimate goodness. (decide_whether_version_node): Gather clones for self-generated values in a new vector, update their profiles at once at the end.	2021-10-27 14:49:56 +02:00
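Point 3 of the commit above, dividing the recursion-induced count between the original node and its clone in proportion to the counts arriving over non-recursive edges, could look something like the sketch below. The names `CountSplit` and `split_recursive_count`, the use of plain integers and doubles, and in particular the 1/8 and 7/8 limits are assumptions chosen for illustration. The commit only says that "artificial limits" exist, not what they are.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical result type: how much of the recursive count each node gets.
struct CountSplit {
  std::uint64_t to_clone;
  std::uint64_t to_original;
};

// Hypothetical helper (not a GCC function): split `rec_sum`, the count brought
// by self-recursive edges, proportionally to the counts brought by
// non-recursive edges into the clone and into the original node.
// `min_share` and `max_share` are made-up limits that keep the clone from
// taking too little or too much.
CountSplit split_recursive_count(std::uint64_t rec_sum,
                                 std::uint64_t nonrec_clone,
                                 std::uint64_t nonrec_original,
                                 double min_share = 0.125,
                                 double max_share = 0.875)
{
  assert(min_share <= max_share);
  std::uint64_t total = nonrec_clone + nonrec_original;
  // Without any non-recursive entries there is no basis for a ratio;
  // fall back to an even split.
  double share = total == 0
                   ? 0.5
                   : static_cast<double>(nonrec_clone) / static_cast<double>(total);
  // Apply the artificial limits.
  share = std::max(min_share, std::min(max_share, share));
  std::uint64_t to_clone =
      static_cast<std::uint64_t>(static_cast<double>(rec_sum) * share);
  return {to_clone, rec_sum - to_clone};
}
```

The two parts always add up to `rec_sum`, so no count is created or lost by the split. Only its distribution between the two nodes changes.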
Richard Biener	b528e226d1	Refactor try_vectorize_loop_1 This refactors epilogue loop handling in try_vectorize_loop_1 to not suggest we're analyzing those there by splitting out the transform phase which then can handle the epilogues. 2021-10-27 Richard Biener <rguenther@suse.de> * tree-vectorizer.c (vect_transform_loops): New function, split out from ... (try_vectorize_loop_1): ... here. Simplify as epilogues are now fully handled in the split part.	2021-10-27 11:30:16 +02:00
Tobias Burnus	7f899b23f3	Fortran: Fix 'select rank' for allocatables/pointers gcc/fortran/ChangeLog: * trans-stmt.c (gfc_trans_select_rank_cases): Fix condition for allocatables/pointers. gcc/testsuite/ChangeLog: * gfortran.dg/PR93963.f90: Extend testcase by scan-tree-dump test.	2021-10-27 10:59:27 +02:00
Jakub Jelinek	4f1fe0dc25	testsuite: Fix up gcc.dg/pr102897.c testcase [PR102897] The testcase FAILs on i686-linux due to: FAIL: gcc.dg/pr102897.c (test for excess errors) Excess errors: .../gcc/gcc/testsuite/gcc.dg/pr102897.c:11:1: warning: MMX vector return without MMX enabled changes the ABI [-Wpsabi] .../gcc/gcc/testsuite/gcc.dg/pr102897.c:10:10: warning: MMX vector argument without MMX enabled changes the ABI [-Wpsabi] Fixed by adding -Wno-psabi. 2021-10-27 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/102897 * gcc.dg/pr102897.c: Add -Wno-psabi to dg-options.	2021-10-27 09:43:40 +02:00
Jakub Jelinek	eef8114906	openmp: Document that non-rect loops are not supported in Fortran yet I've found we claim to support non-rectangular loops, but don't actually support those in Fortran, as can be seen on: integer i, j !$omp parallel do collapse(2) do i = 0, 10 do j = 0, i end do end do end To support this, the Fortran FE needs to allow the valid forms of non-rectangular loops and disallow others, so mainly it needs its updated version of c-omp.c c_omp_check_loop_iv etc., plus for non-rectangular lb or ub expressions emit a TREE_VEC instead of normal expression as the C/C++ FE do, plus testsuite coverage. 2021-10-27 Jakub Jelinek <jakub@redhat.com> * libgomp.texi (OpenMP 5.0): Mention that Non-rectangular loop nests aren't implemented for Fortran yet.	2021-10-27 09:24:46 +02:00
Jakub Jelinek	2084b5f42a	openmp: Allow non-rectangular loops with pointer iterators This patch handles pointer iterators for non-rectangular loops. They are more limited than integral iterators of non-rectangular loops, in particular only var-outer, var-outer + a2, a2 + var-outer or var-outer - a2 can appear in lb or ub where a2 is some integral loop invariant expression, so no e.g. multiplication etc. 2021-10-27 Jakub Jelinek <jakub@redhat.com> gcc/ * omp-expand.c (expand_omp_for_init_counts): Handle non-rectangular iterators with pointer types. (expand_omp_for_init_vars, extract_omp_for_update_vars): Likewise. gcc/c-family/ * c-omp.c (c_omp_check_loop_iv_r): Don't clear 3rd bit for POINTER_PLUS_EXPR. (c_omp_check_nonrect_loop_iv): Handle POINTER_PLUS_EXPR. (c_omp_check_loop_iv): Set kind even if the iterator is non-integral. gcc/testsuite/ * c-c++-common/gomp/loop-8.c: New test. * c-c++-common/gomp/loop-9.c: New test. libgomp/ * testsuite/libgomp.c/loop-26.c: New test. * testsuite/libgomp.c/loop-27.c: New test.	2021-10-27 09:22:07 +02:00
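As an illustration of the kind of loop nest the commit above is about, here is a small triangular nest whose inner lower bound is the outer pointer iterator itself (the `var-outer` form). The function name `triangular_sum` is made up for this example. Compiling it with OpenMP enabled requires a compiler that includes this change. Without OpenMP the pragma is ignored and the loops run sequentially with the same result.

```cpp
#include <cassert>

// Hypothetical example: sum every element once for each outer position at or
// before it. The inner loop starts at the outer pointer iterator, which makes
// the nest non-rectangular.
long triangular_sum(const int *begin, const int *end)
{
  assert(begin <= end);
  long sum = 0;
  #pragma omp parallel for collapse(2) reduction(+:sum)
  for (const int *p = begin; p < end; p++)
    for (const int *q = p; q < end; q++)  // lower bound is var-outer
      sum += *q;
  return sum;
}
```

For the array `{1, 2, 3}` the nest visits (1, 2, 3), then (2, 3), then (3), giving 14.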
Jakub Jelinek	6b0f35299b	openmp: Don't reject some valid initializers or conditions of non-rectangular loops [PR102854] In C++, if an iterator has or might have (e.g. dependent type) class type we remember the original init expressions and check those separately for presence of iterators, because for class iterators we turn those into expressions that always do contain reference to the current iterator. But this resulted in rejecting valid non-rectangular loop where the dependent type is later instantiated to an integral type. Non-rectangular loops with class random access iterators remain broken, that is something to be fixed incrementally. 2021-10-27 Jakub Jelinek <jakub@redhat.com> PR c++/102854 gcc/c-family/ * c-common.h (c_omp_check_loop_iv_exprs): Add enum tree_code argument. * c-omp.c (c_omp_check_loop_iv_r): For trees other than decls, TREE_VEC, PLUS_EXPR, MINUS_EXPR, MULT_EXPR, POINTER_PLUS_EXPR or conversions temporarily clear the 3rd bit from d->kind while walking subtrees. (c_omp_check_loop_iv_exprs): Add CODE argument. Or in 4 into data.kind if possibly non-rectangular. gcc/cp/ * semantics.c (handle_omp_for_class_iterator, finish_omp_for): Adjust c_omp_check_loop_iv_exprs caller. gcc/testsuite/ * g++.dg/gomp/loop-3.C: Don't expect some errors. * g++.dg/gomp/loop-7.C: New test.	2021-10-27 09:16:48 +02:00
Jakub Jelinek	7473b8a904	c++: Reject addresses of immediate functions in constexpr vars inside of immediate functions or consteval if [PR102753] Another thing that wasn't in the previous patch, but I'm wondering whether we don't handle it incorrectly. constexpr.c has: /* Check that immediate invocation does not return an expression referencing any immediate function decls. They need to be allowed while parsing immediate functions, but can't leak outside of them. */ if (is_consteval && t != r && (current_function_decl == NULL_TREE || !DECL_IMMEDIATE_FUNCTION_P (current_function_decl))) as condition for the discovery of embedded immediate FUNCTION_DECLs (or now PTRMEM_CSTs). If I remove the && (current... ..._decl)) then g++.dg/cpp2a/consteval7.C's struct S { int b; int (*c) (); }; consteval S baz () { return { 5, foo }; } consteval int qux () { S s = baz (); return s.b + s.c (); } consteval int quux () { constexpr S s = baz (); return s.b + s.c (); } quux line fails, but based on http://eel.is/c++draft/expr.const#11 I wonder if it shouldn't fail (clang++ -std=c++20 rejects it), and be only accepted without the constexpr keyword before S s. Also wonder about e.g. consteval int foo () { return 42; } consteval int bar () { auto fn1 = foo; // This must be ok constexpr auto fn2 = foo; // Isn't this an error? return fn1 () + fn2 (); } constexpr int baz () { if consteval { auto fn1 = foo; // This must be ok constexpr auto fn2 = foo; // Isn't this an error? return fn1 () + fn2 (); } return 0; } auto a = bar (); static_assert (bar () == 84); static_assert (baz () == 84); (again, clang++ -std=c++20 rejects the fn2 = foo; case, but doesn't implement consteval if, so can't test the other one). 
For taking address of an immediate function or method if it is taken outside of immediate function context we already have diagnostics about it, but shouldn't the immediate FUNCTION_DECL discovery in cxx_eval_outermost_constant_expression be instead guarded with something like if (is_consteval || in_immediate_context ()) and be done regardless of whether t != r? 2021-10-27 Jakub Jelinek <jakub@redhat.com> PR c++/102753 * constexpr.c (cxx_eval_outermost_constant_expr): Perform find_immediate_fndecl discovery if is_consteval or in_immediate_context () rather than if is_consteval, t != r and not in immediate function's body. * g++.dg/cpp2a/consteval7.C: Expect diagnostics on quux. * g++.dg/cpp2a/consteval24.C: New test. * g++.dg/cpp23/consteval-if12.C: New test.	2021-10-27 09:08:19 +02:00

189389 Commits All Branches Search
