When registering relations in the oracle, search for other relations which
imply new transitive relations.
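For illustration, a sketch of the kind of deduction this enables
(hypothetical example, not taken from the new evrp-trans.c test):

/* From x < y and y < z the oracle can now register the transitive
   relation x < z, so the final comparison folds to true.  */
int f (int x, int y, int z)
{
  if (x < y && y < z)
    return x < z;  /* deduced true via the registered transitive relation */
  return -1;
}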
gcc/
* value-relation.cc (rr_transitive_table): New.
(relation_transitive): New.
(value_relation::swap): Remove.
(value_relation::apply_transitive): New.
(relation_oracle::relation_oracle): Allocate a new tmp bitmap.
(relation_oracle::register_relation): Call register_transitives.
(relation_oracle::register_transitives): New.
* value-relation.h (relation_oracle): Add new temporary bitmap and
methods.
gcc/testsuite/
* gcc.dg/predict-1.c: Disable evrp.
* gcc.dg/tree-ssa/evrp-trans.c: New.
Clang warns about this, but GCC doesn't (see PR c++/102036).
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* src/c++11/cxx11-shim_facets.cc: Fix mismatched class-key in
explicit instantiation definitions.
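For reference, a minimal sketch of the mismatch Clang diagnoses
(hypothetical names, not the actual cxx11-shim_facets.cc code):

// The explicit instantiation definition uses class-key 'class' for a
// type declared with 'struct'; Clang (-Wmismatched-tags) warns here.
template<typename T> struct shim { T value; };
template class shim<int>;  // fix: 'template struct shim<int>;'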
Broadcast from integer to a pseudo vector register instead of a hard
vector register to allow LRA to remove a redundant move instruction
after the broadcast.
gcc/
PR target/102021
* config/i386/i386-expand.c (ix86_expand_vector_move): Broadcast
from integer to a pseudo vector register.
gcc/testsuite/
PR target/102021
* gcc.target/i386/pr100865-10b.c: Expect vzeroupper.
* gcc.target/i386/pr100865-4b.c: Likewise.
* gcc.target/i386/pr100865-6b.c: Expect vmovdqu and vzeroupper.
* gcc.target/i386/pr100865-7b.c: Likewise.
* gcc.target/i386/pr102021.c: New test.
This avoids leaving scalar if-converted code around for the case
of BB vectorizing an if-converted loop body when using the very-cheap
cost model.  In this case we scan the non-vectorized scalar stmts in
the vectorized basic block for COND_EXPRs and force the vectorization
to be marked as not profitable.
The patch also makes sure to always consider all BB vectorization
subgraphs together for costing purposes when vectorizing an
if-converted loop body.
2021-08-24 Richard Biener <rguenther@suse.de>
PR tree-optimization/100089
* tree-vectorizer.h (vect_slp_bb): Rename to ...
(vect_slp_if_converted_bb): ... this and get the original
loop as new argument.
* tree-vectorizer.c (try_vectorize_loop_1): Revert previous fix,
pass original loop to vect_slp_if_converted_bb.
* tree-vect-slp.c (vect_bb_vectorization_profitable_p):
If orig_loop was passed, scan the non-vectorized stmts
for COND_EXPRs and force not profitable if found.
(vect_slp_region): Pass down all SLP instances to costing
if orig_loop was specified.
(vect_slp_bbs): Pass through orig_loop.
(vect_slp_bb): Rename to ...
(vect_slp_if_converted_bb): ... this and get the original
loop as new argument.
(vect_slp_function): Adjust.
For Armv8.1-m we generate code that emits VLLDM directly and do not
rely on support code in the library, so emit the mitigation directly
as well, when required. In this case, we can use the compiler options
to determine when to apply the fix and when it is safe to omit it.
gcc:
PR target/102035
* config/arm/arm.md (attribute arch): Add fix_vlldm.
(arch_enabled): Use it.
* config/arm/vfp.md (lazy_store_multiple_insn): Add alternative to
use when erratum mitigation is needed.
Add the recommended erratum mitigation sequence to
__gnu_cmse_nonsecure_call for use on Armv8-m.main devices. Since this
is in the library code we cannot know in advance whether the core we
are running on will be affected by this, so always enable it.
libgcc:
PR target/102035
* config/arm/cmse_nonsecure_call.S (__gnu_cmse_nonsecure_call):
Add vlldm erratum work-around.
Add a new option, -mfix-cmse-cve-2021-35465 and document it. Enable it
automatically for cortex-m33, cortex-m35p and cortex-m55.
gcc:
PR target/102035
* config/arm/arm.opt (mfix-cmse-cve-2021-35465): New option.
* doc/invoke.texi (Arm Options): Document it.
* config/arm/arm-cpus.in (quirk_vlldm): New feature bit.
(ALL_QUIRKS): Add quirk_vlldm.
(cortex-m33): Add quirk_vlldm.
(cortex-m35p, cortex-m55): Likewise.
* config/arm/arm.c (arm_option_override): Enable fix_vlldm if
targeting an affected CPU and not explicitly controlled on
the command line.
The test for CMSE support being available in hardware currently
relies on the compiler not optimizing away a secure gateway operation.
But even that is suspect, because the SG instruction is just a NOP
on armv8-m implementations that do not support the security extension.
Replace the existing test with a new one that reads and checks
the appropriate hardware feature register (memory mapped). This has
to be run from secure mode, but that shouldn't matter, because if we
can't do that we can't really test the CMSE extensions anyway. We
retain the SG instruction to ensure the test can't pass accidentally
if run on pre-armv8-m devices.
gcc/testsuite:
* lib/target-supports.exp (check_effective_target_arm_cmse_hw):
Check the CMSE feature register, rather than relying on the
SG operation causing an execution fault.
Both lazy_store_multiple_insn and lazy_load_multiple_insn contain
invalid RTL (e.g. they contain a post_inc statement outside of a mem).
What's more, the instructions concerned do not modify their input
address register.  We probably got away with this because they are
generated so late in the compilation that no subsequent pass needed to
understand them.  Nevertheless, this could cause problems someday, so
fix them to use a simple, legal unspec.
gcc:
* config/arm/vfp.md (lazy_store_multiple_insn): Rewrite as valid RTL.
(lazy_load_multiple_insn): Likewise.
Also optimize the 3 forms below to vpternlog, where op1, op2 and op3
are register_operand or unary_p as (not reg):
A: (any_logic (any_logic op1 op2) op3)
B: (any_logic (any_logic op1 op2) (any_logic op3 op4)), where op3/op4
must be equal to op1/op2
C: (any_logic (any_logic (any_logic op1 op2) op3) op4), where op3/op4
must be equal to op1/op2
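As a rough user-level sketch of form A (assuming -mavx512f; this is
not one of the new testcases):

#include <immintrin.h>

/* (ior (and a b) c): with this patch the combined expression can be
   matched to a single vpternlog instruction.  */
__m512i
f (__m512i a, __m512i b, __m512i c)
{
  return (a & b) | c;
}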
gcc/ChangeLog:
PR target/101989
* config/i386/i386.c (ix86_rtx_costs): Define cost for
UNSPEC_VTERNLOG.
* config/i386/i386.h (STRIP_UNARY): New macro.
* config/i386/predicates.md (reg_or_notreg_operand): New
predicate.
* config/i386/sse.md (*<avx512>_vternlog<mode>_all): New define_insn.
(*<avx512>_vternlog<mode>_1): New pre_reload
define_insn_and_split.
(*<avx512>_vternlog<mode>_2): Ditto.
(*<avx512>_vternlog<mode>_3): Ditto.
(any_logic1, any_logic2): New code iterators.
(logic_op): New code attribute.
(ternlogsuffix): Extend to VNxDF and VNxSF.
gcc/testsuite/ChangeLog:
PR target/101989
* gcc.target/i386/pr101989-1.c: New test.
* gcc.target/i386/pr101989-2.c: New test.
* gcc.target/i386/avx512bw-shiftqihi-constant-1.c: Adjust testcase.
This makes use of the estimated number of iterations of the inner loop
to limit --param vect-inner-loop-cost-factor scaling. It also reduces
the maximum value of vect-inner-loop-cost-factor to 10000 making it
less likely to cause overflow of costs.
2021-08-23 Richard Biener <rguenther@suse.de>
* doc/invoke.texi (vect-inner-loop-cost-factor): Adjust.
* params.opt (--param vect-inner-loop-cost-factor): Adjust
maximum value.
* tree-vect-loop.c (vect_analyze_loop_form): Initialize
inner_loop_cost_factor to the minimum of the estimated number
of iterations of the inner loop and vect-inner-loop-cost-factor.
There are a few problems with download_prerequisites, described in
PR 82704.  The first is that on the busybox versions of shasum and
md5sum the extended option --check doesn't exist, so just use -c.
The second issue is that the code choosing which shasum program to
use is included twice and the two copies differ, so move the choice
of checksum program to after argument parsing.  The last issue is
that the --md5 option has been broken for some time now, as the
program is named md5sum and not just md5, and nobody updated the
switch table to be correct.
contrib/ChangeLog:
PR other/82704
* download_prerequisites: Fix issues with --md5 and
--sha512 options.
Back in June I briefly mentioned in one of my gcc-patches posts that
a change that should have always reduced code size would mysteriously
occasionally result in slightly larger code (according to CSiBE):
https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573233.html
Investigating further, the cause turns out to be that x86_64's
scalar-to-vector (stv) pass is relying on poor estimates of the size
costs/benefits. This patch tweaks the backend's compute_convert_gain
method to provide slightly more accurate values when compiling with
-Os. Compilation without -Os is (should be) unaffected. And for
completeness, I'll mention that the stv pass is a net win for code
size so it's much better to improve its heuristics than simply gate
the pass on !optimize_for_size.
The net effect of this change is to save 1399 bytes on the CSiBE
code size benchmark when compiling with -Os.
2021-08-24 Roger Sayle <roger@nextmovesoftware.com>
Richard Biener <rguenther@suse.de>
gcc/ChangeLog
* config/i386/i386-features.c (compute_convert_gain): Provide
more accurate values for CONST_INT, when optimizing for size.
* config/i386/i386.c (COSTS_N_BYTES): Move definition from here...
* config/i386/i386.h (COSTS_N_BYTES): to here.
My sincere apologies to everyone (again). As diagnosed by
Jakub Jelinek, my recent patch to fold the signedness of LSHIFT_EXPR
needs to be careful not to attempt transforming a left shift in an
integer type into an invalid left shift of a pointer type.
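A hedged sketch of the problematic shape (hypothetical reduction; the
actual fold-convlshift-3.c testcase may differ):

/* The recently added fold must not rewrite this as a left shift
   performed in the pointer type of p, which would be invalid.  */
__INTPTR_TYPE__
f (char *p)
{
  return (__INTPTR_TYPE__) p << 2;
}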
2021-08-24 Roger Sayle <roger@nextmovesoftware.com>
Jakub Jelinek <jakub@redhat.com>
gcc/ChangeLog
PR middle-end/102029
* match.pd (shift transformations): Add an additional check for
!POINTER_TYPE_P in the recently added left shift transformation.
gcc/testsuite/ChangeLog
PR middle-end/102029
* gcc.dg/fold-convlshift-3.c: New test case.
When investigating false positives on the Linux kernel from
-Wanalyzer-use-of-uninitialized-value, I noticed that the existing
implementation of switch statements in the analyzer is broken.
Specifically, the existing implementation assumes a 1:1 association
between CFG out-edges from the basic block and case labels in the
gimple switch statement. This happened to be the case in the
examples I had tested, but there is no such association in general.
In particular, in the motivating example:
arch/x86/kernel/cpu/mtrr/if.c: mtrr_ioctl
the switch statement has 3 blocks, each covering multiple ranges of
ioctl command IDs for which different local variables are initialized,
which the existing implementation gets badly wrong. [1]
This patch reimplements switch handling in the analyzer to eliminate
this false assumption - instead, for each out-edge we gather the set
of case labels for that out-edge, and use that to determine the
set of value ranges for the edge. Avoiding false positives for the
above example requires that we accurately track value ranges for
symbolic values, so the patch extends constraint_manager with a new
bounded_ranges_constraint, adding just enough information to capture the
ranges for switch statements whilst retaining compatibility with the
existing constraint-handling (ultimately I'd prefer to simply throw
all of this into a SAT solver and let it track things).
Doing so fixes the false positives seen on the Linux kernel and an
existing xfail in the test suite.
The patch also fixes a long-standing bug in
constraint_manager::add_unknown_constraint when updating constraints
due to combining equivalence classes, spotted when debugging the
same logic for the new kind of constraints.
[1] a reduced version of this code is captured in this patch, in
gcc.dg/analyzer/torture/switch-3.c
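A hedged sketch of the shape involved (hypothetical code, much simpler
than switch-3.c):

/* One CFG out-edge serves several case labels: the edge to the block
   initializing 'a' carries cases 1, 3 and 5-7, so the per-edge value
   ranges must be the union of all its labels, not a single label.  */
int f (unsigned cmd)
{
  int a;
  switch (cmd)
    {
    case 1:
    case 3:
    case 5 ... 7:  /* GCC case-range extension */
      a = 42;
      break;
    default:
      return 0;
    }
  return a;  /* 'a' is initialized on every path reaching here */
}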
gcc/analyzer/ChangeLog:
* analyzer.h (struct rejected_constraint): Convert to...
(class rejected_constraint): ...this.
(class bounded_ranges): New forward decl.
(class bounded_ranges_manager): New forward decl.
* constraint-manager.cc: Include "analyzer/analyzer-logging.h" and
"tree-pretty-print.h".
(can_plus_one_p): New.
(plus_one): New.
(can_minus_one_p): New.
(minus_one): New.
(bounded_range::bounded_range): New.
(dump_cst): New.
(bounded_range::dump_to_pp): New.
(bounded_range::dump): New.
(bounded_range::to_json): New.
(bounded_range::set_json_attr): New.
(bounded_range::contains_p): New.
(bounded_range::intersects_p): New.
(bounded_range::operator==): New.
(bounded_range::cmp): New.
(bounded_ranges::bounded_ranges): New.
(bounded_ranges::bounded_ranges): New.
(bounded_ranges::bounded_ranges): New.
(bounded_ranges::canonicalize): New.
(bounded_ranges::validate): New.
(bounded_ranges::operator==): New.
(bounded_ranges::dump_to_pp): New.
(bounded_ranges::dump): New.
(bounded_ranges::to_json): New.
(bounded_ranges::eval_condition): New.
(bounded_ranges::contain_p): New.
(bounded_ranges::cmp): New.
(bounded_ranges_manager::~bounded_ranges_manager): New.
(bounded_ranges_manager::get_or_create_empty): New.
(bounded_ranges_manager::get_or_create_point): New.
(bounded_ranges_manager::get_or_create_range): New.
(bounded_ranges_manager::get_or_create_union): New.
(bounded_ranges_manager::get_or_create_intersection): New.
(bounded_ranges_manager::get_or_create_inverse): New.
(bounded_ranges_manager::consolidate): New.
(bounded_ranges_manager::get_or_create_ranges_for_switch): New.
(bounded_ranges_manager::create_ranges_for_switch): New.
(bounded_ranges_manager::make_case_label_ranges): New.
(bounded_ranges_manager::log_stats): New.
(bounded_ranges_constraint::print): New.
(bounded_ranges_constraint::to_json): New.
(bounded_ranges_constraint::operator==): New.
(bounded_ranges_constraint::add_to_hash): New.
(constraint_manager::constraint_manager): Update for new field
m_bounded_ranges_constraints.
(constraint_manager::operator=): Likewise.
(constraint_manager::hash): Likewise.
(constraint_manager::operator==): Likewise.
(constraint_manager::print): Likewise.
(constraint_manager::dump_to_pp): Likewise.
(constraint_manager::to_json): Likewise.
(constraint_manager::add_unknown_constraint): Update the lhs_ec_id
if necessary in existing constraints when combining equivalence
classes. Add similar code for handling
m_bounded_ranges_constraints.
(constraint_manager::add_constraint_internal): Add comment.
(constraint_manager::add_bounded_ranges): New.
(constraint_manager::eval_condition): Use new field
m_bounded_ranges_constraints.
(constraint_manager::purge): Update bounded_ranges_constraint
instances.
(constraint_manager::canonicalize): Update for new field.
(merger_fact_visitor::on_ranges): New.
(constraint_manager::for_each_fact): Use new field
m_bounded_ranges_constraints.
(constraint_manager::validate): Fix off-by-one error needed due
to bug fixed above in add_unknown_constraint. Validate the EC IDs
in m_bounded_ranges_constraints.
(constraint_manager::get_range_manager): New.
(selftest::assert_dump_bounded_range_eq): New.
(ASSERT_DUMP_BOUNDED_RANGE_EQ): New.
(selftest::test_bounded_range): New.
(selftest::assert_dump_bounded_ranges_eq): New.
(ASSERT_DUMP_BOUNDED_RANGES_EQ): New.
(selftest::test_bounded_ranges): New.
(selftest::run_constraint_manager_tests): Call the new selftests.
* constraint-manager.h (struct bounded_range): New.
(struct bounded_ranges): New.
(template <> struct default_hash_traits<bounded_ranges::key_t>): New.
(class bounded_ranges_manager): New.
(fact_visitor::on_ranges): New pure virtual function.
(class bounded_ranges_constraint): New.
(constraint_manager::add_bounded_ranges): New decl.
(constraint_manager::get_range_manager): New decl.
(constraint_manager::m_bounded_ranges_constraints): New field.
* diagnostic-manager.cc (epath_finder::process_worklist_item):
Transfer ownership of rc to add_feasibility_problem.
* engine.cc (feasibility_problem::dump_to_pp): Use get_model.
* feasible-graph.cc (infeasible_node::dump_dot): Update for
conversion of m_rc to a pointer.
(feasible_graph::add_feasibility_problem): Pass RC by pointer and
take ownership.
* feasible-graph.h (infeasible_node::infeasible_node): Pass RC by
pointer and take ownership.
(infeasible_node::~infeasible_node): New.
(infeasible_node::m_rc): Convert to a pointer.
(feasible_graph::add_feasibility_problem): Pass RC by pointer and
take ownership.
* region-model-manager.cc: Include
"analyzer/constraint-manager.h".
(region_model_manager::region_model_manager): Initialize new
field m_range_mgr.
(region_model_manager::~region_model_manager): Delete it.
(region_model_manager::log_stats): Call log_stats on it.
* region-model.cc (region_model::add_constraint): Use new subclass
rejected_op_constraint.
(region_model::apply_constraints_for_gswitch): Reimplement using
bounded_ranges_manager.
(rejected_constraint::dump_to_pp): Convert to...
(rejected_op_constraint::dump_to_pp): ...this.
(rejected_ranges_constraint::dump_to_pp): New.
* region-model.h (struct purge_stats): Add field
m_num_bounded_ranges_constraints.
(region_model_manager::get_range_manager): New.
(region_model_manager::m_range_mgr): New.
(region_model::get_range_manager): New.
(struct rejected_constraint): Split into...
(class rejected_constraint):...this new abstract base class,
and...
(class rejected_op_constraint): ...this new concrete subclass.
(class rejected_ranges_constraint): New.
* supergraph.cc: Include "tree-cfg.h".
(supergraph::supergraph): Drop idx param from add_cfg_edge.
(supergraph::add_cfg_edge): Drop idx param.
(switch_cfg_superedge::switch_cfg_superedge): Move here from
header. Populate m_case_labels with all cases which go to DST.
(switch_cfg_superedge::dump_label_to_pp): Reimplement to use
m_case_labels.
(switch_cfg_superedge::get_case_label): Delete.
* supergraph.h (supergraph::add_cfg_edge): Drop "idx" param.
(switch_cfg_superedge::switch_cfg_superedge): Drop idx param and
move implementation to supergraph.cc.
(switch_cfg_superedge::get_case_label): Delete.
(switch_cfg_superedge::get_case_labels): New.
(switch_cfg_superedge::m_idx): Delete.
(switch_cfg_superedge::m_case_labels): New field.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/switch.c: Remove xfail. Add various tests.
* gcc.dg/analyzer/torture/switch-2.c: New test.
* gcc.dg/analyzer/torture/switch-3.c: New test.
* gcc.dg/analyzer/torture/switch-4.c: New test.
* gcc.dg/analyzer/torture/switch-5.c: New test.
2021-08-23 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
* config/rs6000/rs6000-gen-builtins.c (parse_bif_entry): Don't call
asprintf, which is not available on AIX.
gcc/analyzer/ChangeLog:
PR analyzer/101837
* analyzer.cc (maybe_reconstruct_from_def_stmt): Bail if fn is
NULL, and assert that it's non-NULL before passing it to
build_call_array_loc.
gcc/testsuite/ChangeLog:
PR analyzer/101837
* gcc.dg/analyzer/pr101837.c: New test.
gcc/analyzer/ChangeLog:
PR analyzer/101962
* region-model.cc (region_model::eval_condition_without_cm):
Refactor comparison against zero, adding a check for
POINTER_PLUS_EXPR of non-NULL.
gcc/testsuite/ChangeLog:
PR analyzer/101962
* gcc.dg/analyzer/data-model-23.c: New test.
* gcc.dg/analyzer/pr101962.c: New test.
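For context, a minimal sketch of the reasoning the refactored check
enables (hypothetical, not the contents of the new tests):

#include <stddef.h>

/* Once p is known to be non-NULL, the analyzer can evaluate the
   comparison of the POINTER_PLUS_EXPR p + 1 against NULL directly.  */
int f (int *p)
{
  if (p == NULL)
    return 0;
  int *q = p + 1;    /* POINTER_PLUS_EXPR of a non-NULL pointer */
  return q != NULL;  /* known true */
}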
gcc/analyzer/ChangeLog:
* store.cc (bit_range::intersects_p): New overload.
(bit_range::operator-): New.
(binding_cluster::maybe_get_compound_binding): Handle the partial
overlap case.
(selftest::test_bit_range_intersects_p): Add test coverage for
new overload of bit_range::intersects_p.
* store.h (bit_range::intersects_p): New overload.
(bit_range::operator-): New.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/data-model-22.c: New test.
* gcc.dg/analyzer/uninit-6.c: New test.
* gcc.dg/analyzer/uninit-6b.c: New test.
r12-3005-g220c410162ebece4f missed a cast for the set_32 call.
Fixed thus.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
libiberty/ChangeLog:
* simple-object-mach-o.c (simple_object_mach_o_write_segment):
Cast the first argument to set_32 as needed.
In PR101296 Richard noticed that modref is giving up on analysis in milc by
hitting the --param=modref-max-accesses limit.  While cleaning up the
original modref patch I removed code that tried to do smart things while
merging accesses, because it had bugs; I wanted to reimplement it later
but then forgot.
This patch adds logic that avoids adding an access and its subaccess to
the list, which is just a waste of memory and compile time.  Incrementally
I will add logic merging the ranges.
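A sketch of the redundancy being avoided (hypothetical example, not
the new modref-7.c test):

/* Both statements access *p; the second access (bytes [0, 4) for the
   int member) is contained in the first (bytes [0, 8) for the whole
   struct), so recording both in the modref summary is redundant.  */
struct S { int a; int b; };
void f (struct S *p, struct S v)
{
  *p = v;    /* access covering bytes [0, 8) of *p */
  p->a = 3;  /* subaccess covering bytes [0, 4) of *p */
}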
gcc/ChangeLog:
2021-08-23 Jan Hubicka <hubicka@ucw.cz>
* ipa-modref-tree.h (modref_access_node::range_info_useful_p):
Improve range compare.
(modref_access_node::contains): New member function.
(modref_access_node::search): Remove.
(modref_access_node::insert): Be smarter about subaccesses.
gcc/testsuite/ChangeLog:
2021-08-23 Jan Hubicka <hubicka@ucw.cz>
* gcc.dg/tree-ssa/modref-7.c: New test.
Intel MIC (emulated) offloading execution failure remains to be analyzed.
libgomp/
* testsuite/libgomp.c/address-space-1.c: New file.
Co-authored-by: Jakub Jelinek <jakub@redhat.com>
..., and use that to improve XFAILing for Intel MIC offloading execution
instead of compilation in 'libgomp.c-c++-common/target-45.c',
'libgomp.fortran/target10.f90'.
gcc/
* config/i386/i386-options.c (ix86_omp_device_kind_arch_isa)
<omp_device_arch> [ACCEL_COMPILER]: Match "intel_mic".
* config/i386/t-omp-device (omp-device-properties-i386) <arch>:
Add "intel_mic".
libgomp/
* testsuite/lib/libgomp.exp
(check_effective_target_offload_target_intelmic): Remove 'proc'.
(check_effective_target_offload_device_intel_mic): New 'proc'.
* testsuite/libgomp.c-c++-common/on_device_arch.h
(device_arch_intel_mic, on_device_arch_intel_mic): New.
* testsuite/libgomp.c-c++-common/target-45.c: Use that for
'dg-xfail-run-if'.
* testsuite/libgomp.fortran/target10.f90: Likewise.
The standard shows this default template argument in the <ranges>
synopsis, but it was missing in libstdc++.
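A small usage sketch (assuming the defaulted argument is the traits
parameter, per the <ranges> synopsis; compile with -std=c++20):

#include <ranges>
#include <sstream>

int main()
{
  std::istringstream in("1 2 3");
  // The Traits parameter now defaults to std::char_traits<char>,
  // so it can be omitted here.
  std::ranges::basic_istream_view<int, char> v(in);
  for (int i : v)
    (void) i;
}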
libstdc++-v3/ChangeLog:
* include/std/ranges (basic_istream_view): Add default template
argument.
* testsuite/std/ranges/istream_view.cc: Check it.
My sincere apologies to everyone, but especially Andrew Pinski
who warned me in advance that TRULY_NOOP_TRUNCATION results in
different code paths/optimizations on some targets. This restores
the build on nvptx-none (and presumably others) where mysteriously
(truncate:QI (reg:QI)) fails to be simplified to (reg:QI), which
is expected (everywhere) in my recently added self-tests.
2021-08-23 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* simplify-rtx.c (simplify_unary_operation_1): [TRUNCATE]:
Handle case where the operand is already the desired mode.
Looks like the existing check using has_gimple_body_p isn't enough
at LTRANS time; I need to check in_other_partition as well.
2021-08-23 Richard Biener <rguenther@suse.de>
PR ipa/97565
* tree-ssa-structalias.c (ipa_pta_execute): Check in_other_partition
in addition to has_gimple_body.
* g++.dg/lto/pr97565_0.C: New testcase.
* g++.dg/lto/pr97565_1.C: Likewise.
The null pointer check is never needed for correct code, only to
gracefully handle undefined cases. Add __builtin_expect to be sure that
we don't pessimize the valid uses.
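The pattern, roughly (a sketch, not the actual libsupc++ source):

// The null check only guards undefined calls; __builtin_expect tells
// the compiler the valid (non-null) case is the common one, so the
// well-defined fast path isn't pessimized.
void *checked (void *p)
{
  if (__builtin_expect (p == nullptr, false))
    return nullptr;  // handle the undefined case gracefully
  return p;          // fast, expected path
}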
libstdc++-v3/ChangeLog:
* libsupc++/dyncast.cc (__dynamic_cast): Add __builtin_expect to
precondition check.
This function should be inline, so that it's not emitted in tests that
don't use it, to avoid undefined references to geteuid().
libstdc++-v3/ChangeLog:
PR libstdc++/90787
* testsuite/util/testsuite_fs.h (permissions_are_testable):
Define as inline.
This patch adds support for the 'strict' modifier on grainsize/num_tasks
clauses, an OpenMP 5.1 feature supported in C/C++ since commit
r12-3066-g3bc75533d1f87f0617be6c1af98804f9127ec637
gcc/fortran/ChangeLog:
* dump-parse-tree.c (show_omp_clauses): Handle 'strict' modifier
on grainsize/num_tasks.
* gfortran.h (gfc_omp_clauses): Add grainsize_strict
and num_tasks_strict.
* trans-openmp.c (gfc_trans_omp_clauses, gfc_split_omp_clauses):
Handle 'strict' modifier on grainsize/num_tasks.
* openmp.c (gfc_match_omp_clauses): Likewise.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/taskloop-4-a.f90: New test.
* testsuite/libgomp.fortran/taskloop-4.f90: New test.
* testsuite/libgomp.fortran/taskloop-5-a.f90: New test.
* testsuite/libgomp.fortran/taskloop-5.f90: New test.
This fixes double-scaling of the inner loop scalar cost caused
by routing the scalar costs through the add_stmt_cost hook and
using vect_body as the location. The issue makes almost every
outer loop vectorization profitable.
2021-08-23 Richard Biener <rguenther@suse.de>
* tree-vect-loop.c (vect_compute_single_scalar_iteration_cost):
Properly scale the inner loop cost only once.
This patch implements support for TRUNC_MOD_EXPR and TRUNC_DIV_EXPR
in tree-ssa's bit CCP pass. This is mostly for completeness, as the
VRP pass already provides better bounds for these operations, but
seeing mask values of all_ones in my debugging/instrumentation logs
seemed overly pessimistic.  With this patch, the expression X%10
has nonzero bits of 0x0f (for unsigned X); likewise (X&1)/3 has
a known value of zero, and (X&3)/3 has a nonzero bits mask of 0x1.
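A sketch of the underlying reasoning for the modulus case (not GCC's
actual bit_value_binop code):

#include <cstdint>

/* For unsigned x % m with m > 0 the result is at most m - 1, so an
   upper bound on its nonzero bits is the smallest all-ones mask
   covering m - 1.  */
static uint64_t mod_nonzero_bits (uint64_t m)
{
  uint64_t mask = m - 1;
  mask |= mask >> 1;  mask |= mask >> 2;  mask |= mask >> 4;
  mask |= mask >> 8;  mask |= mask >> 16; mask |= mask >> 32;
  return mask;  /* mod_nonzero_bits (10) == 0xf, as for X%10 above */
}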
2021-08-23 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* tree-ssa-ccp.c (bit_value_binop) [TRUNC_MOD_EXPR, TRUNC_DIV_EXPR]:
Provide bounds for unsigned (and signed with non-negative operands)
division and modulus.
Whilst working on a backend patch, I noticed that the middle-end's
RTL optimizers weren't simplifying a truncation of a paradoxical
subreg extension, though it does transform closely related (more
complex) expressions. The main (first) part of this patch
implements this simplification, reusing much of the logic already
in place.
I briefly considered suggesting that it's difficult to provide a new
testcase for this change, but then realized the reviewer's response
would be that this type of transformation should be self-tested
in simplify-rtx, so this patch adds a bunch of tests that integer
extensions and truncations are simplified as expected. No good
deed goes unpunished and I was equally surprised to see that we
don't currently simplify/check/defend (zero_extend:SI (reg:SI)),
i.e. useless no-op extensions to the same mode.  So I've added
some logic to simplify those (or, more accurately, to prevent us
from generating dubious RTL for them).
2021-08-23 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* simplify-rtx.c (simplify_truncation): Generalize simplification
of (truncate:A (subreg:B X)).
(simplify_unary_operation_1) [FLOAT_TRUNCATE, FLOAT_EXTEND,
SIGN_EXTEND, ZERO_EXTEND]: Handle cases where the operand
already has the desired machine mode.
(test_scalar_int_ops): Add tests that useless extensions and
truncations are optimized away.
(test_scalar_int_ext_ops): New self-test function to confirm
that truncations of extensions are correctly simplified.
(test_scalar_int_ext_ops2): New self-test function to check
truncations of truncations, extensions of extensions, and
truncations of extensions.
(test_scalar_ops): Call the above two functions with a
representative sampling of integer machine modes.
This short patch teaches fold that it is "safe" to change the sign
of a left shift, to reduce the number of type conversions in gimple.
As an example:
unsigned int foo(unsigned int i) {
return (int)i << 8;
}
is currently optimized to:
unsigned int foo (unsigned int i)
{
int i.0_1;
int _2;
unsigned int _4;
<bb 2> [local count: 1073741824]:
i.0_1 = (int) i_3(D);
_2 = i.0_1 << 8;
_4 = (unsigned int) _2;
return _4;
}
with this patch, this now becomes:
unsigned int foo (unsigned int i)
{
unsigned int _2;
<bb 2> [local count: 1073741824]:
_2 = i_1(D) << 8;
return _2;
}
which generates exactly the same assembly language. Aside from the
reduced memory usage, the real benefit is that no-op conversions tend
to interfere with many folding optimizations. For example,
unsigned int bar(unsigned char i) {
return (i ^ (i<<16)) | (i<<8);
}
currently gets (tangled in conversions and) optimized to:
unsigned int bar (unsigned char i)
{
unsigned int _1;
unsigned int _2;
int _3;
int _4;
unsigned int _6;
unsigned int _8;
<bb 2> [local count: 1073741824]:
_1 = (unsigned int) i_5(D);
_2 = _1 * 65537;
_3 = (int) i_5(D);
_4 = _3 << 8;
_8 = (unsigned int) _4;
_6 = _2 | _8;
return _6;
}
but with this patch, bar now optimizes down to:
unsigned int bar(unsigned char i)
{
unsigned int _1;
unsigned int _4;
<bb 2> [local count: 1073741824]:
_1 = (unsigned int) i_3(D);
_4 = _1 * 65793;
return _4;
}
2021-08-23 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* match.pd (shift transformations): Change the sign of an
LSHIFT_EXPR if it reduces the number of explicit conversions.
gcc/testsuite/ChangeLog
* gcc.dg/fold-convlshift-1.c: New test case.
* gcc.dg/fold-convlshift-2.c: New test case.
The following patch recognizes in the bswap pass (only there for now,
haven't done it for store merging pass yet) code sequences that can
be handled by (int32) __builtin_bswap64 (arg), i.e. where we have
0x05060708 n->n with a 64-bit non-memory argument (if it is memory, we
can just load the 32-bit value 4 bytes into the address and n->n would
be 0x01020304; and only 64 -> 32 bit, because 64 -> 16 bit or 32 -> 16 bit
would mean only two bytes in the result and probably not worth it),
and furthermore the case where we have in the 0x0102030405060708 etc.
numbers some bytes 0 (i.e. known to contain zeros rather than source bytes),
as long as we have at least two original bytes in the right
positions (and no unknown bytes). This can be handled by
__builtin_bswap64 (arg) & 0xff0000ffffff00ffULL etc.
The latter change is the reason why counting the bswap messages doesn't work
too well in optimize-bswap* tests anymore: while the pass iterates from the
end of a basic block towards its start, it will often match both the bswap
at the end and some of the earlier bswaps with some masks (not a problem
generally, we'll just DCE them away whenever possible).  The pass right now
doesn't
handle __builtin_bswap* calls in the pattern matching (which is the reason
why it operates backwards), but it uses FOR_EACH_BB_FN (bb, fun) order
of handling blocks and matched sequences can span multiple blocks, so I was
worried about cases like:
void bar (unsigned long long);
unsigned long long
foo (unsigned long long value, int x)
{
unsigned long long tmp = (((value & 0x00000000000000ffull) << 56)
| ((value & 0x000000000000ff00ull) << 40)
| ((value & 0x00000000ff000000ull) << 8));
if (x)
bar (tmp);
return (tmp
| ((value & 0x000000ff00000000ull) >> 8)
| ((value & 0x0000ff0000000000ull) >> 24)
| ((value & 0x0000000000ff0000ull) << 24)
| ((value & 0x00ff000000000000ull) >> 40)
| ((value & 0xff00000000000000ull) >> 56));
}
but it seems we handle even that fine: bb2, ending in the GIMPLE_COND,
is processed first, and we recognize there a __builtin_bswap64 (value) & mask1;
in the last bb we recognize tmp | (__builtin_bswap64 (value) & mask2), and
PRE optimizes that into t = __builtin_bswap64 (value); tmp = t & mask1;
in the first bb and return t; in the last one.
2021-08-23 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/86723
* gimple-ssa-store-merging.c (find_bswap_or_nop_finalize): Add
cast64_to_32 argument, set *cast64_to_32 to false, unless n is
non-memory permutation of 64-bit src which only has bytes of
0 or [5..8] and n->range is 4.
(find_bswap_or_nop): Add cast64_to_32 and mask arguments, adjust
find_bswap_or_nop_finalize caller, support bswap with some bytes
zeroed, as long as at least two bytes are not zeroed.
(bswap_replace): Add mask argument and handle masking of bswap
result.
(maybe_optimize_vector_constructor): Adjust find_bswap_or_nop
caller, punt if cast64_to_32 or mask is not all ones.
(pass_optimize_bswap::execute): Adjust find_bswap_or_nop_finalize
caller, for now punt if cast64_to_32.
* gcc.dg/pr86723.c: New test.
* gcc.target/i386/pr86723.c: New test.
* gcc.dg/optimize-bswapdi-1.c: Use -fdump-tree-optimized instead of
-fdump-tree-bswap and scan for number of __builtin_bswap64 calls.
* gcc.dg/optimize-bswapdi-2.c: Likewise.
* gcc.dg/optimize-bswapsi-1.c: Use -fdump-tree-optimized instead of
-fdump-tree-bswap and scan for number of __builtin_bswap32 calls.
* gcc.dg/optimize-bswapsi-5.c: Likewise.
* gcc.dg/optimize-bswapsi-3.c: Likewise. Expect one __builtin_bswap32
call instead of zero.
This replicates tree-eh.c's in_array_bound_p into VN's
vn_reference_may_trap to fix the hoisting of a possibly trapping
ARRAY_REF across a call that might not return.
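A hedged sketch of the hazard (hypothetical reduction, not the actual
pr79334 testcases):

/* a[i] may trap when i is outside a's domain; hoisting the load out of
   the loop, above the call, is wrong if bar() never returns, because
   the trap would then happen where it previously could not.  */
extern void bar (void);
int a[4];

int f (int i, int n)
{
  int sum = 0;
  for (int j = 0; j < n; j++)
    {
      bar ();       /* might not return */
      sum += a[i];  /* possibly trapping ARRAY_REF */
    }
  return sum;
}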
2021-08-23 Richard Biener <rguenther@suse.de>
PR tree-optimization/79334
* tree-ssa-sccvn.c (copy_reference_ops_from_ref): Record
a type also for COMPONENT_REFs.
(vn_reference_may_trap): Check ARRAY_REF with constant index
against the array domain.
* gcc.dg/torture/pr79334-0.c: New testcase.
* gcc.dg/torture/pr79334-1.c: Likewise.