OpenE2K/gcc - gcc - Expired Mentality Git

Commit Graph

Author	SHA1	Message	Date
Harald Anlauf	9164caf25c	PR fortran/96711 - ICE with NINT() for integer(16) result When rounding a real to the nearest integer, temporarily convert the real argument to a longer real kind when the result is of type/kind integer(16). gcc/fortran/ChangeLog: * trans-intrinsic.c (build_round_expr): Use temporary with appropriate kind for conversion before rounding to nearest integer when the result precision is 128 bits. gcc/testsuite/ChangeLog: * gfortran.dg/pr96711.f90: New test.	2020-09-07 21:42:30 +02:00
Richard Sandiford	6001db79c4	lra: Avoid cycling on certain subreg reloads [PR96796] This PR is about LRA cycling for a reload of the form: ---------------------------------------------------------------------------- Changing pseudo 196 in operand 1 of insn 103 on equiv [r105:DI0x8+r140:DI] Creating newreg=287, assigning class ALL_REGS to slow/invalid mem r287 Creating newreg=288, assigning class ALL_REGS to slow/invalid mem r288 103: r203:SI=r288:SI<<0x1+r196:DI#0 REG_DEAD r196:DI Inserting slow/invalid mem reload before: 316: r287:DI=[r105:DI0x8+r140:DI] 317: r288:SI=r287:DI#0 ---------------------------------------------------------------------------- The problem is with r287. We rightly give it a broad starting class of POINTER_AND_FP_REGS (reduced from ALL_REGS by preferred_reload_class). However, we never make forward progress towards narrowing it down to a specific choice of class (POINTER_REGS or FP_REGS). I think in practice we rely on two things to narrow a reload pseudo's class down to a specific choice: (1) a restricted class is specified when the pseudo is created This happens for input address reloads, where the class is taken from the target's chosen base register class. It also happens for simple REG reloads, where the class is taken from the chosen alternative's constraints. (2) uses of the reload pseudo as a direct input operand In this case get_reload_reg tries to reuse the existing register and narrow its class, instead of creating a new reload pseudo. However, neither occurs here. As described above, r287 rightly starts out with a wide choice of class, ultimately derived from ALL_REGS, so we don't get (1). 
And as the comments in the PR explain, r287 is never used as an input reload, only the subreg is, so we don't get (2): ---------------------------------------------------------------------------- Choosing alt 13 in insn 317: (0) r (1) w {movsi_aarch64} Creating newreg=291, assigning class FP_REGS to r291 317: r288:SI=r291:SI Inserting insn reload before: 320: r291:SI=r287:DI#0 ---------------------------------------------------------------------------- IMO, in this case we should rely on the reload of insn 316 to narrow down the class of r287. Currently we do: ---------------------------------------------------------------------------- Choosing alt 7 in insn 316: (0) r (1) m {movdi_aarch64} Creating newreg=289 from oldreg=287, assigning class GENERAL_REGS to r289 316: r289:DI=[r105:DI0x8+r140:DI] Inserting insn reload after: 318: r287:DI=r289:DI --------------------------------------------------- i.e. we create a new pseudo register r289 and give *that* pseudo GENERAL_REGS instead. This is because get_reload_reg only narrows down the existing class for OP_IN and OP_INOUT, not OP_OUT. But if we have a reload pseudo in a reload instruction and have chosen a specific class for the reload pseudo, I think we should simply install it for OP_OUT reloads too, if the class is a subset of the existing class. We will need to pick such a register whatever happens (for r289 in the example above). And as explained in the PR, doing this actually avoids an unnecessary move via the FP registers too. The patch is quite aggressive in that it does this for all reload pseudos in all reload instructions. I wondered about reusing the condition for a reload move in in_class_p: INSN_UID (curr_insn) >= new_insn_uid_start && curr_insn_set != NULL && ((OBJECT_P (SET_SRC (curr_insn_set)) && ! CONSTANT_P (SET_SRC (curr_insn_set))) || (GET_CODE (SET_SRC (curr_insn_set)) == SUBREG && OBJECT_P (SUBREG_REG (SET_SRC (curr_insn_set))) && !
CONSTANT_P (SUBREG_REG (SET_SRC (curr_insn_set))))))) but I can't really justify that on first principles. I think we should apply the rule consistently until we have a specific reason for doing otherwise. gcc/ PR rtl-optimization/96796 * lra-constraints.c (in_class_p): Add a default-false allow_all_reload_class_changes_p parameter. Do not treat reload moves specially when the parameter is true. (get_reload_reg): Try to narrow the class of an existing OP_OUT reload if we're reloading a reload pseudo in a reload instruction. gcc/testsuite/ PR rtl-optimization/96796 * gcc.c-torture/compile/pr96796.c: New test.	2020-09-07 20:15:36 +01:00
Jonathan Wakely	ec5096f48b	libstdc++: Simplify chrono::duration::_S_gcd We can simplify this constexpr function further because we know that period::num >= 1 and period::den >= 1 so only the remainder can ever be zero. libstdc++-v3/ChangeLog: * include/std/chrono (duration::_S_gcd): Use invariant that neither value is zero initially.	2020-09-07 20:09:17 +01:00
Jonathan Wakely	00ffe73007	libstdc++: Simplify constraints for semiregular-box [LWG 3477] libstdc++-v3/ChangeLog: * include/std/ranges (__box): Simplify constraints as per LWG 3477.	2020-09-07 20:09:17 +01:00
Andrea Corallo	e147bb0faa	vec: Revert "dead code removal in tree-vect-loop.c" and add a comment. gcc/ChangeLog 2020-09-07 Andrea Corallo <andrea.corallo@arm.com> * tree-vect-loop.c (vect_estimate_min_profitable_iters): Revert dead-code removal introduced by `09fa6acd8d` + add a comment to clarify.	2020-09-07 19:49:25 +02:00
Jozef Lawrynowicz	016b190036	doc: Update documentation on MODE_PARTIAL_INT subregs In `d8487c949a`, MODE_PARTIAL_INT modes were changed from having an unknown number of undefined bits, to having a known number of undefined bits, however the documentation on using SUBREG expressions with MODE_PARTIAL_INT modes was not updated to reflect this. gcc/ChangeLog: * doc/rtl.texi (subreg): Fix documentation to state there is a known number of undefined bits in regs and subregs of MODE_PARTIAL_INT modes.	2020-09-07 17:54:23 +01:00
Jozef Lawrynowicz	7f87e44669	MSP430: Don't override default ISA when MCU name is unrecognized 430X is the default ISA under normal operation, so even when the MCU name passed to -mmcu= is unrecognized, it should not be overridden. gcc/ChangeLog: * config/msp430/msp430.c (msp430_option_override): Don't set the ISA to 430 when the MCU is unrecognized. gcc/testsuite/ChangeLog: * gcc.target/msp430/430x-default-isa.c: New test.	2020-09-07 17:35:04 +01:00
Iain Sandoe	84e9fc470f	Darwin, testsuite : Update pubtypes tests. Recent changes in debug output have resulted in a change in the length of the pub types info. This updates the tests to reflect the new length. gcc/testsuite/ChangeLog: * gcc.dg/pubtypes-2.c: Amend Pub Info Length. * gcc.dg/pubtypes-3.c: Likewise. * gcc.dg/pubtypes-4.c: Likewise.	2020-09-07 17:08:47 +01:00
Iain Sandoe	2e746cebd9	Darwin : Update libc function availability. Darwin libc has sincos from 10.9 (darwin13) onwards. gcc/ChangeLog: * config/darwin.c (darwin_libc_has_function): Report sincos available from 10.9.	2020-09-07 17:06:52 +01:00
Alex Coplan	2f8ae301f6	aarch64: Remove redundant mult patterns Following on from the previous commit to fix up the syntax for add/sub/adds/subs and friends with a sign/zero-extended operand, this patch removes the "mult" variants of these patterns which are all redundant. This patch removes the following patterns from the AArch64 backend: adds_mul_imm_<mode> subs_mul_imm_<mode> adds_<optab><mode>_multp2 subs_<optab><mode>_multp2 add_mul_imm_<mode> add_<optab><ALLX:mode>_mult_<GPI:mode> add_<optab><SHORT:mode>_mult_si_uxtw add_<optab><mode>_multp2 add_<optab>si_multp2_uxtw add_uxt<mode>_multp2 add_uxtsi_multp2_uxtw sub_mul_imm_<mode> sub_mul_imm_si_uxtw sub_<optab><mode>_multp2 sub_<optab>si_multp2_uxtw sub_uxt<mode>_multp2 sub_uxtsi_multp2_uxtw neg_mul_imm_<mode>2 neg_mul_imm_si2_uxtw Together with the following predicates which were used only by these patterns: aarch64_pwr_imm3 aarch64_pwr_2_si aarch64_pwr_2_di These patterns are all redundant since multiplications by powers of two should be represented as shifts outside a (mem). --- gcc/ChangeLog: * config/aarch64/aarch64.md (adds_mul_imm_<mode>): Delete. (subs_mul_imm_<mode>): Delete. (adds_<optab><mode>_multp2): Delete. (subs_<optab><mode>_multp2): Delete. (add_mul_imm_<mode>): Delete. (add_<optab><ALLX:mode>_mult_<GPI:mode>): Delete. (add_<optab><SHORT:mode>_mult_si_uxtw): Delete. (add_<optab><mode>_multp2): Delete. (add_<optab>si_multp2_uxtw): Delete. (add_uxt<mode>_multp2): Delete. (add_uxtsi_multp2_uxtw): Delete. (sub_mul_imm_<mode>): Delete. (sub_mul_imm_si_uxtw): Delete. (sub_<optab><mode>_multp2): Delete. (sub_<optab>si_multp2_uxtw): Delete. (sub_uxt<mode>_multp2): Delete. (sub_uxtsi_multp2_uxtw): Delete. (neg_mul_imm_<mode>2): Delete. (neg_mul_imm_si2_uxtw): Delete. * config/aarch64/predicates.md (aarch64_pwr_imm3): Delete. (aarch64_pwr_2_si): Delete. (aarch64_pwr_2_di): Delete.	2020-09-07 15:24:03 +01:00
Alex Coplan	d4febc75e8	aarch64: Don't emit invalid zero/sign-extend syntax Given the following C function: double f(double p, unsigned x) { return p + x; } prior to this patch, GCC at -O2 would generate: f: add x0, x0, x1, uxtw 3 ret but this add instruction uses architecturally-invalid syntax: the width of the third operand conflicts with the width of the extension specifier. The third operand is only permitted to be an x register when the extension specifier is (u|s)xtx. This instruction, and analogous insns for adds, sub, subs, and cmp, are rejected by clang, but accepted by binutils. Assembling and disassembling such an insn with binutils gives the architecturally-valid version in the disassembly: 0: 8b214c00 add x0, x0, w1, uxtw #3 This patch fixes several patterns in the AArch64 backend to use the standard syntax as specified in the Arm ARM such that GCC's output can be assembled by assemblers other than GAS. --- gcc/ChangeLog: * config/aarch64/aarch64.md (adds_<optab><ALLX:mode>_<GPI:mode>): Ensure extended operand agrees with width of extension specifier. (subs_<optab><ALLX:mode>_<GPI:mode>): Likewise. (adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise. (subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise. (add_<optab><ALLX:mode>_<GPI:mode>): Likewise. (add_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise. (add_uxt<mode>_shift2): Likewise. (sub_<optab><ALLX:mode>_<GPI:mode>): Likewise. (sub_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise. (sub_uxt<mode>_shift2): Likewise. (cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>): Likewise. (cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise. gcc/testsuite/ChangeLog: * gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax. * gcc.target/aarch64/cmp.c: Likewise. * gcc.target/aarch64/subs3.c: Likewise. * gcc.target/aarch64/subsp.c: Likewise. * gcc.target/aarch64/extend-syntax.c: New test.	2020-09-07 15:20:21 +01:00
Richard Biener	931832a5cc	improve SLP vect dumping This adds additional dumping helping in particular basic-block vectorization SLP dump reading plus showing what we actually generate code from. 2020-09-07 Richard Biener <rguenther@suse.de> * tree-vect-slp.c (vect_analyze_slp_instance): Dump stmts we start SLP analysis from, failure and splitting. (vect_schedule_slp): Dump SLP graph entry and root stmt we are about to emit code for.	2020-09-07 14:29:39 +02:00
Martin Storsjö	3fe3efe5c1	gcc: Make strchr return value pointers const This fixes compilation of codepaths for dos-like filesystems with Clang. When built with clang, it treats C input files as C++ when the compiler driver is invoked in C++ mode, triggering errors when the return value of strchr() on a pointer to const is assigned to a pointer to non-const variable. This matches similar variables outside of the ifdefs for dos-like path handling. 2020-09-07 Martin Storsjö <martin@martin.st> gcc/ * dwarf2out.c (file_name_acquire): Make a strchr return value pointer to const. libcpp/ * files.c (remap_filename): Make a strchr return value pointer to const.	2020-09-07 13:20:21 +02:00
Tobias Burnus	2b0df0a6ac	Fortran: Fixes for pointer function call as variable (PR96896) gcc/fortran/ChangeLog: PR fortran/96896 * resolve.c (get_temp_from_expr): Also reset proc_pointer + use_assoc attribute. (resolve_ptr_fcn_assign): Use information from the LHS. gcc/testsuite/ChangeLog: PR fortran/96896 * gfortran.dg/ptr_func_assign_4.f08: Update dg-error. * gfortran.dg/ptr-func-3.f90: New test.	2020-09-07 12:30:11 +02:00
Tom de Vries	c9c87dc958	[libatomic, testsuite] Add missing include in atomic-generic.c When compiling atomic-generic.c from the libatomic testsuite, we run into: ... $ gcc src/libatomic/testsuite/libatomic.c/atomic-generic.c -latomic src/libatomic/testsuite/libatomic.c/atomic-generic.c: In function ‘main’: src/libatomic/testsuite/libatomic.c/atomic-generic.c:31:7: warning: \ implicit declaration of function ‘memcmp’ [-Wimplicit-function-declaration] if (memcmp (&a, &zero, size)) ^~~~~~ ... Fix this by adding the missing string.h include. Tested on x86_64. libatomic/ChangeLog: * testsuite/libatomic.c/atomic-generic.c: Include string.h.	2020-09-07 12:02:05 +02:00
liuhongt	703bc188f4	Adjust testcase. gcc/testsuite/ChangeLog: * gcc.dg/vect/slp-46.c: Add --param vect-epilogues-nomask=0 to avoid backend interference.	2020-09-07 16:39:25 +08:00
Jakub Jelinek	fea13fcd0d	lto: Stream edge goto_locus [PR94235] The following patch adds streaming of edge goto_locus (both LOCATION_LOCUS and LOCATION_BLOCK from it), the PR shows a testcase (inappropriate for gcc testsuite) where the lack of streaming of goto_locus results in worse debug info. Earlier version of the patch (without the output_function changes) failed miserably, because on the order mismatch - input_function would first input_cfg, then input_eh_regions and then input_bb (all of which now have locations), while output_function used output_eh_regions, then output_bb and then output_cfg. _cfg went to a separate stream... Now, is there a reason why the order is different? If the intent is that the cfg could be read separately from the rest of function or vice versa, alternatively we'd need to clear_line_info (); before output_eh_regions and before/after output_cfg to make them independent. 2020-09-07 Jakub Jelinek <jakub@redhat.com> PR debug/94235 lto-streamer-out.c (output_cfg): Also stream goto_locus for edges. Use bp_pack_var_len_unsigned instead of streamer_write_uhwi to stream e->dest->index and e->flags. (output_function): Call output_cfg before output_ssa_name, rather than after streaming all bbs. * lto-streamer-in.c (input_cfg): Stream in goto_locus for edges. Use bp_unpack_var_len_unsigned instead of streamer_read_uhwi to stream in dest_index and edge_flags.	2020-09-07 09:54:38 +02:00
Richard Biener	095d42feed	code generate live lanes in basic-block vectorization The following adds the capability to code-generate live lanes in basic-block vectorization using lane extracts from vector stmts rather than keeping the original scalar code around for those. This eventually makes previously not profitable vectorizations profitable (the live scalar code was appropriately costed so are the lane extracts now), without considering the cost model this patch doesn't add or remove any basic-block vectorization capabilities. The patch re/ab-uses STMT_VINFO_LIVE_P in basic-block vectorization mode to tell whether a live lane is vectorized or whether it is provided by means of keeping the scalar code live. The patch is a first step towards vectorizing sequences of stmts that do not end up in stores or vector constructors though. Bootstrapped and tested on x86_64-unknown-linux-gnu. 2020-09-04 Richard Biener <rguenther@suse.de> * tree-vectorizer.h (vectorizable_live_operation): Adjust. * tree-vect-loop.c (vectorizable_live_operation): Vectorize live lanes out of basic-block vectorization nodes. * tree-vect-slp.c (vect_bb_slp_mark_live_stmts): New function. (vect_slp_analyze_operations): Analyze live lanes and their vectorization possibility after the whole SLP graph is final. (vect_bb_slp_scalar_cost): Adjust for vectorized live lanes. * tree-vect-stmts.c (can_vectorize_live_stmts): Adjust. (vect_transform_stmt): Call can_vectorize_live_stmts also for basic-block vectorization. * gcc.dg/vect/bb-slp-46.c: New testcase. * gcc.dg/vect/bb-slp-47.c: Likewise. * gcc.dg/vect/bb-slp-32.c: Adjust.	2020-09-07 09:47:36 +02:00
Francois-Xavier Coudert	d30869a8d4	fortran: Fix argument types in derived types procedures gcc/fortran/ChangeLog * trans-types.c (gfc_get_derived_type): Fix argument types.	2020-09-07 09:38:25 +02:00
Francois-Xavier Coudert	a502683de1	fortran: Fix arg types of _gfortran_is_extension_of gcc/fortran/ChangeLog * resolve.c (resolve_select_type): Provide a formal arg list.	2020-09-07 09:37:01 +02:00
liuhongt	995bb851ff	Adjust testcase. gcc/testsuite/ChangeLog: * gcc.target/i386/pr92658-avx512bw-trunc.c: Add -mprefer-vector-width=512 to avoid impact of different default tune which gcc is built with.	2020-09-07 15:26:18 +08:00
GCC Administrator	0fd39e420e	Daily bump.	2020-09-07 00:16:22 +00:00
Francois-Xavier Coudert	23f8b90c40	fortran: Add comment about previous commit gcc/fortran/ChangeLog * trans-types.c (gfc_get_ppc_type): Add comment.	2020-09-06 18:37:05 +02:00
Francois-Xavier Coudert	7c72651a93	fortran: Fix function arg types for class objects gcc/fortran/ChangeLog * trans-types.c (gfc_get_ppc_type): Fix function arg types.	2020-09-06 18:33:43 +02:00
Francois-Xavier Coudert	3489d80fee	fortran: caf_fail_image expects no argument gcc/fortran/ChangeLog PR fortran/96947 * trans-stmt.c (gfc_trans_fail_image): caf_fail_image expects no argument. gcc/testsuite/ChangeLog * gfortran.dg/coarray_fail_st.f90: Adjust test.	2020-09-06 18:29:09 +02:00
GCC Administrator	0dc8050556	Daily bump.	2020-09-06 00:16:20 +00:00
GCC Administrator	bec05c98b9	Daily bump.	2020-09-05 00:16:20 +00:00
Iain Buclaw	f8eabd47ac	d: Fix ICE in create_tmp_var, at gimple-expr.c:482 Array concatenate expressions were creating more SAVE_EXPRs than what was necessary. The internal error itself was the result of a forced temporary being made on a TREE_ADDRESSABLE type. gcc/d/ChangeLog: PR d/96924 * expr.cc (ExprVisitor::visit (CatAssignExp )): Don't force temporaries needlessly. gcc/testsuite/ChangeLog: PR d/96924 gdc.dg/simd13927b.d: Removed. * gdc.dg/pr96924.d: New test.	2020-09-04 23:01:46 +02:00
Jason Merrill	f923c40f9b	c++: Use iloc_sentinel in mark_use. gcc/cp/ChangeLog: * expr.c (mark_use): Use iloc_sentinel.	2020-09-04 13:56:32 -04:00
Richard Biener	46a58c779a	tree-optimization/96920 - another ICE when vectorizing nested cycles This refines the previous fix for PR96698 by re-doing how and where we arrange for setting vectorized cycle PHI backedge values. 2020-09-04 Richard Biener <rguenther@suse.de> PR tree-optimization/96698 PR tree-optimization/96920 * tree-vectorizer.h (loop_vec_info::reduc_latch_defs): Remove. (loop_vec_info::reduc_latch_slp_defs): Likewise. * tree-vect-stmts.c (vect_transform_stmt): Remove vectorized cycle PHI latch code. * tree-vect-loop.c (maybe_set_vectorized_backedge_value): New helper to set vectorized cycle PHI latch values. (vect_transform_loop): Walk over all PHIs again after vectorizing them, calling maybe_set_vectorized_backedge_value. Call maybe_set_vectorized_backedge_value for each vectorized stmt. Remove delayed update code. * tree-vect-slp.c (vect_analyze_slp_instance): Initialize SLP instance reduc_phis member. (vect_schedule_slp): Set vectorized cycle PHI latch values. * gfortran.dg/vect/pr96920.f90: New testcase. * gcc.dg/vect/pr96920.c: Likewise.	2020-09-04 15:42:43 +02:00
Andrea Corallo	09fa6acd8d	vec: dead code removal in tree-vect-loop.c gcc/ChangeLog 2020-09-04 Andrea Corallo <andrea.corallo@arm.com> * tree-vect-loop.c (vect_estimate_min_profitable_iters): Remove dead code as LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo) is always verified.	2020-09-04 14:16:52 +02:00
Christophe Lyon	2033a63cbd	arm: Improve immediate generation for thumb-1 with -mpurecode [PR96769] This patch moves the move-immediate splitter after the regular ones so that it has lower precedence, and updates its constraints. For int f3 (void) { return 0x11000000; } int f3_2 (void) { return 0x12345678; } we now generate: * with -O2 -mcpu=cortex-m0 -mpure-code: f3: movs r0, #136 lsls r0, r0, #21 bx lr f3_2: movs r0, #18 lsls r0, r0, #8 adds r0, r0, #52 lsls r0, r0, #8 adds r0, r0, #86 lsls r0, r0, #8 adds r0, r0, #121 bx lr * with -O2 -mcpu=cortex-m23 -mpure-code: f3: movs r0, #136 lsls r0, r0, #21 bx lr f3_2: movw r0, #22136 movt r0, 4660 bx lr 2020-09-04 Christophe Lyon <christophe.lyon@linaro.org> PR target/96769 gcc/ * config/arm/thumb1.md: Move movsi splitter for arm_disable_literal_pool after the other movsi splitters. gcc/testsuite/ * gcc.target/arm/pure-code/pr96769.c: New test.	2020-09-04 11:48:36 +00:00
Aldy Hernandez	c5a6c2237a	rename widest_irange to int_range_max. gcc/ChangeLog: * range-op.cc (range_operator::fold_range): Rename widest_irange to int_range_max. (operator_div::wi_fold): Same. (operator_lshift::op1_range): Same. (operator_rshift::op1_range): Same. (operator_cast::fold_range): Same. (operator_cast::op1_range): Same. (operator_bitwise_and::remove_impossible_ranges): Same. (operator_bitwise_and::op1_range): Same. (operator_abs::op1_range): Same. (range_cast): Same. (widest_irange_tests): Same. (range3_tests): Rename irange3 to int_range3. (int_range_max_tests): Rename from widest_irange_tests. Rename widest_irange to int_range_max. (operator_tests): Rename widest_irange to int_range_max. (range_tests): Same. * tree-vrp.c (find_case_label_range): Same. * value-range.cc (irange::irange_intersect): Same. (irange::invert): Same. * value-range.h: Same.	2020-09-04 12:26:14 +02:00
Richard Biener	fab7764484	tree-optimization/96931 - clear ctrl-altering flag more aggressively The testcase shows that we fail to clear gimple_call_ctrl_altering_p when the last abnormal edge goes away, causing an edge insert to a loop header edge when we have preheaders to split the edge unnecessarily. The following addresses this by more aggressively clearing the flag in cleanup_call_ctrl_altering_flag. 2020-09-04 Richard Biener <rguenther@suse.de> PR tree-optimization/96931 * tree-cfgcleanup.c (cleanup_call_ctrl_altering_flag): If there's a fallthru edge and no abnormal edge the call is no longer control-altering. (cleanup_control_flow_bb): Pass down the BB to cleanup_call_ctrl_altering_flag. * gcc.dg/pr96931.c: New testcase.	2020-09-04 12:22:29 +02:00
Jakub Jelinek	b898878032	lto: Remove stream_input_location_now As discussed yesterday, stream_input_location_now has been used in 3 remaining places. For ERT_MUST_NOT_THROW, I believe the failure_loc location is stable at least until the apply_cache after the bbs are all read, and the locations do not include BLOCK, so we can use normal stream_input_location, and the two input_struct_function_base also shouldn't include BLOCK and are stable at least until that same apply_cache after reading all bbs, so again we can use the location cache. 2020-09-04 Jakub Jelinek <jakub@redhat.com> * lto-streamer.h (stream_input_location_now): Remove declaration. * lto-streamer-in.c (stream_input_location_now): Remove. (input_eh_region, input_struct_function_base): Use stream_input_location instead of stream_input_location_now.	2020-09-04 11:55:13 +02:00
Jakub Jelinek	70d8d9bd93	lto: Ensure we force a change for file/line/column after clear_line_info As discussed yesterday: On the streamer out side, we call clear_line_info in multiple spots which resets the current_* values to something, but on the reader side, we don't have corresponding resets in the same location, just have the stream_* static variables that keep the current values through the entire stream in (so across all the clear_line_info spots in a single LTO object but also across jumping from one LTO object to another one). Now, in an earlier version of my patch it actually broke LTO bootstrap (and a lot of LTO testcases), so for the BLOCK case I've solved it by clear_line_info setting current_block to something that should never appear, which means that in the LTO stream after the clear_line_info spots including the start of the LTO stream we force the block change bit to be set and thus BLOCK to be streamed and therefore stream_block from earlier to be ignored. But for the rest I think that is not the case, so I wonder if we don't sometimes end up with wrong line/column info because of that, or please tell me what prevents that. clear_line_info does: ob->current_file = NULL; ob->current_line = 0; ob->current_col = 0; ob->current_sysp = false; while I think NULL current_file is something that should likely be different from expanded_location (...).file (UNKNOWN_LOCATION/BUILTINS_LOCATION are handled separately and not go through the caching), I think line number 0 can sometimes occur and especially column 0 occurs frequently if we ran out of location_t with columns info. But then we do: bp_pack_value (bp, ob->current_file != xloc.file, 1); bp_pack_value (bp, ob->current_line != xloc.line, 1); bp_pack_value (bp, ob->current_col != xloc.column, 1); and stream the details only if the != is true. If that happens immediately after clear_line_info and e.g. 
xloc.column is 0, we would stream 0 bit and not stream the actual value, so on read-in it would reuse whatever stream_col etc. were before. Shouldn't we set some ob->current_* new bit that would signal we are immediately past clear_line_info which would force all these != checks to non-zero? Either by oring something into those tests, or perhaps: if (ob->current_reset) { if (xloc.file == NULL) ob->current_file = ""; if (xloc.line == 0) ob->current_line = 1; if (xloc.column == 0) ob->current_column = 1; ob->current_reset = false; } before doing those bp_pack_value calls with a comment, effectively forcing all 6 != comparisons to be true? 2020-09-04 Jakub Jelinek <jakub@redhat.com> * lto-streamer.h (struct output_block): Add reset_locus member. * lto-streamer-out.c (clear_line_info): Set reset_locus to true. (lto_output_location_1): If reset_locus, clear it and ensure current_{file,line,col} is different from xloc members.	2020-09-04 11:53:28 +02:00
David Faust	c3a0f53739	bpf: generate indirect calls for xBPF This patch updates the BPF back end to generate indirect calls via the 'call %reg' instruction when targeting xBPF. Additionally, the BPF ASM_SPEC is updated to pass along -mxbpf to gas, where it is now supported. 2020-09-03 David Faust <david.faust@oracle.com> gcc/ * config/bpf/bpf.h (ASM_SPEC): Pass -mxbpf to gas, if specified. * config/bpf/bpf.c (bpf_output_call): Support indirect calls in xBPF. gcc/testsuite/ * gcc.target/bpf/xbpf-indirect-call-1.c: New test.	2020-09-04 10:18:56 +02:00
Kewen Lin	e1336703f8	test/rs6000: Replace test targets p8 and p9+ This patch is to clean existing rs6000 test targets p8 and p9+ with existing has_arch_pwr8 and has_arch_pwr9 targets combination or only one of them. gcc/testsuite/ChangeLog: * gcc.target/powerpc/pr92398.p9+.c: Replace p9+ with has_arch_pwr9. * gcc.target/powerpc/pr92398.p9-.c: Replace p9+ with has_arch_pwr9, and replace p8 with has_arch_pwr8 && !has_arch_pwr9. * lib/target-supports.exp (check_effective_target_p8): Remove. (check_effective_target_p9+): Remove.	2020-09-03 22:01:56 -05:00
GCC Administrator	6e82b6cfcf	Daily bump.	2020-09-04 00:16:32 +00:00
Martin Jambor	8ad3fc6ca4	sra: Avoid SRAing if there is an out-of-bounds access (PR 96820) The testcase causes an ICE in the SRA verifier on x86_64 when compiling with -m32 because build_user_friendly_ref_for_offset looks at an out-of-bounds array_ref within an array_ref which accesses an offset which does not fit into a signed 32bit integer and turns it into an array-ref with a negative index. The best thing is probably to bail out early when encountering an out of bounds access to a local stack-allocated aggregate (and let the DSE just delete such statements) which is what the patch does. I also glanced over to the initial candidate vetting routine to make sure the size would fit into HWI and noticed that it uses unsigned variants whereas the rest of SRA operates on signed offsets and sizes (because get_ref_and_extent does) and so changed that for the sake of consistency. These ancient checks operate on sizes of types as opposed to DECLs but I hope that any issues potentially arising from that are basically hypothetical. gcc/ChangeLog: 2020-08-28 Martin Jambor <mjambor@suse.cz> PR tree-optimization/96820 * tree-sra.c (create_access): Disqualify candidates with accesses beyond the end of the original aggregate. (maybe_add_sra_candidate): Check that candidate type size fits signed uhwi for the sake of consistency. gcc/testsuite/ChangeLog: 2020-08-28 Martin Jambor <mjambor@suse.cz> PR tree-optimization/96820 * gcc.dg/tree-ssa/pr96820.c: New test.	2020-09-03 22:43:49 +02:00
Will Schmidt	d8f3474ff8	[PATCH, rs6000] Fix vector long long subtype (PR96139) Hi, This corrects an issue with the powerpc vector long long subtypes. As reported by SjMunroe, when building some code with -Wall, and attempting to print an element of a "long long vector" with a long long printf format string, we will report an error because the vector sub-type was improperly defined as int. When defining a V2DI_type_node we use a TARGET_POWERPC64 ternary to define the V2DI_type_node with "vector long" or "vector long long". We also need to specify the proper sub-type when we define the type. PR target/96139 2020-09-03 Will Schmidt <will_schmidt@vnet.ibm.com> gcc/ChangeLog: * config/rs6000/rs6000-call.c (rs6000_init_builtin): Update V2DI_type_node and unsigned_V2DI_type_node definitions. gcc/testsuite/ChangeLog: * gcc.target/powerpc/pr96139-a.c: New test. * gcc.target/powerpc/pr96139-b.c: New test. * gcc.target/powerpc/pr96139-c.c: New test.	2020-09-03 15:05:59 -05:00
Jakub Jelinek	ba6730bd18	c++: Fix another PCH hash_map issue [PR96901] The recent libstdc++ changes caused lots of libstdc++-v3 tests FAILs on i686-linux, all of them in the same spot during constexpr evaluation of a recursive _S_gcd call. The problem is yet another hash_map that used the default hashing of tree keys through pointer hashing which is preserved across PCH write/read. During PCH handling, the addresses of GC objects are changed, which means that the hash values of the keys in such hash tables change without those hash tables being rehashed. Which in the fundef_copies_table case usually means we just don't find a copy of a FUNCTION_DECL body for recursive uses and start from scratch. But when the hash table keeps growing, the "dead" elements in the hash table can sometimes reappear and break things. In particular what I saw under the debugger is when the fundef_copies_table hash map has been used on the outer _S_gcd call, it didn't find an entry for it, so returned a slot with slot == NULL, which is treated as that the function itself is used directly (i.e. no recursion), but that addition of a hash table slot caused the recursive _S_gcd call to actually find something in the hash table, unfortunately not the new slot == NULL spot, but a different one from the pre-PCH streaming which contained the returned toplevel (non-recursive) call entry for it, which means that for the recursive _S_gcd call we actually used the same trees as for the outer ones rather than a copy of those, which breaks constexpr evaluation. 2020-09-03 Jakub Jelinek <jakub@redhat.com> PR c++/96901 * tree.h (struct decl_tree_traits): New type. (decl_tree_map): New typedef. * constexpr.c (fundef_copies_table): Change type from hash_map<tree, tree> * to decl_tree_map *.	2020-09-03 21:53:40 +02:00
Harald Anlauf	8eeeecbcc1	PR fortran/96890 - Wrong answer with intrinsic IALL The IALL intrinsic would always return 0 when the DIM and MASK arguments were present since the initial value of repeated BIT-AND operations was set to 0 instead of -1. libgfortran/ChangeLog: * m4/iall.m4: Initial value for result should be -1. * generated/iall_i1.c (miall_i1): Generated. * generated/iall_i16.c (miall_i16): Likewise. * generated/iall_i2.c (miall_i2): Likewise. * generated/iall_i4.c (miall_i4): Likewise. * generated/iall_i8.c (miall_i8): Likewise. gcc/testsuite/ChangeLog: * gfortran.dg/iall_masked.f90: New test.	2020-09-03 20:33:14 +02:00
Marek Polacek	753b4679bc	c++: Fix P0960 in member init list and array [PR92812] This patch nails down the remaining P0960 case in PR92812: struct A { int ar[2]; A(): ar(1, 2) {} // doesn't work without this patch }; Note that when the target object is not of array type, this already works: struct S { int x, y; }; struct A { S s; A(): s(1, 2) { } // OK in C++20 }; because build_new_method_call_1 takes care of the P0960 magic. It proved to be quite hairy. When the ()-list has more than one element, we can always create a CONSTRUCTOR, because the code was previously invalid. But when the ()-list has just one element, it gets all kinds of difficult. As usual, we have to handle a("foo") so as not to wrap the STRING_CST in a CONSTRUCTOR. Always turning x(e) into x{e} would run into trouble as in c++/93790. Another issue was what to do about x({e}): previously, this would trigger "list-initializer for non-class type must not be parenthesized". I figured I'd make this work in C++20, so that given struct S { int x, y; }; you can do S a[2]; [...] A(): a({1, 2}) // initialize a[0] with {1, 2} and a[1] with {} It also turned out that, as an extension, we support compound literals: F (): m((S[1]) { 1, 2 }) so this has to keep working as before. Moreover, make sure not to trigger in compiler-generated code, like =default, where array assignment is allowed. I've factored out a function that turns a TREE_LIST into a CONSTRUCTOR to simplify handling of P0960. paren-init35.C also tests this with vector types. gcc/cp/ChangeLog: PR c++/92812 * cp-tree.h (do_aggregate_paren_init): Declare. * decl.c (do_aggregate_paren_init): New. (grok_reference_init): Use it. (check_initializer): Likewise. * init.c (perform_member_init): Handle initializing an array from a ()-list. Use do_aggregate_paren_init. gcc/testsuite/ChangeLog: PR c++/92812 * g++.dg/cpp0x/constexpr-array23.C: Adjust dg-error. * g++.dg/cpp0x/initlist69.C: Likewise. * g++.dg/diagnostic/mem-init1.C: Likewise. 
* g++.dg/init/array28.C: Likewise. * g++.dg/cpp2a/paren-init33.C: New test. * g++.dg/cpp2a/paren-init34.C: New test. * g++.dg/cpp2a/paren-init35.C: New test. * g++.old-deja/g++.brendan/crash60.C: Adjust dg-error. * g++.old-deja/g++.law/init10.C: Likewise. * g++.old-deja/g++.other/array3.C: Likewise.	2020-09-03 14:30:06 -04:00
Jakub Jelinek	6641d6d3fe	c++: Disable -frounding-math during manifestly constant evaluation [PR96862] As discussed in the PR, fold-const.c punts on floating point constant evaluation if the result is inexact and -frounding-math is turned on. /* Don't constant fold this floating point operation if the result may dependent upon the run-time rounding mode and flag_rounding_math is set, or if GCC's software emulation is unable to accurately represent the result. */ if ((flag_rounding_math || (MODE_COMPOSITE_P (mode) && !flag_unsafe_math_optimizations)) && (inexact || !real_identical (&result, &value))) return NULL_TREE; Jonathan said that we should be evaluating them anyway, e.g. conceptually as if they are done with the default rounding mode before user had a chance to change that, and e.g. in C in initializers it is also ignored. In fact, fold-const.c for C initializers turns off various other options: /* Perform constant folding and related simplification of initializer expression EXPR. These behave identically to "fold_buildN" but ignore potential run-time traps and exceptions that fold must preserve. */ int saved_signaling_nans = flag_signaling_nans;\ int saved_trapping_math = flag_trapping_math;\ int saved_rounding_math = flag_rounding_math;\ int saved_trapv = flag_trapv;\ int saved_folding_initializer = folding_initializer;\ flag_signaling_nans = 0;\ flag_trapping_math = 0;\ flag_rounding_math = 0;\ flag_trapv = 0;\ folding_initializer = 1; flag_signaling_nans = saved_signaling_nans;\ flag_trapping_math = saved_trapping_math;\ flag_rounding_math = saved_rounding_math;\ flag_trapv = saved_trapv;\ folding_initializer = saved_folding_initializer; So, shall cxx_eval_outermost_constant_expr instead turn off all those options (then warning_sentinel wouldn't be the right thing to use, but given the 8 or how many return stmts in cxx_eval_outermost_constant_expr, we'd need a RAII class for this)? 
Not sure about the folding_initializer, that one is affecting complex multiplication and division constant evaluation somehow. 2020-09-03 Jakub Jelinek <jakub@redhat.com> PR c++/96862 * constexpr.c (cxx_eval_outermost_constant_expr): Temporarily disable flag_rounding_math during manifestly constant evaluation. * g++.dg/cpp1z/constexpr-96862.C: New test.	2020-09-03 20:11:43 +02:00
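The RAII idea raised in this message can be sketched as follows. The class `flag_override`, the variable `demo_rounding_flag`, and the functions `evaluate` and `demo_flag_override` are all hypothetical names chosen for illustration; this is not GCC's `warning_sentinel` nor the code the patch added. It shows why a guard object suits a function with many return statements: the saved value is restored on every exit path without repeating the restore code.

```cpp
#include <cassert>

// Stand-in for a global option flag (hypothetical; not the real GCC variable).
static int demo_rounding_flag = 1;

// Hypothetical RAII guard: sets an int flag to a temporary value and
// restores the previous value when the guard goes out of scope.
class flag_override
{
public:
    flag_override(int& flag, int temporary) : flag_(flag), saved_(flag)
    {
        flag_ = temporary;
    }
    ~flag_override() { flag_ = saved_; }

    flag_override(const flag_override&) = delete;
    flag_override& operator=(const flag_override&) = delete;

private:
    int& flag_;  // the flag being overridden
    int saved_;  // value to restore on destruction
};

// A function with several return paths; the guard restores the flag on each.
int evaluate(int x)
{
    flag_override guard(demo_rounding_flag, 0);
    if (x < 0)
        return -1;
    if (x == 0)
        return 0;
    return demo_rounding_flag;  // reads 0 while the guard is active
}

// Usage: the flag is 0 inside evaluate() and back to 1 afterwards.
void demo_flag_override()
{
    assert(evaluate(5) == 0);
    assert(demo_rounding_flag == 1);
}
```

Extending the guard to several flags would mean either one guard per flag or a single class that saves and restores the whole set.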
Jonathan Wakely	032a4b42cc	libstdc++: Add workaround for weird std::tuple error [PR 96592] This "fix" makes no sense, but it avoids an error from G++ about std::is_constructible being incomplete. The real problem is elsewhere, but this "fixes" the regression for now. libstdc++-v3/ChangeLog: PR libstdc++/96592 * include/std/tuple (_TupleConstraints<true, T...>): Use alternative is_constructible instead of std::is_constructible. * testsuite/20_util/tuple/cons/96592.cc: New test.	2020-09-03 16:26:16 +01:00
Jonathan Wakely	3c21913415	libstdc++: Optimise GCD algorithms The current std::gcd and std::chrono::duration::_S_gcd algorithms are both recursive. This is potentially expensive to evaluate in constant expressions, because each level of recursion makes a new copy of the function to evaluate. The maximum number of steps is bounded (proportional to the number of decimal digits in the smaller value) and so unlikely to exceed the limit for constexpr nesting, but the memory usage is still suboptimal. By using an iterative algorithm we avoid that compile-time cost. Because looping in constexpr functions is not allowed until C++14, we need to keep the recursive implementation in duration::_S_gcd for C++11 mode. For std::gcd we can also optimise runtime performance by using the binary GCD algorithm. libstdc++-v3/ChangeLog: * include/std/chrono (duration::_S_gcd): Use iterative algorithm for C++14 and later. * include/std/numeric (__detail::__gcd): Replace recursive Euclidean algorithm with iterative version of binary GCD algorithm. * testsuite/26_numerics/gcd/1.cc: Test additional inputs. * testsuite/26_numerics/gcd/gcd_neg.cc: Adjust dg-error lines. * testsuite/26_numerics/lcm/lcm_neg.cc: Likewise. * testsuite/experimental/numeric/gcd.cc: Test additional inputs. * testsuite/26_numerics/gcd/2.cc: New test.	2020-09-03 12:46:13 +01:00
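An iterative binary GCD of the kind this message refers to can be sketched as below. This is an independent illustration for unsigned 64-bit values under the name `binary_gcd` (a hypothetical name), not the libstdc++ `__detail::__gcd` code. It uses only shifts, comparisons and subtraction, and it loops instead of recursing, so constant evaluation does not need one function copy per step. The loop form requires C++14 or later.

```cpp
#include <cstdint>

// Iterative binary (Stein) GCD for unsigned values; a sketch only.
constexpr std::uint64_t binary_gcd(std::uint64_t m, std::uint64_t n)
{
    if (m == 0)
        return n;
    if (n == 0)
        return m;

    // Strip and count the factors of two common to both inputs.
    unsigned shift = 0;
    while (((m | n) & 1u) == 0) {
        m >>= 1;
        n >>= 1;
        ++shift;
    }

    // Remaining factors of two in m cannot be part of the GCD.
    while ((m & 1u) == 0)
        m >>= 1;

    // Invariant: m is odd at the top of every iteration.
    while (n != 0) {
        while ((n & 1u) == 0)
            n >>= 1;          // make n odd as well
        if (m > n) {          // keep m <= n so the subtraction cannot wrap
            std::uint64_t t = m;
            m = n;
            n = t;
        }
        n -= m;               // odd minus odd: even, possibly zero
    }

    // Restore the common factors of two.
    return m << shift;
}
```

For example, `binary_gcd(48, 18)` removes one shared factor of two, reduces the odd parts to 3, and returns 6.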
Jakub Jelinek	3536ff2de8	lto: Cache location_ts including BLOCKs in GIMPLE streaming [PR94311] As mentioned in the PR, when compiling valgrind even on fairly small testcase where in one larger function the location keeps oscillating between a small line number and 8000-ish line number in the same file we very quickly run out of all possible location_t numbers and because of that emit non-sensical line numbers in .debug_line. There are ways how to decrease speed of depleting location_t numbers in libcpp, but the main reason of this is that we use stream_input_location_now for streaming in location_t for gimple_location and phi arg locations. libcpp strongly prefers that the locations it is given are sorted by the different files and by line numbers in ascending order, otherwise it depletes quickly no matter what and is much more costly (many extra file changes etc.). The reason for not caching those were the BLOCKs that were streamed immediately after the location and encoded into the locations (and for PHIs we failed to stream the BLOCKs altogether). This patch enhances the location cache to handle also BLOCKs (but not for everything, only for the spots we care about the BLOCKs) and also optimizes the size of the LTO stream by emitting a single bit into a pack whether the BLOCK changed from last case and only streaming the BLOCK tree if it changed. 2020-09-03 Jakub Jelinek <jakub@redhat.com> PR lto/94311 * gimple.h (gimple_location_ptr, gimple_phi_arg_location_ptr): New functions. * streamer-hooks.h (struct streamer_hooks): Add output_location_and_block callback. Fix up formatting for output_location. (stream_output_location_and_block): Define. * lto-streamer.h (class lto_location_cache): Fix comment typo. Add current_block member. (lto_location_cache::input_location_and_block): New method. (lto_location_cache::lto_location_cache): Initialize current_block. (lto_location_cache::cached_location): Add block member. (struct output_block): Add current_block member. 
(lto_output_location): Formatting fix. (lto_output_location_and_block): Declare. * lto-streamer.c (lto_streamer_hooks_init): Initialize streamer_hooks.output_location_and_block. * lto-streamer-in.c (lto_location_cache::cmp_loc): Also compare block members. (lto_location_cache::apply_location_cache): Handle blocks. (lto_location_cache::accept_location_cache, lto_location_cache::revert_location_cache): Fix up function comments. (lto_location_cache::input_location_and_block): New method. (lto_location_cache::input_location): Implement using input_location_and_block. (input_function): Invoke apply_location_cache after streaming in all bbs. * lto-streamer-out.c (clear_line_info): Set current_block. (lto_output_location_1): New function, moved from lto_output_location, added block handling. (lto_output_location): Implement using lto_output_location_1. (lto_output_location_and_block): New function. * gimple-streamer-in.c (input_phi): Use input_location_and_block to input and cache both location and block. (input_gimple_stmt): Likewise. * gimple-streamer-out.c (output_phi): Use stream_output_location_and_block. (output_gimple_stmt): Likewise.	2020-09-03 12:51:01 +02:00
Richard Biener	b246f5272e	Improve constant folding of vector lowering with vector bools This improves the situation somewhat when vector lowering tries to access vector bools as seen in PR96814. 2020-09-03 Richard Biener <rguenther@suse.de> * tree-vect-generic.c (tree_vec_extract): Remove odd special-casing of boolean vectors. * fold-const.c (fold_ternary_loc): Handle boolean vector type BIT_FIELD_REFs.	2020-09-03 12:47:59 +02:00
Arnaud Charlet	3cc3a373fe	Preliminary work on support for 128bits integers * fe.h, opt.ads (Enable_128bit_Types): New. * stand.ads (Standard_Long_Long_Long_Integer, S_Long_Long_Long_Integer): New.	2020-09-03 04:34:48 -04:00


179194 Commits
