* config/rs6000/altivec.md (build_vector_mask_for_load): Use MEM_P.
* config/rs6000/constraints.md (Q constraint): Use REG_P.
* config/rs6000/darwin.h (PREFERRED_RELOAD_CLASS): Use SYMBOL_REF_P.
* config/rs6000/freebsd64.h (ASM_OUTPUT_SPECIAL_POOL_ENTRY_P): Use
SYMBOL_REF_P, CONST_INT_P and CONST_DOUBLE_P.
* config/rs6000/linux64.h (ASM_OUTPUT_SPECIAL_POOL_ENTRY_P): Likewise.
* config/rs6000/predicates.md (altivec_register_operand, vint_operand,
vsx_register_operand, vsx_reg_sfsubreg_ok, vfloat_operand,
vlogical_operand, gpc_reg_operand, int_reg_operand,
int_reg_operand_not_pseudo): Use SUBREG_P and HARD_REGISTER_P.
(ca_operand, base_reg_operand, htm_spr_reg_operand, cc_reg_operand,
cc_reg_not_cr0_operand, input_operand): Use SUBREG_P.
(save_world_operation, restore_world_operation, lmw_operation,
stmw_operation): Use MEM_P and REG_P.
(tie_operand): Use MEM_P.
(vrsave_operation, crsave_operation): Use REG_P.
(mfcr_operation, mtcrf_operation): Use REG_P and CONST_INT_P.
(fpr_reg_operand): Use SUBREG_P and HARD_REGISTER_NUM_P.
(quad_int_reg_operand): Use HARD_REGISTER_NUM_P.
(call_operand): Use HARD_REGISTER_P.
(indexed_or_indirect_operand, altivec_indexed_or_indirect_operand):
Use CONST_INT_P.
(lwa_operand): Use SUBREG_P, REG_P and CONST_INT_P.
* config/rs6000/rs6000-p8swap.c (insn_is_load_p, insn_is_store_p,
quad_aligned_load_p, replace_swapped_aligned_store,
recombine_lvx_pattern, replace_swapped_aligned_load,
recombine_stvx_pattern): Use MEM_P.
(const_load_sequence_p, adjust_vperm, replace_swapped_load_constant):
Use MEM_P and SYMBOL_REF_P.
(rtx_is_swappable_p): Use REG_P and CONST_INT_P.
(insn_is_swappable_p): Use REG_P and MEM_P.
(insn_is_swap_p, (alignment_mask): Use CONST_INT_P.
* config/rs6000/rs6000-string.c (expand_block_clear, expand_block_move):
Use CONST_INT_P.
* config/rs6000/rs6000.c (rs6000_secondary_reload, rs6000_emit_cmove):
Use CONST_DOUBLE_P.
(rs6000_output_move_128bit): Use CONST_DOUBLE_P, CONST_INT_P and
CONST_WIDE_INT_P.
(rs6000_legitimize_address): Use CONST_DOUBLE_P, CONST_INT_P,
CONST_WIDE_INT_P, REG_P and SYMBOL_REF_P.
(rs6000_emit_move): Use CONST_DOUBLE_P, CONST_INT_P, HARD_REGISTER_P,
HARD_REGISTER_NUM_P, MEM_P, REG_P, SUBREG_P, SYMBOL_REF_P and
reg_or_subregno:
(output_toc): Use CONST_DOUBLE_P, CONST_INT_P and SYMBOL_REF_P.
(easy_altivec_constant, rs6000_legitimate_offset_address_p,
rs6000_mode_dependent_address, rs6000_expand_mtfsf_builtin,
rs6000_expand_set_fpscr_rn_builtin, rs6000_expand_set_fpscr_drn_builtin,
rs6000_expand_unop_builtin, INT_P, rs6000_generate_compare,
rs6000_machopic_legitimize_pic_address, rs6000_split_logical_inner,
rs6000_split_logical_di): Use CONST_INT_P.
(rs6000_legitimize_reload_address): Use CONST_INT_P, HARD_REGISTER_P,
REG_P and SYMBOL_REF_P.
(setup_incoming_varargs, rs6000_rtx_costs): Use CONST_INT_P and MEM_P.
(print_operand): Use CONST_INT_P, MEM_P and REG_P.
(virtual_stack_registers_memory_p, rs6000_legitimate_address_p,
mems_ok_for_quad_peep): Use CONST_INT_P and REG_P.
(rs6000_secondary_reload_memory): Use CONST_INT_P and SUBREG_P.
(small_data_operand, print_operand_address): Use CONST_INT_P and
SYMBOL_REF_P.
(split_stack_arg_pointer_used_p): Use HARD_REGISTER_P.
(rs6000_init_hard_regno_mode_ok, direct_move_p):
Use HARD_REGISTER_NUM_P.
(rs6000_secondary_reload_gpr): Use HARD_REGISTER_NUM_P and MEM_P.
(rs6000_secondary_reload_class): Use HARD_REGISTER_NUM_P, REG_P,
SUBREG_P and SYMBOL_REF_P.
(register_to_reg_type, rs6000_secondary_reload_inner): Use SUBREG_P
and HARD_REGISTER_NUM_P.
(rs6000_adjust_vec_address): Use HARD_REGISTER_NUM_P and
reg_or_subregno.
(rs6000_adjust_cost, find_mem_ref): Use MEM_P.
(macho_lo_sum_memory_operand, rs6000_eliminate_indexed_memrefs): Use
MEM_P and REG_P.
(legitimate_indirect_address_p, legitimate_lo_sum_address_p,
registers_ok_for_quad_peep, rs6000_output_function_epilogue,
find_addr_reg): Use REG_P.
(altivec_expand_vec_perm_const): Use REG_P and SUBREG_P.
(rs6000_emit_le_vsx_move): Use SUBREG_P.
(offsettable_ok_by_alignment, constant_pool_expr_p,
legitimate_small_data_p, rs6000_output_dwarf_dtprel,
rs6000_delegitimize_address, rs6000_const_not_ok_for_debug_p,
rs6000_cannot_force_const_mem, rs6000_output_addr_const_extra,
rs6000_assemble_integer, create_TOC_reference,
rs6000_emit_allocate_stack, rs6000_xcoff_encode_section_info,
rs6000_call_aix, rs6000_call_aix): Use SYMBOL_REF_P.
(rs6000_split_vec_extract_var): Use reg_or_subregno.
* config/rs6000/rtems.h (ASM_OUTPUT_SPECIAL_POOL_ENTRY_P): Use
CONST_DOUBLE_P, CONST_INT_P and SYMBOL_REF_P.
* config/rs6000/sysv4.h (ASM_OUTPUT_SPECIAL_POOL_ENTRY_P): Likewise.
* config/rs6000/xcoff.h (ASM_OUTPUT_SPECIAL_POOL_ENTRY_P): Likewise.
* config/rs6000/rs6000.h (RS6000_SYMBOL_REF_TLS_P): Use SYMBOL_REF_P.
(REGNO_OK_FOR_INDEX_P, REGNO_OK_FOR_BASE_P): Use HARD_REGISTER_NUM_P.
(INT_REG_OK_FOR_INDEX_P, INT_REG_OK_FOR_BASE_P): Use HARD_REGISTER_P.
(CONSTANT_ADDRESS_P): Use CONST_INT_P and SYMBOL_REF_P.
* config/rs6000/rs6000.md (define_expands strlensi, mod<mode>3
and cbranch<mode>4): Use CONST_INT_P.
(multiple define_splits): Use REG_P and SUBREG_P.
(define_expands call, call_value): Use MEM_P.
(define_expands sibcall, sibcall_value): Use CONST_INT_P and MEM_P.
(define insn *mtcrfsi): Use CONST_INT_P and REG_P.
* config/rs6000/vsx.md (*vsx_le_perm_load_<mode>,
*vsx_le_perm_load_v8hi, *vsx_le_perm_load_v16qi): Use HARD_REGISTER_P
and HARD_REGISTER_NUM_P.
(multiple define_splits): Use HARD_REGISTER_NUM_P.
From-SVN: r268253
It used to be the case that the mangled name of a reference temporary didn't
need to be standardized, because all access would be through the reference.
But now constant expressions can look through references and so different
translation units need to agree on the address of a temporary in the
initializer of a reference with vague linkage.
* cp-tree.h (struct saved_scope): Add ref_temp_count.
(current_ref_temp_count): New macro.
* mangle.c (mangle_ref_init_variable): Use it.
* typeck2.c (store_init_value): Clear it.
* call.c (make_temporary_var_for_ref_to_temp): Copy public and
comdat.
From-SVN: r268252
PR target/88948
* rtl.h (prepare_copy_insn): New prototype.
* gcse.c (prepare_copy_insn): New function, split out from
process_insert_insn.
(process_insert_insn): Use prepare_copy_insn.
* store-motion.c (replace_store_insn): Use prepare_copy_insn
instead of gen_move_insn.
testsuite/ChangeLog:
PR target/88948
* gcc.target/i386/pr88948.c: New test.
From-SVN: r268249
PR c++/88976
* c-typeck.c (c_finish_omp_cancel): Diagnose more than one if
on #pragma omp cancel with different modifiers.
* semantics.c (finish_omp_cancel): Diagnose more than one if
on #pragma omp cancel with different modifiers. Use
maybe_convert_cond when not in template or build_x_binary_op
otherwise.
* c-c++-common/gomp/cancel-2.c: New test.
* gcc.dg/gomp/cancel-1.c: New test.
* g++.dg/gomp/cancel-1.C: New test.
* g++.dg/gomp/cancel-2.C: New test.
* g++.dg/gomp/cancel-3.C: New test.
From-SVN: r268245
This patch, for gcc 8/9 is a mitigation patch for PR target/88469
where gcc-6/7/8 miscompile a structure whose alignment is dominated by
a 64-bit bitfield member. Since the PCS rules for such a type must
ignore any overalignment of the base type we cannot address this by
simply adding a larger alignment to the class. We can, however, force
the alignment of the bit-field itself and GCC will handle that as
desired.
PR target/88469
* profile-count.h (profile_count): On ARM systems using GCC 6/7/8
force the alignment of m_val.
From-SVN: r268240
Clang fails to compile std::vector<Incomplete> because the static member
__use_relocate cannot be evaluated for an incomplete type. Replace with
a static member function that will not be odr-used until needed, by
which point the type must be complete.
PR libstdc++/88840
* include/bits/stl_vector.h (vector::__use_relocate): Replace static
data member with static member function _S_use_relocate().
* include/bits/vector.tcc (vector::reserve, vector::_M_realloc_insert)
(vector::_M_default_append): Use _S_use_relocate() instead of
__use_relocate.
From-SVN: r268239
I wrote a test-case:
...
int
main (void)
{
for (unsigned i = 0; i < 128; ++i)
{
acc_init (acc_device_nvidia);
acc_shutdown (acc_device_nvidia);
}
return 0;
}
...
and ran it under valgrind. The only leak location reported with a frequency
of 128, was the allocation of ptx_devices in nvptx_init.
Fix this by freeing ptx_devices in GOMP_OFFLOAD_fini_device, once
instantiated_devices drops to 0.
2019-01-24 Tom de Vries <tdevries@suse.de>
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_fini_device): Free ptx_devices
once instantiated_devices drops to 0.
From-SVN: r268237
2019-01-24 Richard Biener <rguenther@suse.de>
PR lto/87187
* tree-streamer-out.c (write_ts_decl_common_tree_pointers):
When in "legacy" debug mode make sure to reset self-origins.
From-SVN: r268236
2019-01-24 Martin Liska <mliska@suse.cz>
PR gcov-profile/88994
* gcov-io.c (mangle_path): Do not allocate a bigger buffer,
result will be always smaller or equal to the original.
* gcov.c (mangle_name): Fix else branch where we should
also copy to PTR and shift the pointer.
From-SVN: r268233
2019-01-24 Paul Thomas <pault@gcc.gnu.org>
PR fortran/88929
* trans-array.c (gfc_conv_descriptor_elem_len): New function.
* trans-array.h : Add prototype for above.
* trans-expr.c (gfc_conv_gfc_desc_to_cfi_desc): Take account of
assumed rank arrays being flagged by rank = -1 in expressions.
Intent in arrays need a pointer to a copy of the data to be
assigned to the descriptor passed for conversion. This should
then be freed, together with the CFI descriptor on return from
the C call.
2019-01-24 Paul Thomas <pault@gcc.gnu.org>
PR fortran/88929
* gfortran.dg/ISO_Fortran_binding_3.f90 : New test
* gfortran.dg/ISO_Fortran_binding_3.c : Subsidiary source.
From-SVN: r268232
2019-01-24 Paul Thomas <pault@gcc.gnu.org>
PR fortran/88929
* trans-array.c (gfc_conv_descriptor_elem_len): New function.
* trans-array.h : Add prototype for above.
* trans-expr.c (gfc_conv_gfc_desc_to_cfi_desc): Take account of
assumed rank arrays being flagged by rank = -1 in expressions.
Intent in arrays need a pointer to a copy of the data to be
assigned to the descriptor passed for conversion. This should
then be freed, together with the CFI descriptor on return from
the C call.
2019-01-24 Paul Thomas <pault@gcc.gnu.org>
PR fortran/88929
* gfortran.dg/ISO_Fortran_binding_3.f90 : New test
* gfortran.dg/ISO_Fortran_binding_3.c : Subsidiary source.
From-SVN: r268231
Precise stack scan uses SIGURG to trigger a stack scan. We need
to have Go signal handler installed for SIGURG.
Reviewed-on: https://go-review.googlesource.com/c/159097
From-SVN: r268230
ChangeLog
2019-01-24 Xiong Hu Luo <luoxhu@linux.vnet.ibm.com>
* ChangeLog: replace space with tab.
* MAINTAINERS: delete 1 tab to keep alignment.
From-SVN: r268228
2019-01-23 Bin Cheng <bin.cheng@arm.com>
Steve Ellcey <sellcey@marvell.com>
PR target/85711
* recog.c (address_operand): Return false on wrong mode for address.
(constrain_operands): Check for mode with 'p' constraint.
Co-Authored-By: Steve Ellcey <sellcey@marvell.com>
From-SVN: r268219
FSTYPE FUNC (DWtype u) in libgcc2.c, which converts DI/TI to SF/DF, has
/* No leading bits means u == minimum. */
if (count == 0)
return -(Wtype_MAXp1_F * (Wtype_MAXp1_F / 2));
in the third case (where actually count == 0 only means the high part is
minimum). It should be:
/* No leading bits means u == minimum. */
if (count == 0)
return Wtype_MAXp1_F * (FSTYPE) (hi | ((UWtype) u != 0));
instead.
gcc/testsuite/
2019-01-23 H.J. Lu <hongjiu.lu@intel.com>
PR libgcc/88931
* gcc.dg/torture/fp-int-convert-timode-1.c: New test.
* gcc.dg/torture/fp-int-convert-timode-2.c: Likewise.
* gcc.dg/torture/fp-int-convert-timode-3.c: Likewise.
* gcc.dg/torture/fp-int-convert-timode-4.c: Likewise.
libgcc/
2019-01-23 Joseph Myers <joseph@codesourcery.com>
PR libgcc/88931
* libgcc2.c (FSTYPE FUNC (DWtype u)): Correct no leading bits
case.
From-SVN: r268216
* parser.c (cp_parser_direct_declarator): don't treat qualified-ids
in parameter-list as types if name lookup for declarator-id didn't
find one or more function templates.
* g++.dg/cpp0x/dependent2.c: new test.
* g++.dg/cpp2a/typename10.c: remove dg-error.
* g++.dg/cpp2a/typename12.c: new test.
* g++.dg/template/static30.c: remove dg-error.
From-SVN: r268192
Running:
$ valgrind ./xgcc -B. -c test.c -march=native
on aarch64 shows a use-after-free in host_detect_local_cpu due
to the std::string result of aarch64_get_extension_string_for_isa_flags
only living until immediately after a c_str call.
This leads to corrupt "-march=" values being passed to cc1.
This patch fixes the use-after-free, though it appears to also need
Tamar's patch here:
https://gcc.gnu.org/ml/gcc-patches/2018-12/msg01302.html
in order to generate valid values for cc1. This may have worked by
accident in the past, if the corrupt "-march=" value happened to be
0-terminated in the "right" place; with this patch it now appears
to reliably break without Tamar's patch.
gcc/ChangeLog:
PR driver/89014
* config/aarch64/driver-aarch64.c (host_detect_local_cpu): Fix
use-after-free of the result of
aarch64_get_extension_string_for_isa_flags.
From-SVN: r268189
PR c/44715
* cp-gimplify.c (genericize_cp_loop): Call begin_bc_block only
after genericizing cond and incr expressions.
* doc/extend.texi: Document break and continue behavior in
statement expressions.
* c-c++-common/pr44715.c: New test.
From-SVN: r268188
2019-01-23 Richard Biener <rguenther@suse.de>
PR tree-optimization/89008
* tree-ssa-reassoc.c (eliminate_using_constants): For * 0 do
not leave another stray operand.
* gcc.dg/torture/pr89008.c: New testcase.
From-SVN: r268186
* cgraphunit.c (cgraph_node::expand_thunk): When expanding a GIMPLE
thunk that returns by reference, use the type of the return object
of the thunk instead of that of the alias to build the dereference.
From-SVN: r268182
Atomics use DMB instruction to enforce ordering of loads/stores.
Currently gcc generates DMB w/o any arg which is a no-op. Fix that by
generating DMB 3 which enforces R+W ordering. It is stricter than what
acq/rel expect, but there's no other way.
gcc/
2019-xx-xx Vineet Gupta <vgupta@synopsys.com>
* config/arc/atomic.md: Add operand to DMB instruction
From-SVN: r268181
2019-01-23 Tom de Vries <tdevries@suse.de>
* dwarf.c (struct unit): Use size_t for low_offset/high_offset fields.
(units_search, find_unit): Use size_t for offset.
(build_address_map): Use size_t for unit_offset.
From-SVN: r268180
PR tree-optimization/88964
* gimple-loop-interchange.cc (loop_cand::analyze_induction_var): Use
build_zero_cst instead of build_int_cst. Return false for loop
invariants which honor signed zeros.
* gfortran.dg/pr88964.f90: New test.
From-SVN: r268179
Consider test-case:
...
int
main (void)
{
#pragma acc parallel async
;
#pragma acc parallel async
;
#pragma acc wait
return 0;
}
...
This fails with:
...
libgomp: cuMemAlloc error: invalid argument
Segmentation fault (core dumped)
...
The cuMemAlloc error is due to the fact that we're try to allocate 0 bytes.
Fix this by preventing calling map_push with size zero argument in nvptx_exec.
This also has the consequence that for the abort-1.c test-case, we end up
calling cuMemFree during map_fini for the struct cuda_map allocated in
map_init, which fails because an abort happened. Fix this by calling
cuMemFree with CUDA_CALL_NOCHECK in cuda_map_destroy.
2019-01-23 Tom de Vries <tdevries@suse.de>
PR target/PR88946
* plugin/plugin-nvptx.c (cuda_map_destroy): Use CUDA_CALL_NOCHECK for
cuMemFree.
(nvptx_exec): Don't call map_push if mapnum == 0.
* testsuite/libgomp.oacc-c-c++-common/pr88946.c: New test.
From-SVN: r268178
There are currently two situations where this assert triggers:
...
libgomp/plugin/plugin-nvptx.c: map_fini: Assertion `!s->map->active' failed.
...
First, in abort-1.c, a parallel region triggering an abort:
...
int
main (void)
{
#pragma acc parallel
abort ();
return 0;
}
...
The abort is detected in nvptx_exec as the CUDA_ERROR_ILLEGAL_INSTRUCTION
return status of the cuStreamSynchronize call after kernel launch, which is
then handled by calling non-returning function GOMP_PLUGIN_fatal.
Consequently, the map_pop in nvptx_exec that in case of cuStreamSynchronize
success would remove or inactive the element added by the map_push earlier in
nvptx_exec, does not trigger. With the element no longer active, but still
marked active and a member of s->map, we run into the assert during
GOMP_OFFLOAD_fini_device, which is triggered from atexit handler
gomp_target_fini (which is triggered by the GOMP_PLUGIN_fatal mentioned above
calling exit).
Second, in pr88941.c, an async parallel region without wait:
...
int
main (void)
{
#pragma acc parallel async
;
/* no #pragma acc wait */
return 0;
}
...
Because nvptx_exec is handling an async region, it does not call map_pop for
the element added by map_push, but schedules an kernel execution completion
event to call map_pop. Again, we run into the assert during
GOMP_OFFLOAD_fini_device, which is triggered from atexit handler
gomp_target_fini, but the exit in this case is triggered by returning from main.
So either the kernel is still running, or the kernel has completed but the
corresponding event that is supposed to call map_pop is stuck in the event
queue, waiting for an event_gc.
Fix this by removing the assert, and skipping the freeing of device memory if
the map is still marked active (though in the async case, this is more a
workaround than an fix).
2019-01-23 Tom de Vries <tdevries@suse.de>
PR target/88941
PR target/88939
* plugin/plugin-nvptx.c (cuda_map_destroy): Handle map->active case.
(map_fini): Remove "assert (!s->map->active)".
* testsuite/libgomp.oacc-c-c++-common/pr88941.c: New test.
From-SVN: r268177
The map field of a struct ptx_stream is a FIFO. The FIFO is implemented as a
single linked list, with pop-from-the-front semantics.
The function map_pop pops an element, either by:
- deallocating the element, if there is more than one element
- or marking the element inactive, if there's only one element
The responsibility of map_push is to push an element to the back, as well as
selecting the element to push, by:
- allocating an element, or
- reusing the element at the front if inactive and big enough, or
- dropping the element at the front if inactive and not big enough, and
allocating one that's big enough
The current implemention gets at least the first and most basic scenario wrong:
> map = cuda_map_create (size);
We create an element, and assign it to map.
> for (t = s->map; t->next != NULL; t = t->next)
> ;
We determine the last element in the fifo.
> t->next = map;
We append the new element.
> s->map = map;
But here, we throw away the rest of the FIFO, and declare the FIFO to be just
the new element.
This problem causes the test-case asyncwait-1.c to fail intermittently on some
systems. The pr87835.c test-case added here is a a minimized and modified
version of asyncwait-1.c (avoiding the kernel construct) that is more likely to
fail.
Fix this by rewriting map_pop more robustly, by:
- seperating the function in two phases: select element, push element
- when reusing or dropping an element, making sure that the element is cleanly
popped from the queue
- rewriting the push element part in such a way that it can handle all cases
without needing if statements, such that each line is exercised for each of
the three cases.
2019-01-23 Tom de Vries <tdevries@suse.de>
PR target/87835
* plugin/plugin-nvptx.c (map_push): Fix adding of allocated element.
* testsuite/libgomp.oacc-c-c++-common/pr87835.c: New test.
From-SVN: r268176
gcc/testsuite/ChangeLog:
* c-c++-common/Warray-bounds-2.c: Include headers only if they exist.
* c-c++-common/Warray-bounds-3.c: Make xfails conditional on target
non_strict_align.
* c-c++-common/Wrestrict-2.c: Include headers only if they exist.
* c-c++-common/Wrestrict.c: Make xfails conditional on target
non_strict_align.
From-SVN: r268175
It is enabled at -O3, but the doc currently says -O2. This fixes it.
* doc/invoke.texi (-fsplit-paths): This is enabled by default at -O3.
From-SVN: r268170
PR target/88965
* config/rs6000/rs6000.c: Include tree-vrp.h and tree-ssanames.h.
(rs6000_gimple_fold_builtin): If MEM_REF address doesn't satisfy
is_gimple_mem_ref_addr predicate, force it into a SSA_NAME first.
* gcc.target/powerpc/pr88965.c: New test.
From-SVN: r268166
PR middle-end/88968
* gimplify.c (gimplify_omp_atomic): Handle bitfield atomics with
non-integral DECL_BIT_FIELD_REPRESENTATIVEs.
* c-omp.c (c_finish_omp_atomic): For bitfield atomics, update type
variable after using BIT_FIELD_REF.
* c-c++-common/gomp/atomic-23.c: New test.
From-SVN: r268165