2019-03-21 Richard Biener <rguenther@suse.de>
PR tree-optimization/89779
* tree.c (tree_nop_conversion): Consolidate and fix defensive
checks with respect to released SSA names now having error_mark_node
type.
* fold-const.c (operand_equal_p): Likewise.
* gcc.dg/torture/pr89779.c: New testcase.
From-SVN: r269838
IS 29124 8.2 [sf.mathh] says that <math.h> should add the names of the
special functions to the global namespace. However, C++17 Annex D
[depr.c.headers] excludes those functions explicitly, so they should not
be placed in the global namespace unconditionally for C++17.
Only add them to the global namespace when IS 29124 is explicitly
requested via the __STDCPP_WANT_MATH_SPEC_FUNCS__ macro.
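A sketch of the intended usage after this change (not taken from the
testsuite):

  // Request IS 29124 explicitly, then include <math.h>.
  #define __STDCPP_WANT_MATH_SPEC_FUNCS__ 1
  #include <math.h>

  double zeta2()
  {
    // Visible in the global namespace only because of the macro above;
    // plain C++17 code would have to use std::riemann_zeta from <cmath>.
    return riemann_zeta(2.0);
  }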
* include/c_compatibility/math.h [!__STDCPP_WANT_MATH_SPEC_FUNCS__]
(assoc_laguerre, assoc_laguerref, assoc_laguerrel, assoc_legendre)
(assoc_legendref, assoc_legendrel, beta, betaf, betal, comp_ellint_1)
(comp_ellint_1f, comp_ellint_1l, comp_ellint_2, comp_ellint_2f)
(comp_ellint_2l, comp_ellint_3, comp_ellint_3f, comp_ellint_3l)
(cyl_bessel_i, cyl_bessel_if, cyl_bessel_il, cyl_bessel_j)
(cyl_bessel_jf, cyl_bessel_jl, cyl_bessel_k, cyl_bessel_kf)
(cyl_bessel_kl, cyl_neumann, cyl_neumannf, cyl_neumannl, ellint_1)
(ellint_1f, ellint_1l, ellint_2, ellint_2f, ellint_2l, ellint_3)
(ellint_3f, ellint_3l, expint, expintf, expintl, hermite, hermitef)
(hermitel, laguerre, laguerref, laguerrel, legendre, legendref)
(legendrel, riemann_zeta, riemann_zetaf, riemann_zetal, sph_bessel)
(sph_besself, sph_bessell, sph_legendre, sph_legendref, sph_legendrel)
(sph_neumann, sph_neumannf, sph_neumannl): Only add using-declarations
when the special functions IS is enabled, not for C++17.
* testsuite/26_numerics/headers/cmath/functions_global_c++17.cc:
Replace with ...
* testsuite/26_numerics/headers/cmath/functions_global.cc: New test,
without checks for special functions in C++17.
* testsuite/26_numerics/headers/cmath/special_functions_global.cc:
New test.
From-SVN: r269837
These headers were missed in the previous commit for this bug.
There are also several "" includes in the profile mode headers, but
because they're deprecated I'm not fixing them.
* include/backward/hash_map: Use <> for includes not "".
* include/backward/hash_set: Likewise.
* include/backward/strstream: Likewise.
* include/tr1/bessel_function.tcc: Likewise.
* include/tr1/exp_integral.tcc: Likewise.
* include/tr1/legendre_function.tcc: Likewise.
* include/tr1/modified_bessel_func.tcc: Likewise.
* include/tr1/riemann_zeta.tcc: Likewise.
From-SVN: r269835
In functions whose return type is instantiated from a nested template,
make sure that all members of the instance are emitted before finishing
the outer function; otherwise they will be removed during the
prune_unused_types pass.
gcc/d/ChangeLog:
2019-03-21 Iain Buclaw <ibuclaw@gdcproject.org>
PR d/89017
* d-codegen.cc (d_decl_context): Skip over template instances when
finding the context.
* decl.cc (DeclVisitor::visit(TemplateDeclaration)): New override.
(build_type_decl): Include parameters in name of template types.
gcc/testsuite/ChangeLog:
2019-03-21 Iain Buclaw <ibuclaw@gdcproject.org>
PR d/89017
* gdc.dg/pr89017.d: New test.
From-SVN: r269828
The issue here is that declval<T>().d is considered instantiation-dependent
within a template, as the access to 'd' might depend on the particular
specialization. But when we're deducing template arguments for a call, we
know that the call and the arguments are non-dependent, so we can do the
substitution as though we aren't in a template. Which strictly speaking we
aren't, since the default argument is considered a separate definition.
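A hypothetical example of the shape of code involved (not the actual
testcase):

  #include <utility>

  struct A { int d; };

  // The default argument's type uses declval<T>().d, which is treated
  // as instantiation-dependent inside a template.
  template<class T>
  int f(T, decltype(std::declval<T>().d) = 0) { return 0; }

  template<class U>
  void g()
  {
    // The call and its argument are non-dependent, so deduction for f
    // can substitute into the default argument as if outside a template.
    f(A());
  }

  template void g<int>();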
* pt.c (type_unification_real): Accept a dependent result in
template context.
From-SVN: r269826
Even if a global register is being clobbered in a function, we usually
do not save and restore it. However, we still have to do this if it is
a special register. Most of the places in the backend handle this
correctly but not the prologue/epilogue optimization.
gcc/ChangeLog:
2019-03-20 Andreas Krebbel <krebbel@linux.ibm.com>
PR target/89775
* config/s390/s390.c (global_not_special_regno_p): Move to make it
available to ...
(s390_optimize_register_info): Use global_not_special_regno_p to
check for global regs.
2019-03-20 Jakub Jelinek <jakub@redhat.com>
PR target/89775
* gcc.target/s390/pr89775-1.c: New test.
* gcc.target/s390/pr89775-2.c: New test.
From-SVN: r269823
gcc/
PR target/89411
* config/riscv/riscv.c (riscv_valid_lo_sum_p): New arg x. New locals
align, size, offset. Use them to handle a BLKmode reference. Update
comment.
(riscv_classify_address): Pass info->offset to riscv_valid_lo_sum_p.
gcc/testsuite/
PR target/89411
* gcc.target/riscv/losum-overflow.c: New test.
From-SVN: r269813
In the C calling convention, on AMD64, and probably a number of
other architectures, a 3-word struct argument is passed on the stack.
This is less efficient than passing it in three registers. Further,
this may affect the code generation in other parts of the program,
even if the function is not actually called.
Slices are common in Go and append is a common slice operation,
which calls growslice in the growing path. To improve the code
generation, pass the slice header's three fields as separate
values, instead of a struct, to growslice.
The drawback is that this makes the runtime implementation
diverge slightly from the gc runtime.
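For illustration only (this is not the actual growslice signature), the
calling-convention difference on x86_64 looks like:

  struct sliceheader { void *ptr; long len; long cap; };

  // The 24-byte struct is classified MEMORY and passed on the stack ...
  long by_struct(struct sliceheader s) { return s.len; }

  // ... while three word-sized scalars are passed in registers.
  long by_fields(void *ptr, long len, long cap) { return len; }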
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/168277
From-SVN: r269811
gcc/ChangeLog:
* tree-ssa-strlen.c (handle_builtin_stxncpy): Use full_string_p
rather than endptr as an indicator of nul-termination.
From-SVN: r269809
2019-03-19 Martin Liska <mliska@suse.cz>
PR middle-end/89737
* predict.c (combine_predictions_for_bb): Empty likely_edges and
unlikely_edges if there's an edge that belongs to both these sets.
2019-03-19 Martin Liska <mliska@suse.cz>
PR middle-end/89737
* gcc.dg/pr89737.c: New test.
From-SVN: r269804
The "classic" PowerPCs (6xx/7xx) are not STRICT_ALIGNMENT, but their
floating point units are. This is not normally a problem: the ABIs
make everything FP aligned. The RTL patterns converting FP to integer,
however, get a potentially unaligned destination, and we do not want to
do an stfiwx on that on such older CPUs.
This fixes it. It does not change anything for TARGET_MFCRF targets
(POWER4 and later). It also won't change anything for strict-alignment
targets, or CPUs without hardware FP of course, or CPUs that do not
implement stfiwx (older 4xx/5xx/8xx).
It does not change the corresponding fixuns* pattern, because that
cannot be enabled on any CPU that cannot handle unaligned FP well.
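A hypothetical example of the kind of store involved (not the actual
testcase):

  /* The int member below can live at an address that is not 32-bit
     aligned, so on these older CPUs the FP-to-int pattern must not use
     stfiwx on it directly and instead goes via a stack temporary.  */
  struct __attribute__ ((packed)) s { char c; int i; };

  void
  store_trunc (struct s *p, double d)
  {
    p->i = (int) d;
  }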
PR target/89746
* config/rs6000/rs6000.md (fix_trunc<mode>si2_stfiwx): If we have a
non-TARGET_MFCRF target, and the dest is memory but not 32-bit aligned,
go via a stack temporary.
From-SVN: r269802
PR lto/87809
PR lto/89335
* tree.c (free_lang_data_in_decl): Do not free context of C++
destructors.
* g++.dg/lto/pr87089_0.C: New testcase.
* g++.dg/lto/pr87089_1.C: New testcase.
* g++.dg/lto/pr89335_0.C: New testcase.
From-SVN: r269799
Since aix/ppc64 has been added to the GC toolchain, a mix of new and
old files was created in the gcc toolchain.
This commit corrects this merge for aix/ppc64 and aix/ppc.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/167658
From-SVN: r269797
PR target/89506
* config/arm/arm.md (cmpsi2_addneg): Swap the alternatives and use
subs for the first alternative except when operands[3] is 1.
From-SVN: r269795
* doc/xml/manual/allocator.xml: Link to table documenting evolution
of extension allocators.
* doc/xml/manual/evolution.xml: Use angle brackets for header names.
Document new headers in 7.2, 8.1 and 9.1 releases.
* doc/xml/manual/using.xml: Adjust link target for new_allocator.
* doc/html/*: Regenerate.
From-SVN: r269794
PR rtl-optimization/89753
* loop-unroll.c (decide_unroll_constant_iterations): Make guard for
explicit unrolling factor even more robust.
From-SVN: r269791
PR target/89726
* config/i386/i386.c (ix86_expand_floorceildf_32): In ceil
compensation use x2 += 1 instead of x2 -= -1 and when honoring
signed zeros, do another copysign after the compensation.
* gcc.target/i386/fpprec-1.c (x): Add 6 new constants.
(expect_round, expect_rint, expect_floor, expect_ceil, expect_trunc):
Add expected results for them.
From-SVN: r269790
PR c/89734
* c-decl.c (grokdeclarator): Call c_build_qualified_type on function
return type even if quals_used is 0. Formatting fixes.
* gcc.dg/pr89734.c: New test.
From-SVN: r269789
gcc/ChangeLog:
PR tree-optimization/89720
* tree-vrp.c (vrp_prop::check_mem_ref): Treat range with max < min
more conservatively, the same as anti-range.
gcc/testsuite/ChangeLog:
PR tree-optimization/89720
* gcc.dg/Warray-bounds-42.c: New test.
From-SVN: r269785
Even though these two using-declarations have the same effect, they are not
the same declaration, and we don't need to do extra work to treat them
as the same, as we do for typedefs. If we did need to, we would need to
handle them
specially in iterative_hash_template_arg as well as here.
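A hypothetical illustration (not the actual testcase): at namespace
scope a using-declaration may legally be repeated, giving two
declarations with the same effect.

  namespace N { int i; }

  using N::i;
  using N::i;   // same effect as the line above, but a distinct declaration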
* tree.c (cp_tree_equal): Always return false for USING_DECL.
From-SVN: r269777
In this testcase we get confused when looking at the sizeof... because the
argument pack for 'args' has been wrapped in an ARGUMENT_PACK_SELECT as part
of expanding the fold-expression. We handle this situation a bit lower down
in tsubst_pack_expansion, but that doesn't help the call to
argument_pack_element_is_expansion_p, which happens earlier.
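A hypothetical example of the shape involved (not the actual testcase):

  // A fold-expression whose operand also mentions sizeof...(args).
  // While the fold is expanded, each use of 'args' is wrapped in an
  // ARGUMENT_PACK_SELECT; sizeof...(args) must still find the pack.
  template<int... args>
  constexpr int f()
  {
    return (... + (args * (int) sizeof...(args)));
  }

  static_assert(f<1, 2, 3>() == 18, "");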
* pt.c (argument_pack_element_is_expansion_p): Handle
ARGUMENT_PACK_SELECT.
From-SVN: r269776
My patch for PR 60503 to fix C++11 attribute parsing on lambdas accidentally
removed support for GNU attributes.
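A minimal illustration (not the testsuite case): a GNU attribute in the
lambda-declarator should parse again alongside the C++11 syntax.

  auto f = [](int x) __attribute__ ((cold)) { return x; };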
* parser.c (cp_parser_lambda_declarator_opt): Allow GNU attributes.
From-SVN: r269775
It currently wants to see lvx insns on AIX, and no lvx insns on Linux.
What is really wanted is lvx insns when no VSX, and lxv* insns if VSX.
This fixes it.
* gcc.target/powerpc/altivec-7.c: Look for lxv* if generating VSX
instructions, and lvx if not.
From-SVN: r269772
Currently these bswap testcases use global variables, which causes
problems with -m32: the memory access is a D-form access, and when
combine tries to combine that with the bswap it tries a D-form store
with byte reverse. That instruction does not exist, and since combine
started with only two insns here it will not try splitting this.
This should be improved, but it is not what this test is testing, and
the "load" case already uses a pointer, so let's do that for the store
case as well.
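A sketch of the pointer-based "store" shape (not the exact testsuite
code):

  void
  store_bswap16 (unsigned short *p, unsigned short x)
  {
    *p = __builtin_bswap16 (x);
  }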
* gcc.target/powerpc/bswap16.c: Use a pointer instead of a global for
the "store" test as well.
* gcc.target/powerpc/bswap32.c: Ditto.
From-SVN: r269771
For the big stack frame in the test GCC used to say
pr18096-1.c:7:6: error: total size of local objects too large
but now it says
pr18096-1.c:7:6: error: total size of local objects 2147483647 exceeds maximum 2147483392
Let's just allow both in the test.
gcc/testsuite/
* gcc.target/powerpc/pr18096-1.c: Allow an error message that says
"exceeds" instead of just one that talks about "too large".
From-SVN: r269770
2019-03-18 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/68009
* iresolve.c: Include trans.h.
(gfc_resolve_fe_runtime_error): Set backend_decl on
resolved_sym.
From-SVN: r269769
Here we were pushing into the right access context, but we were called from
a deferred checking context, so didn't end up doing the checks until after
we left the access context.
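A hypothetical illustration (not the actual testcase): the access to the
private A::f in B::g's default argument has to be checked while we are
still inside B's context, where friendship makes it accessible.

  class A
  {
    static int f () { return 0; }
    friend struct B;
  };

  struct B
  {
    template<class T> static int g (int i = A::f ()) { return i; }
  };

  int use = B::g<int> ();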
* pt.c (tsubst_default_argument): Don't defer access checks.
From-SVN: r269766
2019-03-18 Richard Biener <rguenther@suse.de>
PR middle-end/88945
* tree-ssanames.c (release_ssa_name_fn): For released SSA names
use a TREE_TYPE of error_mark_node to avoid ICEs when dumping
basic-blocks that are removed. Remove restoring SSA_NAME_VAR.
* tree-outof-ssa.c (eliminate_useless_phis): Remove redundant checking.
From-SVN: r269765
2019-03-18 Andrew Stubbs <ams@codesourcery.com>
gcc/
* config/gcn/gcn-run.c (struct output): Make next_output unsigned.
Extend queue to 1024 entries.
Add "consumed" field.
(gomp_print_output): Remove print_index parameter.
Add final parameter.
Change limit to unsigned.
Use consumed field to implement circular buffer.
Detect interrupted print in final pass.
Flush output at the end.
(run): Update gomp_print_output usage.
(main): Initialize kernargs->output_data.consumed.
From-SVN: r269764
This patch fixes a case in which we vectorised something with a
fully-predicated loop even after the cost model had rejected it.
E.g. the loop in the testcase has the costs:
Vector inside of loop cost: 27
Vector prologue cost: 0
Vector epilogue cost: 0
Scalar iteration cost: 7
Scalar outside cost: 6
Vector outside cost: 0
prologue iterations: 0
epilogue iterations: 0
and we can see that the loop executes at most three times, but we
decided to vectorise it anyway.
(The costs here are equal for three iterations, but the same thing
happens even when the vector code is strictly more expensive.)
The problem is the handling of "/VF" in:
/* Calculate number of iterations required to make the vector version
profitable, relative to the loop bodies only. The following condition
must hold true:
SIC * niters + SOC > VIC * ((niters-PL_ITERS-EP_ITERS)/VF) + VOC
where
SIC = scalar iteration cost, VIC = vector iteration cost,
VOC = vector outside cost, VF = vectorization factor,
PL_ITERS = prologue iterations, EP_ITERS= epilogue iterations
SOC = scalar outside cost for run time cost model check. */
We treat the "/VF" as truncating, but for fully-predicated loops, it's
closer to a ceil division, since fractional iterations are handled by a
full iteration with some predicate bits set to false.
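For example, with the costs quoted above and an assumed VF of 4 (so the
three scalar iterations fit in a single vector iteration): truncation
gives VIC * (3/VF) = 27 * 0 = 0, and since SIC * 3 + SOC = 27 > 0 the
loop looks profitable to vectorise; a ceiling division instead gives
27 * 1 = 27, and 27 > 27 does not hold, so the vector version is
correctly rejected.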
The easiest fix seemed to be to calculate the minimum number of vector
iterations first, then use that to calculate the minimum number of scalar
iterations.
Calculating the minimum number of vector iterations might make sense for
unpredicated loops too, since calculating the scalar niters directly
doesn't take into account the fact that the VIC multiple has to be an
integer. But the handling of PL_ITERS and EP_ITERS for unpredicated
loops is a bit hand-wavy anyway, so maybe vagueness here cancels out
vagueness there?
Either way, changing this for unpredicated loops would be much too
invasive for stage 4, so the patch keeps it specific to fully-predicated
loops (i.e. SVE) for now. There's no functional change for other targets.
2019-03-18 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-loop.c (vect_estimate_min_profitable_iters): Fix the
calculation of the minimum number of scalar iterations for
fully-predicated loops.
gcc/testsuite/
* gcc.target/aarch64/sve/cost_model_1.c: New test.
From-SVN: r269763