gcc/ada/
* libgnat/s-valuti.ads (Starts_As_Exponent_Format_Ghost): Ghost
function to determine if a string is recognized as something
which might be an exponent.
(Is_Opt_Exponent_Format_Ghost): Ghost function to determine if a
string has the correct format for an optional exponent.
(Scan_Exponent): Use ghost functions to factorize contracts.
gcc/ada/
* exp_util.ads (Get_Current_Value_Condition): Belt: Add a
postcondition that Val /= Var.
* sem_util.adb (Known_Null): Suspenders: Raise Program_Error if
Get_Current_Value_Condition returned the same value. This will
be enabled even without assertions, because infinite recursion
is a nuisance -- better to crash if this bug ever occurs.
gcc/ada/
* libgnat/s-exnint.ads: Mark in SPARK. Adapt to change to
package.
* libgnat/s-exnlli.ads: Likewise.
* libgnat/s-exnllli.ads: Likewise.
* libgnat/s-exponn.adb: Add lemmas and ghost code. Special case
value zero as Left or Right to simplify proof.
* libgnat/s-exponn.ads: Transform the generic function into a
generic package with a function inside. Add a functional
contract.
gcc/ada/
* exp_ch3.adb (Make_Controlling_Function_Wrappers): Create
distinct copies of parameter lists for spec and body with
Copy_Parameter_List; cleanup.
(Make_Null_Procedure_Specs): Fix style in comments; remove a
potentially unnecessary initialization of a local variable.
This detects overflow when computing DR_GROUP_GAP and DR_GROUP_SIZE
more thoroughly, so that artificial testcases like the added one
do not fool the existing check.
2022-01-05 Richard Biener <rguenther@suse.de>
PR tree-optimization/103816
* tree-vect-data-refs.c (vect_analyze_group_access_1): Also
check DR_GROUP_GAP compute for overflow and representability.
* gcc.dg/torture/pr103816.c: New testcase.
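As a rough illustration of the kind of check described above (not the actual code in vect_analyze_group_access_1), the sketch below computes a gap between two element offsets and rejects it when the arithmetic overflows or the result is not representable as an unsigned int. The helper name compute_gap and its parameters are hypothetical; __builtin_sub_overflow is the GCC/Clang overflow builtin.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: store CUR - PREV - 1 in *GAP, but only when the
   arithmetic does not overflow and the result is representable as an
   unsigned int.  Returns false when the access should be rejected.  */
static bool
compute_gap (long long prev, long long cur, unsigned int *gap)
{
  long long diff;

  assert (gap != NULL);

  /* Signed subtraction can overflow for extreme artificial offsets.  */
  if (__builtin_sub_overflow (cur, prev, &diff))
    return false;
  if (__builtin_sub_overflow (diff, 1LL, &diff))
    return false;

  /* The gap must be non-negative and fit the narrower field.  */
  if (diff < 0 || diff > (long long) UINT_MAX)
    return false;

  *gap = (unsigned int) diff;
  return true;
}
```

A caller would treat a false return as "do not vectorize this group" rather than continuing with a truncated value.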
For ADDR_EXPR gimple_debug_bind_get_value fold_stmt_1 uses
maybe_canonicalize_mem_ref_addr earlier and I think that should
resolve the concerns raised in PR52329. But folding the ADDR_EXPR
operand using maybe_fold_reference and then taking the address of the
result looks like an invalid transformation; it can transform
# DEBUG D.4293 => &a[0]
into
# DEBUG D.4293 => &2.0e+0
etc.; all we want to allow is the lhs folding of the operand, which
maybe_fold_reference no longer does since r12-21-g0bf8cd9d5e8ac.
2022-01-05 Jakub Jelinek <jakub@redhat.com>
PR fortran/103691
* gimple-fold.c (fold_stmt_1): Don't call maybe_fold_reference
for DEBUG stmts with ADDR_EXPR gimple_debug_bind_get_value,
it can do unwanted rhs folding like &a[0] into &2.0 etc.
* gfortran.dg/pr103691.f90: New test.
The testcase uses SSE and SSE2 intrinsics, so fails on i686-linux
if -msse2 isn't enabled by default. Fixed by adding -msse2 to
dg-options.
2022-01-05 Jakub Jelinek <jakub@redhat.com>
PR target/103895
* gcc.target/i386/pr103895.c: Add -msse2 to dg-options.
Power ISA 2.07 (Power8) introduced the transactional memory
feature, but ISA 3.1 (Power10) removes it. This exposes a
troublesome issue, as PR102059 shows. A user defines a
function with target pragma cpu=power10, and it calls a
function with attribute always_inline that inherits the
command line option -mcpu=power8, which enables HTM
implicitly. The current isa_flags check doesn't allow this
inlining due to "target specific option mismatch", and an error
message is emitted.
Normally, the callee function isn't intended to exploit the HTM
feature, but the default flag setting makes it look as if it does.
As Richi raised in the PR, we have the fp_expressions flag in the
function summary, which allows us to check whether the function
actually contains any floating point expressions to avoid
overkill. This patch follows a similar idea but is
more target specific: for this rs6000 port specific
requirement on the HTM feature check, we would like to check for
rs6000 specific HTM built-in functions and inline assembly.
It allows targets to do their own customized checks and
updates.
It introduces two target hooks, need_ipa_fn_target_info and
update_ipa_fn_target_info. The former allows the target to do
a preliminary check and decide whether to collect target specific
information for this function. For some special
cases, it can predict the analysis result and set it early
without any scanning. The latter allows
analyze_function_body to pass gimple stmts down, just like the
fp_expressions handling, so the target can do its own checks.
I put them together as one hook initially, with one boolean
to indicate whether it's the initial call, but the code looked a
bit ugly; separating them seems to give better readability.
gcc/ChangeLog:
PR ipa/102059
* config/rs6000/rs6000.c (TARGET_NEED_IPA_FN_TARGET_INFO): New macro.
(TARGET_UPDATE_IPA_FN_TARGET_INFO): Likewise.
(rs6000_need_ipa_fn_target_info): New function.
(rs6000_update_ipa_fn_target_info): Likewise.
(rs6000_can_inline_p): Adjust for ipa function summary target info.
* config/rs6000/rs6000.h (RS6000_FN_TARGET_INFO_HTM): New macro.
* ipa-fnsummary.c (ipa_dump_fn_summary): Adjust for ipa function
summary target info.
(analyze_function_body): Adjust for ipa function summary target info
and call hook rs6000_need_ipa_fn_target_info and
rs6000_update_ipa_fn_target_info.
(ipa_merge_fn_summary_after_inlining): Adjust for ipa function summary
target info.
(inline_read_section): Likewise.
(ipa_fn_summary_write): Likewise.
* ipa-fnsummary.h (ipa_fn_summary::target_info): New member.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (TARGET_UPDATE_IPA_FN_TARGET_INFO): Document new hook.
(TARGET_NEED_IPA_FN_TARGET_INFO): Likewise.
* target.def (update_ipa_fn_target_info): New hook.
(need_ipa_fn_target_info): Likewise.
* targhooks.c (default_need_ipa_fn_target_info): New function.
(default_update_ipa_fn_target_info): Likewise.
* targhooks.h (default_update_ipa_fn_target_info): New declare.
(default_need_ipa_fn_target_info): Likewise.
gcc/testsuite/ChangeLog:
PR ipa/102059
* gcc.dg/lto/pr102059-1_0.c: New test.
* gcc.dg/lto/pr102059-1_1.c: New test.
* gcc.dg/lto/pr102059-1_2.c: New test.
* gcc.dg/lto/pr102059-2_0.c: New test.
* gcc.dg/lto/pr102059-2_1.c: New test.
* gcc.dg/lto/pr102059-2_2.c: New test.
* gcc.target/powerpc/pr102059-1.c: New test.
* gcc.target/powerpc/pr102059-2.c: New test.
* gcc.target/powerpc/pr102059-3.c: New test.
Add V2QImode logic operations with SSE and GP registers and split
them to V4QImode SSE instructions or SImode GP instructions.
The patch also fixes PR target/103900.
2022-01-04 Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog:
PR target/103861
* config/i386/mmx.md (one_cmplv2qi3): New insn pattern.
(one_cmplv2qi3 splitters): New post-reload splitters.
(*andnotv2qi3): New insn pattern.
(andnotv2qi3 splitters): New post-reload splitters.
(<any_logic:code>v2qi3): New insn pattern.
(<any_logic:insn>v2qi3 splitters): New post-reload splitters.
gcc/testsuite/ChangeLog:
PR target/103861
* gcc.target/i386/warn-vect-op-2.c: Adjust warnings.
* gcc.target/i386/pr103900.c: New test.
Change email address in both DCO and Write After Approval list.
2022-01-04 Gaius Mulley <gaiusmod2@gmail.com>
ChangeLog:
* MAINTAINERS: Change of email address in both DCO and
Write After Approval list.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
Bool pattern detection doesn't really handle PHIs well so we have
to be prepared for mismatched vector types in more cases than
originally thought.
2022-01-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/103800
* tree-vect-loop.c (vectorizable_phi): Remove assert and
expand comment.
* gcc.dg/vect/bb-slp-pr103800.c: New testcase.
Related to r12-6208-gebc853deb7cc0487de9ef6e891a007ba853d1933
"libgomp: Fix GOMP_DEVICE_NUM_VAR stringification during offload image load"
That commit fixed an issue with omp_get_device_num() on gcn/nvptx that
resulted in it always returning the value 0.
This commit modifies the tests to iterate over all devices, so that on a
system with multiple non-host devices they would have detected that
always-zero issue.
libgomp/ChangeLog:
* testsuite/libgomp.c-c++-common/target-45.c: Iterate over all devices.
* testsuite/libgomp.fortran/target10.f90: Likewise.
This avoids running simple_dce_from_worklist on partially not up-to-date
SSA form (in unreachable code regions) by scheduling CFG cleanup
manually as is done anyway when tail-merging runs.
2022-01-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/103690
* tree-pass.h (tail_merge_optimize): Adjust.
* tree-ssa-tail-merge.c (tail_merge_optimize): Pass in whether
to re-split critical edges, move CFG cleanup ...
* tree-ssa-pre.c (pass_pre::execute): ... here, before
simple_dce_from_worklist and delay freeing inserted_exprs from
...
(fini_pre): .. here.
This patch to the nvptx backend changes the backend's STORE_FLAG_VALUE
from -1 to 1, by using BImode predicates and selp instructions, instead
of set instructions (almost always followed by integer negation).
Historically, it was reasonable (though rare) for backends to use -1
for representing true during the RTL passes. However with tree-ssa,
GCC now emits lots of code that reads and writes _Bool values, requiring
STORE_FLAG_VALUE=-1 targets to frequently convert 0/-1 pseudos to 0/1
pseudos using integer negation. Unfortunately, this process prevents
or complicates many optimizations (negate isn't associative with logical
AND, OR and XOR, and interferes with range/vrp/nonzerobits bounds etc.).
The impact of this is that for a relatively simple logical expression
like "return (x==21) && (y==69);", the nvptx backend currently generates:
mov.u32 %r26, %ar0;
mov.u32 %r27, %ar1;
set.u32.eq.u32 %r30, %r26, 21;
neg.s32 %r31, %r30;
mov.u32 %r29, %r31;
set.u32.eq.u32 %r33, %r27, 69;
neg.s32 %r34, %r33;
mov.u32 %r32, %r34;
cvt.u16.u8 %r39, %r29;
mov.u16 %r36, %r39;
cvt.u16.u8 %r39, %r32;
mov.u16 %r37, %r39;
and.b16 %r35, %r36, %r37;
cvt.u32.u16 %r38, %r35;
cvt.u32.u8 %value, %r38;
This patch tweaks nvptx to generate 0/1 values instead, requiring the
same number of instructions, using (BImode) predicate registers and selp
instructions so as to now generate the almost identical:
mov.u32 %r26, %ar0;
mov.u32 %r27, %ar1;
setp.eq.u32 %r31, %r26, 21;
selp.u32 %r30, 1, 0, %r31;
mov.u32 %r29, %r30;
setp.eq.u32 %r34, %r27, 69;
selp.u32 %r33, 1, 0, %r34;
mov.u32 %r32, %r33;
cvt.u16.u8 %r39, %r29;
mov.u16 %r36, %r39;
cvt.u16.u8 %r39, %r32;
mov.u16 %r37, %r39;
and.b16 %r35, %r36, %r37;
cvt.u32.u16 %r38, %r35;
cvt.u32.u8 %value, %r38;
The hidden benefit is that this sequence can (in theory) be optimized
by the RTL passes to eventually generate a much shorter sequence using
an and.pred instruction (just like Nvidia's nvcc compiler).
This patch has been tested on nvptx-none with a "make" and "make -k check"
(including newlib) hosted on x86_64-pc-linux-gnu with no new failures.
gcc/ChangeLog:
* config/nvptx/nvptx.h (STORE_FLAG_VALUE): Change to 1.
* config/nvptx/nvptx.md (movbi): Use P1 constraint for true.
(setcc_from_bi): Remove SImode specific pattern.
(setcc<mode>_from_bi): Provide more general HSDIM pattern.
(extendbi<mode>2, zeroextendbi<mode>2): Provide instructions
for sign- and zero-extending BImode predicates to integers.
(setcc_int<mode>): Remove previous (-1-based) instructions.
(cstorebi4): Remove BImode to SImode specific expander.
(cstore<mode>4): Fix indentation. Expand using setccsi_from_bi.
(cstore<mode>4): For both integer and floating point modes.
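The 0/-1 versus 0/1 distinction described in the nvptx STORE_FLAG_VALUE message above can be modelled in plain C. This is only a sketch of the idea, not backend code: the function names are hypothetical, and the constants 21 and 69 are taken from the example expression in the message.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of a STORE_FLAG_VALUE == -1 comparison: true is all-ones.  */
static int
cmp_eq_minus1 (int a, int b)
{
  return a == b ? -1 : 0;
}

/* Model of a STORE_FLAG_VALUE == 1 comparison: true is 1.  */
static int
cmp_eq_one (int a, int b)
{
  return a == b ? 1 : 0;
}

/* With -1 as true, each result is negated to obtain a 0/1 _Bool value
   before the logical AND.  */
static bool
both_equal_minus1 (int x, int y)
{
  int bx = -cmp_eq_minus1 (x, 21);
  int by = -cmp_eq_minus1 (y, 69);
  return bx & by;
}

/* With 1 as true, the comparison results feed the AND directly.  */
static bool
both_equal_one (int x, int y)
{
  return cmp_eq_one (x, 21) & cmp_eq_one (y, 69);
}

/* Both models must agree on the final truth value.  */
static void
check_both_models (void)
{
  assert (both_equal_minus1 (21, 69) == both_equal_one (21, 69));
  assert (both_equal_minus1 (21, 0) == both_equal_one (21, 0));
}
```

The second form has no negation between the comparison and the AND, which is the property the message expects later passes to be able to exploit.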
For VxWorks, replace an attempt at providing a posix API for
mkdir via macro by a varargs prototype, which works better for
C++ references like std::mkdir(arg1, arg2).
2021-12-16 Olivier Hainque <hainque@adacore.com>
fixincludes/
* inclhack.def (vxworks_posix_mkdir): Refine to expose a
varargs interface.
* tests/base/sys/stat.h: Update expected results.
* fixincl.x: Regenerate.
This change adjusts the processing of --sysroot to save the option in the
internal "switches" array, which lets self-specs test for it and provide a
default value possibly dependent on environment variables, as in
--with-specs=%{!-sysroot*:--sysroot=%:getenv("WIND_BASE" /target)}
2021-12-20 Olivier Hainque <hainque@adacore.com>
gcc/
* gcc.c (driver_handle_option): do_save --sysroot.
In the patch that implemented omp_get_device_num(), the stringification of
GOMP_DEVICE_NUM_VAR, the macro that expands to the actual symbol used, was
erroneously done with the STRINGX() macro in the libgomp offload image
symbol search. STRINGX() stringifies its argument without expanding it
first, so the variable name string was not expanded through the additional
layer of preprocessor symbol.
This patch fixes this by changing to properly use XSTRING(), also from
include/symcat.h.
libgomp/ChangeLog:
* plugin/plugin-gcn.c (GOMP_OFFLOAD_load_image): Change uses of STRINGX
into XSTRING when looking for GOMP_DEVICE_NUM_VAR in offload image.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Likewise.
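The difference between the two macros can be seen in a small standalone sketch. The two macro definitions are repeated here to mirror include/symcat.h so the example compiles on its own; DEVICE_NUM_VAR and example_device_num_symbol are hypothetical stand-ins for GOMP_DEVICE_NUM_VAR and the symbol it expands to.

```c
#include <assert.h>
#include <string.h>

/* Local copies mirroring include/symcat.h, so the sketch is
   self-contained.  */
#define STRINGX(s) #s
#define XSTRING(s) STRINGX(s)

/* Hypothetical stand-in for GOMP_DEVICE_NUM_VAR: a macro that expands to
   the real symbol name.  */
#define DEVICE_NUM_VAR example_device_num_symbol

/* STRINGX stringifies its argument before macro expansion...  */
static const char *
name_unexpanded (void)
{
  return STRINGX (DEVICE_NUM_VAR);
}

/* ...while XSTRING expands the argument first, then stringifies.  */
static const char *
name_expanded (void)
{
  return XSTRING (DEVICE_NUM_VAR);
}

/* Only the XSTRING form yields the name a symbol lookup would need.  */
static void
check_names (void)
{
  assert (strcmp (name_unexpanded (), "DEVICE_NUM_VAR") == 0);
  assert (strcmp (name_expanded (), "example_device_num_symbol") == 0);
}
```

A symbol search using the STRINGX form would look for the macro's own name, which never exists in the image.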
This generalizes the fix for PR103544 to also cover reductions that
are not reduction chains and does not consider reductions wrapped in
sign conversions for SLP reduction handling.
2022-01-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/103864
PR tree-optimization/103544
* tree-vect-slp.c (vect_analyze_slp_instance): Exclude
reductions wrapped in conversions from SLP handling.
(vect_analyze_slp): Revert PR103544 change.
* gcc.dg/vect/pr103864.c: New testcase.
On Thu, Dec 30, 2021 at 04:08:25AM -0600, Segher Boessenkool wrote:
> > The following simple patch makes sure we call can_get_prologue even after
> > the last former iteration when vec is already empty and only break from
> > the loop afterwards (and only if the updating of pro done because of
> > !can_get_prologue didn't push anything into vec again).
During the development of the above patch I've noticed that in many cases
we call can_get_prologue on the same pro again and again.
We can have many basic blocks pushed into vec, and if most of those don't
require pro updates, i.e.
basic_block bb = vec.pop ();
if (!can_dup_for_shrink_wrapping (bb, pro, max_grow_size))
while (!dominated_by_p (CDI_DOMINATORS, bb, pro))
isn't true, then can_get_prologue is called on pro for each bb in the vec.
The following simple patch just remembers which bb we've verified already
and verifies again only when pro changes. Most of the patch is just
reindentation.
2022-01-04 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/103860
* shrink-wrap.c (try_shrink_wrapping): Don't call can_get_prologue
uselessly for blocks for which it has been called already.
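The "remember what was already verified" idea from the shrink-wrap message above can be sketched generically as follows. This is not the code in try_shrink_wrapping: struct block, expensive_check, and process_items are hypothetical stand-ins for basic_block, can_get_prologue, and the worklist loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a basic block.  */
struct block { int id; };

/* Counts how often the expensive check actually runs.  */
static int expensive_calls;

/* Hypothetical stand-in for an expensive per-PRO validity check.  */
static bool
expensive_check (const struct block *pro)
{
  expensive_calls++;
  return pro->id >= 0;
}

/* Walk N work items.  PRO_FOR_ITEM[i] is the candidate prologue block in
   effect while item i is processed.  The check is re-run only when the
   candidate differs from the last one verified.  Returns false as soon as
   a candidate fails the check.  */
static bool
process_items (struct block *const *pro_for_item, size_t n)
{
  const struct block *checked_pro = NULL;

  for (size_t i = 0; i < n; i++)
    {
      const struct block *pro = pro_for_item[i];
      assert (pro != NULL);
      if (pro != checked_pro)
        {
          if (!expensive_check (pro))
            return false;
          checked_pro = pro;
        }
      /* ... per-item work that relies on PRO being valid ...  */
    }
  return true;
}
```

With five items that share two distinct candidates in sequence, the check runs twice instead of five times.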
After the PR90030 patch, which removes the universal casting of all Fortran
array pointers to 'c_char*', a Fortran descriptor based array passed into an
affinity() clause now looks like:
- #pragma omp task private(i) shared(b) affinity(*(c_char *) a.data)
+ #pragma omp task private(i) shared(b) affinity(*(integer(kind=4)[0:] * restrict) a.data)
The 'integer(kind=4)[0:]' incomplete type appears to be causing an ICE during
gimplify_expr() due to 'is_gimple_val, fb_rvalue'. The ICE appears to be fixed
just by adjusting to 'is_gimple_lvalue, fb_lvalue'. Considering the use of the
affinity() clause, which should be specifying the location of a particular
object in memory, this probably makes sense.
gcc/ChangeLog:
PR middle-end/103643
* gimplify.c (gimplify_omp_affinity): Adjust gimplify_expr of entire
OMP_CLAUSE_DECL to use 'is_gimple_lvalue, fb_lvalue'
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/pr103643.f90: New test.
This testcase was fixed by r12-1744-g3eecc1, which makes
sense since that commit fixed a few other class deduction issues.
So I thought I would add a testcase for this PR and close
it as fixed.
Committed after a quick test of the testcase.
PR c++/90782
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/class-deduction100.C: New test.
It would be nice to handle language-specific codes in the tree
pretty-printer, but until then we can at least indent them appropriately.
gcc/ChangeLog:
* tree-pretty-print.c (do_niy): Add spc parameter.
(NIY): Pass it.
(print_call_name): Add spc local variable.
I'm tired of seeing
cp/parser.c:15923:55: warning: misspelled term 'decl' in format; use 'declaration' instead [-Wformat-diag]
cp/parser.c:15925:57: warning: misspelled term 'decl' in format; use 'declaration' instead [-Wformat-diag]
every time I compile cp/parser.c, which happens...a lot. I'd like my
compilation to be free of warnings, otherwise I'm going to miss some
important ones.
"decl-specifiers" is a C++ grammar term; it is not actual code, so
should not be wrapped with %< %>. I hope we can accept it as an exception
in check_tokens.
It was surrounded by %< %> in cp_parser_decl_specifier_seq, so fix that.
In passing, fix a misspelling in missspellings.
PR c++/103758
gcc/c-family/ChangeLog:
* c-format.c (check_tokens): Accept "decl-specifier*".
gcc/cp/ChangeLog:
* parser.c (cp_parser_decl_specifier_seq): Replace %<decl-specifier%>
with %qD.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/constexpr-condition.C: Adjust dg-error.
The middle end tries to generate V4QImode moves to implement V2QImode inserts
and calls emit_move_multi_word when V4QImode moves are unavailable, as is
the case with 32-bit vector moves, which are constrained by TARGET_SSE2.
However, this triggers
gcc_assert (mode_size >= UNITS_PER_WORD);
in emit_move_multi_word, since mode_size of V4QImode operand is less than
UNITS_PER_WORD of 64-bit targets.
The patch unconditionally enables 32-bit vector moves to match 16-bit
vector moves. This also enables implementation of 32-bit vector logic
operations with GPR in a follow-up patch.
2022-01-03 Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog:
PR target/103894
* config/i386/mmx.md (mov<V_32:mode>): Remove TARGET_SSE2 constraint.
(mov<V_32:mode>_internal): Ditto.
(*push<V_32:mode>_rex64): Ditto.
(movmisalign<V_32:mode>): Ditto.
(*push<V_32:mode>_rex64 splitter): Enable for
TARGET_64BIT && TARGET_SSE.
(*push<V_32:mode>2): Remove insn pattern.
gcc/testsuite/ChangeLog:
PR target/103894
* gcc.target/i386/pr103894.c: New test.
While cleaning up the bug database, I noticed there was a request
to improve the documentation of the _Complex type extensions.
So I rewrote part of the documentation to make things clearer on
__real/__imag and even added documentation about casts between
the scalar and the complex type.
I moved the documentation of __builtin_complex under this section
too because it makes more sense than having it in the other
built-in section and reference it.
OK? Built make info and make html and checked out the results to
make sure the tables look decent.
gcc/ChangeLog:
PR c/33193
* doc/extend.texi: Extend the documentation about Complex
types for casting and also rewrite the __real__/__imag__
expression portion to use tables.
Move __builtin_complex to the Complex type section.
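For readers unfamiliar with the extensions the complex-type documentation message above refers to, here is a small C sketch of __real__/__imag__, casts between scalar and complex types, and __builtin_complex. The function names are hypothetical and the sketch assumes a compiler that supports these GNU extensions in C mode.

```c
#include <assert.h>

/* Build a complex value from parts with the GNU __real__/__imag__
   operators, which can be used as lvalues.  */
static _Complex double
make_complex (double re, double im)
{
  _Complex double z;
  __real__ z = re;
  __imag__ z = im;
  return z;
}

/* Casting a scalar to a complex type yields a zero imaginary part;
   casting a complex value to a scalar type keeps only the real part.  */
static void
check_casts (void)
{
  _Complex double z = make_complex (1.5, -2.0);
  assert (__real__ z == 1.5 && __imag__ z == -2.0);

  _Complex double from_scalar = (_Complex double) 3.0;
  assert (__real__ from_scalar == 3.0 && __imag__ from_scalar == 0.0);

  double back = (double) z;
  assert (back == 1.5);

  /* __builtin_complex builds the value directly from two parts.  */
  _Complex double w = __builtin_complex (1.5, -2.0);
  assert (__real__ w == __real__ z && __imag__ w == __imag__ z);
}
```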
The Fortran front end was generating invalid code for the array
copy-out after a call to a BIND(C) function for a dummy with the
CONTIGUOUS attribute when the actual argument was a call to the SHAPE
intrinsic or other array expressions that are not lvalues. It was
also generating code to evaluate the argument expression multiple
times on copy-in. This patch teaches it to recognize that a copy is
not needed in these cases.
2022-01-03 Sandra Loosemore <sandra@codesourcery.com>
PR fortran/103390
gcc/fortran/
* expr.c (gfc_is_simply_contiguous): Make it smarter about
function calls.
* trans-expr.c (gfc_conv_gfc_desc_to_cfi_desc): Do not generate
copy loops for array expressions that are not "variables" (lvalues).
gcc/testsuite/
* gfortran.dg/c-interop/pr103390-1.f90: New.
* gfortran.dg/c-interop/pr103390-2.f90: New.
* gfortran.dg/c-interop/pr103390-3.f90: New.
* gfortran.dg/c-interop/pr103390-4.f90: New.
* gfortran.dg/c-interop/pr103390-6.f90: New.
* gfortran.dg/c-interop/pr103390-7.f90: New.
* gfortran.dg/c-interop/pr103390-8.f90: New.
* gfortran.dg/c-interop/pr103390-9.f90: New.