These were provided by Dominik to check more of the corner cases in our
memset/memcpy inline code.
gcc/testsuite/ChangeLog:
2017-01-05 Dominik Vogt <vogt@linux.vnet.ibm.com>
* gcc.target/s390/memcpy-2.c: New test.
* gcc.target/s390/memset-2.c: New test.
From-SVN: r244099
See the memset unrolling patch. The very same applies to memcpys with
constant lengths.
2017-01-05 Andreas Krebbel <krebbel@linux.vnet.ibm.com>
* config/s390/s390.c (s390_expand_movmem): Unroll MVC loop for
small constant length operands.
gcc/testsuite/ChangeLog:
2017-01-05 Andreas Krebbel <krebbel@linux.vnet.ibm.com>
* gcc.target/s390/memcpy-1.c: New test.
From-SVN: r244098
When expanding a memset we emit a loop of MVC/XC instructions dealing
with 256 byte blocks. This loop used to get unrolled with older GCCs
when using constant length operands. GCC probably lost this ability
when more of the loop unrolling was moved to the tree level.
With this patch the unrolling is done manually when emitting the RTL
insns.
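As a rough illustration of the block splitting described above, here is a
minimal C model. It is not the code in s390_expand_setmem; the names
BLOCK_SIZE, unrolled_block_count and clear_in_blocks are hypothetical, and
each memset call merely stands in for one XC instruction.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model, not GCC code: one XC/MVC-style instruction is
   assumed to handle at most 256 bytes.  */
enum { BLOCK_SIZE = 256 };

/* Number of block instructions a fully unrolled expansion would need
   for a constant length LEN.  */
static size_t
unrolled_block_count (size_t len)
{
  return (len + BLOCK_SIZE - 1) / BLOCK_SIZE;
}

/* Clear LEN bytes one block at a time, the way an unrolled sequence
   would: full 256-byte blocks followed by one shorter tail block.  */
static void
clear_in_blocks (unsigned char *dst, size_t len)
{
  while (len > 0)
    {
      size_t chunk = len < BLOCK_SIZE ? len : BLOCK_SIZE;
      memset (dst, 0, chunk);   /* stands in for one XC instruction */
      dst += chunk;
      len -= chunk;
    }
}
```

For a constant length of 600 this model yields three block operations
(256 + 256 + 88 bytes) and no loop counter or branch.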
2017-01-05 Andreas Krebbel <krebbel@linux.vnet.ibm.com>
* gcc.target/s390/memset-1.c: New test.
gcc/ChangeLog:
2017-01-05 Andreas Krebbel <krebbel@linux.vnet.ibm.com>
* config/s390/s390.c (s390_expand_setmem): Unroll the loop for
small constant length operands.
From-SVN: r244097
A memset with a value != 0 is currently implemented using the mvc
instruction propagating the first byte through 256 byte blocks. While
for the first mvc the byte is written with a separate instruction,
subsequent MVCs use the last byte of the previous 256 byte block.
Starting with z13 this causes a major performance degradation. With
this patch we always set the first byte with an mvi or stc in order to
avoid the overlapping of the MVC operands between loop iterations.
On older machines this basically makes no measurable difference so the
patch enables the new behavior for all machine levels in order to make
sure that code built for older machine levels runs well when moved to
a z13.
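The per-block scheme can be sketched in C as follows. This is only a model
under stated assumptions: mvc_model imitates a strictly left-to-right byte
copy, the plain store stands in for mvi/stc, and the names mvc_model,
memset_blocks and BLOCK_SIZE are hypothetical rather than taken from
s390.c.

```c
#include <stddef.h>

enum { BLOCK_SIZE = 256 };

/* Model of an MVC-style copy: bytes are moved strictly one at a time
   from low to high addresses, so a destination that overlaps the
   source by one byte sees the byte that was just written.  */
static void
mvc_model (unsigned char *dst, const unsigned char *src, size_t len)
{
  for (size_t i = 0; i < len; i++)
    dst[i] = src[i];
}

/* New scheme: every block stores its own first byte and then
   propagates it within that block only, so no block reads a byte
   written by the previous block.  */
static void
memset_blocks (unsigned char *dst, unsigned char val, size_t len)
{
  while (len > 0)
    {
      size_t chunk = len < BLOCK_SIZE ? len : BLOCK_SIZE;
      dst[0] = val;                           /* mvi or stc */
      if (chunk > 1)
        mvc_model (dst + 1, dst, chunk - 1);  /* overlap inside block */
      dst += chunk;
      len -= chunk;
    }
}
```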
Bootstrapped and regression tested on s390 and s390x using z900 and z13
as default -march level. No regressions.
gcc/ChangeLog:
2017-01-05 Andreas Krebbel <krebbel@linux.vnet.ibm.com>
* config/s390/s390.c (s390_expand_setmem): Avoid overlapping bytes
between loop iterations.
From-SVN: r244096
PR tree-optimization/78812
* rtl.h (contains_mem_rtx_p): Prototype.
* ifcvt.c (contains_mem_rtx_p): Move from here to...
* rtlanal.c (contains_mem_rtx_p): Here and remove static linkage.
* gcse.c (prune_expressions): Use contains_mem_rtx_p to discover
and prune MEMs that are not at the toplevel of a SET_SRC rtx. Look
through ZERO_EXTEND and SIGN_EXTEND when trying to avoid pruning
MEMs.
PR tree-optimization/78812
* g++.dg/torture/pr78812.C: New test.
From-SVN: r244093
Building with the bootstrap-O3 configuration option fails to compile
input.c due to an AFAICT false-positive warning about an uninitialized
use of a variable.
This patch adds a default initializer to silence it.
for gcc/ChangeLog
* input.c (assert_char_at_range): Default-initialize
actual_range.
From-SVN: r244091
This patch fixes a false-positive warning in df-scan that made
bootstrap-O3 fail, and enables GCC to optimize out the code that leads
to the warning.
df_ref_create_structure was inlined into the else part of
df_ref_record. Due to the condition of the corresponding if, in the
else part VRP deduced unsigned regno >= FIRST_PSEUDO_REGISTER.
In df_ref_create_structure, there's another regno variable,
initialized with the same expression and value as the caller's. GCC
can tell as much, but this regno variable is signed. It is used,
shifted right, to index a hard regset bit array within a path that
tests that this signed regno < FIRST_PSEUDO_REGISTER.
GCC warned about the possible out-of-range indexing into the hard
regset array. It shouldn't; after all, the same regno can't possibly
be both < FIRST_PSEUDO_REGISTER and >= FIRST_PSEUDO_REGISTER, can it?
Well, the optimizers correctly decide it could, if it was a negative
int that, when converted to unsigned, became larger than
FIRST_PSEUDO_REGISTER. GCC doesn't know regno can't be negative,
so the test could not be optimized out. What's more, given the
constraints, VRP correctly concluded the hard regset array would
always be indexed by a value way outside the array index range.
This patch changes the inlined regno to unsigned, like the caller's,
so that we can now tell the conditions can't both hold, so we optimize
out the path containing the would-be out-of-range array indexing.
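The signed/unsigned mismatch can be reproduced in miniature. The sketch
below is not df-scan.c code: FIRST_PSEUDO_REGISTER_MODEL is a made-up
constant (the real value is target specific) and both function names are
hypothetical.

```c
#include <stdbool.h>

/* Hypothetical stand-in; the real FIRST_PSEUDO_REGISTER is target
   specific.  */
enum { FIRST_PSEUDO_REGISTER_MODEL = 64 };

/* Caller-side test, performed on an unsigned register number.  */
static bool
caller_sees_pseudo (unsigned int regno)
{
  return regno >= FIRST_PSEUDO_REGISTER_MODEL;
}

/* Callee-side test as it was before the patch, performed on a signed
   copy of the same value.  */
static bool
callee_sees_hard_reg_signed (int regno)
{
  return regno < FIRST_PSEUDO_REGISTER_MODEL;
}
```

For -1 both predicates hold at once: as an int it is below the limit, and
converted to unsigned int it is far above it. With both sides unsigned the
two tests are mutually exclusive, which is what lets the path be removed.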
for gcc/ChangeLog
* df-scan.c (df_ref_create_structure): Make regno unsigned,
to match the caller.
From-SVN: r244090
A debug insn after the final jump of a basic block may cause the
expander to emit a dummy move where the non-debug compile won't
because it finds the jump insn at the end of the insn stream.
Fix the condition so that, instead of requiring the jump as the last
insn, it also matches a jump followed by debug insns.
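A minimal sketch of the adjusted condition, using a hypothetical miniature
insn model rather than GCC's rtx representation (enum insn_kind, struct
insn and ends_in_jump_ignoring_debug are all invented for illustration):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature insn model, not GCC's rtx representation.  */
enum insn_kind { INSN_NORMAL, INSN_JUMP, INSN_DEBUG };

struct insn
{
  enum insn_kind kind;
  struct insn *prev;
};

/* True if the last non-debug insn, walking back from LAST, is a jump.
   Trailing debug insns are skipped so that the answer is the same with
   and without debug information.  */
static bool
ends_in_jump_ignoring_debug (const struct insn *last)
{
  while (last != NULL && last->kind == INSN_DEBUG)
    last = last->prev;
  return last != NULL && last->kind == INSN_JUMP;
}
```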
This fixes the compilation of libgcc/libgcov-profiler.c with
-fcompare-debug on i686-linux-gnu.
for gcc/ChangeLog
* cfgexpand.c (expand_gimple_basic_block): Disregard debug
insns after final jump in test to emit dummy move.
From-SVN: r244089
Various Ada RTS files failed -fcompare-debug compilation because debug
stmts prevented EH cleanups from taking place. Adjusting
cleanup_empty_eh to skip them fixes it.
for gcc/ChangeLog
* gimple-iterator.h (gsi_one_nondebug_before_end_p): New.
* tree-eh.c (cleanup_empty_eh): Skip more debug stmts.
From-SVN: r244088
Building with the bootstrap-O3 configuration option fails to compile
fortran/module.c due to an AFAICT false-positive warning about an
uninitialized use of a variable.
This patch adds a dummy initializer to silence it.
for gcc/fortran/ChangeLog
* module.c (load_omp_udrs): Initialize name.
From-SVN: r244087
Building with the bootstrap-O1 configuration option fails to compile a
number of files due to AFAICT false-positive warnings about uses of
uninitialized variables.
This patch adds dummy initializers to silence them all.
for gcc/ChangeLog
* multiple_target.c (create_dispatcher_calls): Init e_next.
* tree-ssa-loop-split.c (split_loop): Init border.
* tree-vect-loop.c (vect_determine_vectorization_factor): Init
scalar_type.
From-SVN: r244086
[gcc]
2017-01-04 Michael Meissner <meissner@linux.vnet.ibm.com>
PR target/71977
PR target/70568
PR target/78823
* config/rs6000/predicates.md (sf_subreg_operand): New predicate.
(altivec_register_operand): Do not return true if the operand
contains a SUBREG mixing SImode and SFmode.
(vsx_register_operand): Likewise.
(vsx_reg_sfsubreg_ok): New predicate.
(vfloat_operand): Do not return true if the operand contains a
SUBREG mixing SImode and SFmode.
(vint_operand): Likewise.
(vlogical_operand): Likewise.
(gpc_reg_operand): Likewise.
(int_reg_operand): Likewise.
* config/rs6000/rs6000-protos.h (valid_sf_si_move): Add
declaration.
* config/rs6000/rs6000.c (valid_sf_si_move): New function to
determine if a MOVSI or MOVSF operation contains SUBREGs that mix
SImode and SFmode.
(rs6000_emit_move_si_sf_subreg): New helper function.
(rs6000_emit_move): Call rs6000_emit_move_si_sf_subreg to possibly
fixup SUBREGs involving SImode and SFmode.
* config/rs6000/vsx.md (SFBOOL_*): New constants that are operand
numbers for the new peephole2 optimization.
(peephole2 for SFmode unions): New peephole2 to optimize cases in
the GLIBC math library that do AND/IOR/XOR operations on single
precision floating point.
* config/rs6000/rs6000.h (TARGET_NO_SF_SUBREG): New internal
target macros to say whether we need to avoid SUBREGs mixing
SImode and SFmode.
(TARGET_ALLOW_SF_SUBREG): Likewise.
* config/rs6000/rs6000.md (UNSPEC_SF_FROM_SI): New unspecs.
(UNSPEC_SI_FROM_SF): Likewise.
(iorxor): Change spacing.
(and_ior_xor): New iterator for AND, IOR, and XOR.
(movsi_from_sf): New insns for SImode/SFmode SUBREG support.
(movdi_from_sf_zero_ext): Likewise.
(mov<mode>_hardfloat, FMOVE32 iterator): Use register_operand
instead of gpc_reg_operand. Add SImode/SFmode SUBREG support.
(movsf_from_si): New insn for SImode/SFmode SUBREG support.
(fma<mode>4): Use gpc_reg_operand instead of register_operand.
(fms<mode>4): Likewise.
(fnma<mode>4): Likewise.
(fnms<mode>4): Likewise.
(nfma<mode>4): Likewise.
(nfms<mode>4): Likewise.
[gcc/testsuite]
2017-01-04 Michael Meissner <meissner@linux.vnet.ibm.com>
PR target/71977
PR target/70568
PR target/78823
* gcc.target/powerpc/pr71977-1.c: New tests to check whether, on
64-bit VSX systems with direct move, we optimize common
code sequences in the GLIBC math library for float math functions.
* gcc.target/powerpc/pr71977-2.c: Likewise.
From-SVN: r244084
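The kind of source pattern the rs6000 entry above refers to, an AND on the
bit pattern of a single precision value, might look like the sketch below.
It is an assumed example, not taken from GLIBC or from the new tests; the
function name clear_sign_bit is hypothetical and the code assumes float is
IEEE 754 binary32.

```c
#include <stdint.h>

/* Clear the sign bit of F by viewing its bits as a 32-bit integer
   through a union and applying an integer AND.  Assumes float is an
   IEEE 754 binary32 value.  */
static float
clear_sign_bit (float f)
{
  union { float f; uint32_t u; } pun;
  pun.f = f;
  pun.u &= UINT32_C (0x7fffffff);
  return pun.f;
}
```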
PR c++/64767
* c.opt (Wpointer-compare): New option.
* c-parser.c (c_parser_postfix_expression): Mark zero character
constants by setting original_type in c_expr.
* c-typeck.c (parser_build_binary_op): Warn when a pointer is compared
with a zero character constant.
(char_type_p): New function.
* typeck.c (cp_build_binary_op): Warn when a pointer is compared with
a zero character literal.
* doc/invoke.texi: Document -Wpointer-compare.
* c-c++-common/Wpointer-compare-1.c: New test.
From-SVN: r244076
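For the -Wpointer-compare entry above, the sketch below shows the check
that was most likely intended in code that compares a pointer with a zero
character constant. The function name is_empty_string is hypothetical, and
the problematic form appears only in the comment.

```c
#include <stdbool.h>
#include <stddef.h>

/* Intended check: is the string empty?  Writing `s == '\0'` instead
   would compare the pointer itself with a zero character constant,
   which is the pattern the new warning is meant to report.  */
static bool
is_empty_string (const char *s)
{
  return s != NULL && *s == '\0';
}
```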
PR c++/78949
* typeck.c (cp_build_unary_op): Call mark_rvalue_use on arg if it has
vector type.
* c-c++-common/Wunused-var-16.c: New test.
From-SVN: r244075
PR c++/78693
* parser.c (cp_parser_simple_declaration): Only complain about
inconsistent auto deduction if auto_result doesn't use auto.
* g++.dg/cpp0x/pr78693.C: New test.
From-SVN: r244074
* parser.c (cp_parser_simple_declaration): Diagnose function
declaration among more than one init-declarators with auto
specifier.
* g++.dg/cpp1y/auto-fn34.C: New test.
From-SVN: r244071
PR c++/71182
* parser.c (cp_lexer_previous_token): Use vec_safe_address in the
assertion, as lexer->buffer may be NULL.
* g++.dg/cpp0x/pr71182.C: New test.
From-SVN: r244070
* dwarf2out.c (output_loc_list): Don't throw away 64K+ location
descriptions for -gdwarf-5 and emit them as uleb128 instead of
2-byte data.
From-SVN: r244069
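For reference, ULEB128 is a variable-length encoding that stores seven
value bits per byte, so a length of 64K or more no longer has to fit in two
bytes. The encoder below is a generic sketch of that encoding, not the code
used in dwarf2out.c; encode_uleb128 is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode VALUE as ULEB128 into OUT, which must have room for at least
   10 bytes to hold any 64-bit value.  Returns the number of bytes
   written.  */
static size_t
encode_uleb128 (uint64_t value, unsigned char *out)
{
  size_t n = 0;
  do
    {
      unsigned char byte = value & 0x7f;   /* low seven bits */
      value >>= 7;
      if (value != 0)
        byte |= 0x80;                      /* more bytes follow */
      out[n++] = byte;
    }
  while (value != 0);
  return n;
}
```

A length of 70000, which does not fit in 2-byte data, encodes as the three
bytes 0xf0 0xa2 0x04.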
gcc/testsuite/ChangeLog:
2017-01-04 Kelvin Nilsen <kelvin@gcc.gnu.org>
PR target/78056
* gcc.target/powerpc/pr78056-1.c: New test.
* gcc.target/powerpc/pr78056-2.c: New test.
* gcc.target/powerpc/pr78056-3.c: New test.
* gcc.target/powerpc/pr78056-4.c: New test.
* gcc.target/powerpc/pr78056-5.c: New test.
* gcc.target/powerpc/pr78056-6.c: New test.
* gcc.target/powerpc/pr78056-7.c: New test.
* gcc.target/powerpc/pr78056-8.c: New test.
* lib/target-supports.exp
(check_effective_target_powerpc_popcntb_ok): New procedure to test
whether the effective target supports the popcntb instruction.
gcc/ChangeLog:
2017-01-04 Kelvin Nilsen <kelvin@gcc.gnu.org>
PR target/78056
* doc/sourcebuild.texi (PowerPC-specific attributes): Add
documentation of the powerpc_popcntb_ok attribute.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Add
code to issue warning messages if a requested CPU configuration is
not supported by the binary (assembler and loader) toolchain.
(spe_init_builtins): Add two assertions to prevent ICE if attempt is
made to define a built-in function that has been disabled.
(paired_init_builtins): Add assertion to prevent ICE if attempt is
made to define a built-in function that has been disabled.
(altivec_init_builtins): Add comment explaining why definition
of the DST built-in functions is not preceded by an assertion
check. Add assertions to prevent ICE if attempts are made to
define an altivec predicate or an abs* built-in function that has
been disabled.
(htm_init_builtins): Add comment explaining why definition of the
htm built-in functions is not preceded by an assertion check.
From-SVN: r244068
PR tree-optimization/67955
* tree-ssa-alias.c (same_addr_size_stores_p): Check offsets first.
Allow any SSA_VAR_P as the base objects. Use integer_zerop. Verify
the points-to solution does not include pt_null. Use DECL_PT_UID
unconditionally.
PR tree-optimization/67955
* gcc.dg/tree-ssa/ssa-dse-28.c: New test.
From-SVN: r244067
gcc/c/ChangeLog:
* c-parser.c (c_parser_declaration_or_fndef): Create a
rich_location at init_loc and pass it to start_init.
(last_init_list_comma): New global.
(c_parser_braced_init): Update last_init_list_comma when parsing
commas. Pass it to pop_init_level. Pass location of closing
brace to pop_init_level.
(c_parser_postfix_expression_after_paren_type): Create a
rich_location at type_loc and pass it to start_init.
(c_parser_omp_declare_reduction): Likewise for loc.
* c-tree.h (start_init): Add rich_location * param.
(pop_init_level): Add location_t param.
* c-typeck.c (struct initializer_stack): Add field
"missing_brace_richloc".
(start_init): Add richloc param, use it to initialize
the stack node's missing_brace_richloc.
(last_init_list_comma): New decl.
(finish_implicit_inits): Pass last_init_list_comma to
pop_init_level.
(push_init_level): When finding missing open braces, add fix-it
hints to the richloc.
(pop_init_level): Add "insert_before" param and pass it
when calling pop_init_level. Add fixits about missing
close braces to any richloc. Use the richloc for the
-Wmissing-braces warning.
(set_designator): Pass last_init_list_comma to pop_init_level.
(process_init_element): Likewise.
gcc/testsuite/ChangeLog:
* gcc.dg/Wmissing-braces-fixits.c: New test case.
From-SVN: r244061
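As an illustration of what -Wmissing-braces is about, here is an assumed
example with hypothetical types; it is not the content of
Wmissing-braces-fixits.c. The fully braced initializer is the form the
fix-it hints point toward.

```c
struct point { int x, y; };
struct segment { struct point a, b; };

/* Fully braced form.  Writing `{ 1, 2, 3, 4 }` initializes the same
   members but omits the inner braces around each struct point, which
   is the situation -Wmissing-braces reports.  */
static const struct segment seg = { { 1, 2 }, { 3, 4 } };
```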
The MIPS sfp-machine.h has an _FP_CHOOSENAN implementation which
emulates hardware semantics of not preserving signaling NaN payloads
for an operation with two NaN arguments (although that doesn't suffice
to avoid sNaN payload preservation in any case with just one NaN
argument).
However, those are only hardware semantics in the legacy NaN case; in
the NAN2008 case, the architecture documentation says hardware
preserves payloads in such cases. Furthermore, this implementation
assumes legacy NaN semantics, so in the NAN2008 case the
implementation actually has the effect of preserving sNaN payloads but
not preserving qNaN payloads, when both should be preserved.
This patch fixes the code just to copy from the first argument (at the
level of libgcc, it's not meaningful which argument is the first and
which is the second).
Tested for mips64-linux-gnu (soft float, NAN2008) with the glibc math/
tests.
* config/mips/sfp-machine.h (_FP_CHOOSENAN): Always preserve NaN
payload if [__mips_nan2008].
From-SVN: r244059
include/
* dwarf2.def (DW_OP_AARCH64_operation): Reserve the number 0xea.
(DW_CFA_GNU_window_save): Comment on the multiplexing on AArch64.
Co-Authored-By: Jiong Wang <jiong.wang@arm.com>
From-SVN: r244055
PR tree-optimization/71563
* match.pd: Simplify X << Y into X if Y is known to be 0 or an
out of range value, i.e. Y has its low bits known to be zero.
* gcc.dg/tree-ssa/pr71563.c: New test.
From-SVN: r244050
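The reasoning behind the match.pd entry above can be sketched in C. This is
an assumed example, not the pr71563.c test: the name shift_defined_only is
hypothetical, and the explicit out-of-range branch is a modelling choice so
that the code never performs an undefined shift.

```c
#include <limits.h>

/* The shift count has its low bits masked off, so it is either 0 or at
   least the width of unsigned int.  An out-of-range count is undefined
   for a real shift, which leaves 0 as the only defined case, and
   x << 0 is x.  */
static unsigned int
shift_defined_only (unsigned int x, unsigned int y)
{
  unsigned int width = sizeof (unsigned int) * CHAR_BIT;
  unsigned int count = y & ~(width - 1);   /* low bits known zero */
  if (count >= width)
    return x;           /* out of range: modelled here as "no shift" */
  return x << count;    /* only reachable with count == 0 */
}
```

Whatever y is, the function returns x unchanged, which is the result the
simplification produces directly.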
Also fix a stray changelog entry. Some of the regen here is due to
previous changes not being regenerated properly, in part due to the
missing configure dependencies.
* configure: Regenerate.
config/
* picflag.m4: Remove stray \xA0 in comment.
gcc/
* Makefile.in (aclocal_deps): Update and order as per aclocal.m4.
* configure: Regenerate.
* config.in: Regenerate.
libada/
* Makefile.in (configure_deps): Update and order as per
configure.ac sinclude.
* configure: Regenerate.
libgcc/
* Makefile.in (configure_deps): Update.
* configure: Regenerate.
libiberty/
* Makefile.in (configure_deps): Update.
* configure: Regenerate.
libitm/
* Makefile.in: Regenerate.
* testsuite/Makefile.in: Regenerate.
From-SVN: r244049
As r244011 had to be reverted, this change adds back the testcase
changes that are needed due to r244003.
2017-01-04 Janne Blomqvist <jb@gcc.gnu.org>
PR fortran/78534
PR fortran/78976
* gfortran.dg/dependency_49.f90: Change scan-tree-dump-times
due to gfc_trans_string_copy change to avoid -Wstringop-overflow.
* gfortran.dg/transfer_intrinsic_1.f90: Change
scan-tree-dump-times due to gfc_trans_string_copy change to
avoid -Wstringop-overflow.
From-SVN: r244048
PR bootstrap/77569
* input.c (ebcdic_execution_charset::on_error): Don't use strstr for
a substring of the message, but strcmp with the whole message. Ifdef
ENABLE_NLS, translate the message first using dgettext.
From-SVN: r244047
PR tree-optimization/78856
* tree-ssa-threadupdate.c: Include tree-vectorizer.h.
(mark_threaded_blocks): Remove code to truncate thread paths that
cross multiple loop headers. Instead invalidate the cached loop
iteration information and handle case of a thread path walking
into an irreducible region.
PR tree-optimization/78856
* gcc.c-torture/execute/pr78856.c: New test.
From-SVN: r244045
[gcc]
2016-12-30 Michael Meissner <meissner@linux.vnet.ibm.com>
PR target/78900
* config/rs6000/rs6000.c (rs6000_split_signbit): Change some
assertions. Add support for doing the signbit if the IEEE 128-bit
floating point value is in a GPR.
* config/rs6000/rs6000.md (Fsignbit): Delete.
(signbit<mode>2_dm): Delete using <Fsignbit> and just use "wa".
Update the length attribute if the value is in a GPR.
(signbit<mode>2_dm_<su>ext): Add combiner pattern to eliminate
the sign or zero extension instruction, since the value is always
0/1.
(signbit<mode>2_dm2): Delete using <Fsignbit>.
2017-01-03 Michael Meissner <meissner@linux.vnet.ibm.com>
PR target/78953
* config/rs6000/vsx.md (vsx_extract_<mode>_store_p9): If we are
extracting SImode to a GPR register so that we can generate a
store, limit the vector to be in a traditional Altivec register
for the vextuwrx instruction.
[gcc/testsuite]
2017-01-03 Michael Meissner <meissner@linux.vnet.ibm.com>
PR target/78953
* gcc.target/powerpc/pr78953.c: New test.
From-SVN: r244044
gcc/:
* godump.c (go_format_type): Treat ENUMERAL_TYPE like
INTEGER_TYPE.
gcc/testsuite/:
* gcc.misc-tests/godump-1.c: Update for accurate representation of
enums.
From-SVN: r244041