diagnostic.h has a couple of macros (diagnostic_last_module_changed
and diagnostic_set_last_module) which are only used within
diagnostic_report_current_module.
This patch eliminates the macros in favor of static functions within
diagnostic.c.
No functional change intended.
gcc/ChangeLog:
* diagnostic.c (last_module_changed_p): New function.
(set_last_module): New function.
(diagnostic_report_current_module): Convert macro usage to
the above functions.
* diagnostic.h (diagnostic_context::last_module): Strengthen
from const line_map * to const line_map_ordinary *.
(diagnostic_last_module_changed): Delete macro.
(diagnostic_set_last_module): Delete macro.
From-SVN: r247664
This patch simplifies diagnostic_report_diagnostic by moving
option-printing to a new subroutine.
Doing so required a slight rewrite. In both the old and new
code, context->option_name returns a malloc-ed string.
The old behavior was to then use ACONCAT to manipulate the
format_spec, appending the option metadata.
ACONCAT calcs the buffer size, then uses alloca, and then copies the
data to the on-stack buffer.
Given the alloca, this needs rewriting when moving the printing to
a subroutine. In the new version, the metadata is simply printed
using pp_* calls (so it's hitting the obstack within the
pretty_printer).
This means we can get rid of the save/restore of format_spec: I don't
believe anything else in the code modifies it.
It also seems inherently simpler; it seems odd to me to be
appending metadata to the formatting string, rather than simply
printing the metadata after the formatted string is printed
(the old code also assumed that no option name contained a '%').
No functional change intended.
gcc/ChangeLog:
* diagnostic.c (diagnostic_report_diagnostic): Eliminate
save/restor of format_spec. Move option-printing code to...
(print_option_information): ...this new function, and
reimplement by simply printing to the pretty_printer,
rather than appending to the format string.
From-SVN: r247661
This patch simplifies diagnostic_report_diagnostic by moving the
pragma-handling logic into a subroutine.
No functional change intended.
gcc/ChangeLog:
* diagnostic.c (diagnostic_report_diagnostic): Split out pragma
handling logic into...
(update_effective_level_from_pragmas): ...this new function.
From-SVN: r247660
The RISC-V user ISA permits misaligned accesses, but they may trap
and be emulated. That emulation software needs to be compiled assuming
strict alignment.
Even when strict alignment is not required, set SLOW_UNALIGNED_ACCESS
based upon -mtune to avoid a performance pitfall.
gcc/ChangeLog:
2017-05-04 Andrew Waterman <andrew@sifive.com>
* config/riscv/riscv.opt (mstrict-align): New option.
* config/riscv/riscv.h (STRICT_ALIGNMENT): Use it. Update comment.
(SLOW_UNALIGNED_ACCESS): Define.
(riscv_slow_unaligned_access): Declare.
* config/riscv/riscv.c (riscv_tune_info): Add slow_unaligned_access
field.
(riscv_slow_unaligned_access): New variable.
(rocket_tune_info): Set slow_unaligned_access to true.
(optimize_size_tune_info): Set slow_unaligned_access to false.
(riscv_cpu_info_table): Add entry for optimize_size_tune_info.
(riscv_valid_lo_sum_p): Use TARGET_STRICT_ALIGN.
(riscv_option_override): Set riscv_slow_unaligned_access.
* doc/invoke.texi: Add -mstrict-align to RISC-V.
From-SVN: r247659
[gcc]
2017-05-05 Michael Meissner <meissner@linux.vnet.ibm.com>
PR target/79038
PR target/79202
PR target/79203
* config/rs6000/rs6000.md (u code attribute): Add FIX and
UNSIGNED_FIX.
(extendsi<mode>2): Add support for doing sign extension via
VUPKHSW and XXPERMDI if the value is in Altivec registers and we
don't have ISA 3.0 instructions.
(extendsi<mode>2 splitter): Likewise.
(fix_trunc<mode>si2): If we are at ISA 2.07 (VSX small integer),
generate the normal insns since SImode can now go in vector
registers. Disallow the special UNSPECs needed for previous
machines to hide SImode being used. Add new insns
fctiw{,w}_<mode>_smallint if SImode can go in vector registers.
(fix_trunc<mode>si2_stfiwx): Likewise.
(fix_trunc<mode>si2_internal): Likewise.
(fixuns_trunc<mode>si2): Likewise.
(fixuns_trunc<mode>si2_stfiwx): Likewise.
(fctiw<u>z_<mode>_smallint): Likewise.
(fctiw<u>z_<mode>_mem): New combiner pattern to prevent conversion
of floating point to 32-bit integer from doing a direct move to
the GPR registers to do a store.
(fctiwz_<mode>): Break long line.
[gcc/testsuite]
2017-05-05 Michael Meissner <meissner@linux.vnet.ibm.com>
PR target/79038
PR target/79202
PR target/79203
* gcc.target/powerpc/ppc-round3.c: New test.
* gcc.target/powerpc/ppc-round2.c: Update expected code.
From-SVN: r247657
* cp-tree.h (IDENTIFIER_GLOBAL_VALUE): Use get_namespace_value.
(SET_IDENTIFIER_GLOBAL_VALUE): Use set_global_value.
(IDENTIFIER_NAMESPACE_VALUE): Delete.
* name-lookup.h (namespace_binding, set_namespace_binding):
Replace
with ...
(get_namespace_value, set_global_value): ... these.
(get_global_value_if_present, is_typename_at_global_scope):
Delete.
* decl.c (poplevel): Use get_namespace_value.
(grokdeclarator): Use IDENTIFIER_GLOBAL_VALUE.
* class.c (build_vtbl_initializer): Stash library decl in
static var. Use IDENTIFIER_GLOBAL_VALUE.
* except.c (do_get_exception_ptr, do_begin_catch, do_end_catch)
do_allocate_exception, do_free_exception, build_throw): Likewise.
* init.c (throw_bad_array_new_length): Likewise.
* rtti.c (throw_bad_cast, throw_bad_typeid): Likewise.
* name-lookup.c (arg_assoc_namespace, pushdecl_maybe_friend_1)
check_for_our_of_scope_variable, push_overloaded_decl_1): Use
get_namespace_value.
(set_namespace_binding_1): Rename to
(set_namespace_binding): ... here.
(set_global_value): New.
(lookup_name_innermost_nonclass_level_1, push_namespace): Use
get_namespace_value.
* pt.c (listify): Use get_namespace_value.
((--This line, and those below, will be ignored--
M cp/name-lookup.c
M cp/name-lookup.h
M cp/ChangeLog
M cp/except.c
M cp/class.c
M cp/pt.c
M cp/init.c
M cp/cp-tree.h
M cp/decl.c
M cp/rtti.c
From-SVN: r247654
2017-05-05 Steve Ellcey <sellcey@cavium.com>
* doc/invoke.texi (-fopt-info): Explicitly say order of options
included in -fopt-info does not matter.
* doc/optinfo.texi (-fopt-info): Fix description of default
behavour. Explicitly say order of options included in -fopt-info
does not matter.
From-SVN: r247648
2017-05-05 Thomas Preud'homme <thomas.preudhomme@arm.com>
gcc/
* config.gcc: Allow combinations of aprofile and rmprofile values for
--with-multilib-list.
* config/arm/t-multilib: New file.
* config/arm/t-aprofile: Remove initialization of MULTILIB_*
variables. Remove setting of ISA and floating-point ABI in
MULTILIB_OPTIONS and MULTILIB_DIRNAMES. Set architecture and FPU in
MULTI_ARCH_OPTS_A and MULTI_ARCH_DIRS_A rather than MULTILIB_OPTIONS
and MULTILIB_DIRNAMES respectively. Add comment to introduce all
matches. Add architecture matches for marvel-pj4 and generic-armv7-a
CPU options.
* config/arm/t-rmprofile: Likewise except for the matches changes.
* doc/install.texi (--with-multilib-list): Document the combination of
aprofile and rmprofile values and warn about pitfalls in doing that.
From-SVN: r247646
Float to int moves currently generate inefficient code due to
hacks used in the movsi and movdi patterns. The 'r = w' variant
uses '*' which tells the register allocator to ignore it.
As a result the float to int moves typically spill to the stack,
which is extremely inefficient.
gcc/
* config/aarch64/aarch64.md (movsi_aarch64): Remove '*' from r=w.
(movdi_aarch64): Likewise.
From-SVN: r247643
PR tree-optimization/80632
* tree-switch-conversion.c (struct switch_conv_info): Add target_vop
field.
(build_arrays): Initialize it for virtual phis.
(fix_phi_nodes): Use it for virtual phis.
* gcc.dg/pr80632.c: New test.
From-SVN: r247642
PR tree-optimization/80558
* tree-vrp.c (extract_range_from_binary_expr_1): Optimize
[x, y] op z into [x op, y op z] for op & or | if conditions
are met.
* gcc.dg/tree-ssa/vrp115.c: New test.
From-SVN: r247641
Code scheduling for Cortex-A53 isn't as good as it could be. It turns out
code runs faster overall if we place loads and stores with a dependency
closer together. To achieve this effect, this patch adds a bypass between
cortex_a53_load1 and cortex_a53_load*/cortex_a53_store* if the result of an
earlier load is used in an address calculation. This significantly improved
benchmark scores in a proprietary benchmark suite.
gcc/
* config/arm/aarch-common.c (arm_early_load_addr_dep_ptr):
New function.
(arm_early_store_addr_dep_ptr): Likewise.
* config/arm/aarch-common-protos.h
(arm_early_load_addr_dep_ptr): Add prototype.
(arm_early_store_addr_dep_ptr): Likewise.
* config/arm/cortex-a53.md: Add new bypasses.
From-SVN: r247631
* tree.c (next_type_uid): Change type to unsigned.
(type_hash_canon): Decrement back next_type_uid if
freeing a type node with the highest TYPE_UID. For INTEGER_TYPEs
also ggc_free TYPE_MIN_VALUE, TYPE_MAX_VALUE and TYPE_CACHED_VALUES
if possible.
From-SVN: r247628
Many supported cores use the AUTOPREFETCHER_WEAK setting which tries
to order loads and stores to improve streaming performance. Since significant
gains were reported in http://patchwork.ozlabs.org/patch/534469/ it seems
like a good idea to enable this setting too for -mcpu=generic. Since the
weak model only keeps the order if it doesn't make the schedule worse, it
should not impact performance adversely on cores that don't show a gain.
gcc/
* config/aarch64/aarch64.c (generic_tunings): Update prefetch model.
From-SVN: r247610
Set jump alignment to 4 for Cortex cores as it reduces codesize by 0.4% on
average with no obvious performance difference. See original discussion of
the overheads of various alignments:
https://gcc.gnu.org/ml/gcc-patches/2016-06/msg02075.html.
gcc/
* config/aarch64/aarch64.c (cortexa35_tunings): Set jump alignment to 4.
(cortexa53_tunings): Likewise.
(cortexa57_tunings): Likewise.
(cortexa72_tunings): Likewise.
(cortexa73_tunings): Likewise.
From-SVN: r247609
With -mcpu=generic the loop alignment is currently 4. All but one of the
supported cores use 8 or higher. Since using 8 provides performance gains
on several cores, it is best to use that by default. As discussed in [1],
the jump alignment has no effect on performance, yet has a relatively high
codesize cost [2], so setting it to 4 is best. This gives a 0.2% overall
codesize improvement as well as performance gains in several benchmarks.
gcc/
* config/aarch64/aarch64.c (generic_tunings): Set jump alignment to 4.
Set loop alignment to 8.
[1] https://gcc.gnu.org/ml/gcc-patches/2017-04/msg00574.html
[2] https://gcc.gnu.org/ml/gcc-patches/2016-06/msg02075.html
From-SVN: r247608
All cores which add a cpu_addrcost_table use a non-zero value for
HI and TI mode shifts (a non-zero value for general indexing also
applies to all shifts). Given this, it makes no sense to use a
different setting in generic_addrcost_table. So change it so that
all supported cores, including -mcpu=generic, now generate the same:
int f(short *p, short *q, long x) { return p[x] + q[x]; }
lsl x2, x2, 1
ldrsh w3, [x0, x2]
ldrsh w0, [x1, x2]
add w0, w3, w0
ret
gcc/
* config/aarch64/aarch64.c (generic_addrcost_table):
Change HI/TI mode setting.
From-SVN: r247606
2017-05-04 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/80622
* tree-sra.c (comes_initialized_p): New function.
(build_accesses_from_assign): Only set write lazily when
comes_initialized_p is false.
(analyze_access_subtree): Use comes_initialized_p.
(propagate_subaccesses_across_link): Assert !comes_initialized_p
instead of testing for PARM_DECL.
testsuite/
* gcc.dg/tree-ssa/pr80622.c: New test.
From-SVN: r247604
2017-05-04 Richard Biener <rguenther@suse.de>
* tree-ssa-alias.c (get_continuation_for_phi): Improve looking
for the last VUSE which def dominates the PHI. Directly call
maybe_skip_until.
(get_continuation_for_phi_1): Remove.
* gcc.dg/tree-ssa/ssa-fre-58.c: New testcase.
From-SVN: r247596
For the reasons explained in PR77536, niter_for_unrolled_loop assumes 5
iterations in the absence of profiling information, although it doesn't
increase beyond the estimate for the original loop. This left a hole in
which the new estimate could be less than the old one but still greater
than the limit imposed by CEIL (nb_iterations_upper_bound, unroll factor).
2017-05-04 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree-ssa-loop-manip.c (niter_for_unrolled_loop): Add commentary
to explain the use of truncating division. Cap the number of
iterations to the maximum given by nb_iterations_upper_bound,
if defined.
gcc/testsuite/
* gcc.dg/vect/vect-profile-1.c: New test.
From-SVN: r247591