I am returning a TDF_* flag to the queue of available entries as I am
unconvinced that we need to burn an entire flag for internal debugging
constructs, especially since we seem to be running out of them.
I've added a --param=threader-debug entry similar to the one we use for
ranger debugging. Currently this only affects the backward threader,
but since the DOM threader is an outlier and on the chopping block, I
avoided using the "backward" name.
Tested on x86-64 Linux.
gcc/ChangeLog:
* dumpfile.c (dump_options): Remove TDF_THREADING entry.
* dumpfile.h (enum dump_flag): Remove TDF_THREADING and adjust
remaining entries.
* flag-types.h (enum threader_debug): New.
* gimple-range-path.cc (DEBUG_SOLVER): Use param_threader_debug.
* params.opt: Add entry for --param=threader-debug=.
In this patch:
+ Add `armv9-a` to -march.
+ Update multilib with armv9-a and armv9-a+simd.
gcc/ChangeLog:
* config/arm/arm-cpus.in (armv9): New define.
(ARMv9a): New group.
(armv9-a): New arch definition.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm.h (BASE_ARCH_9A): New arch enum value.
* config/arm/t-aprofile: Added armv9-a and armv9+simd.
* config/arm/t-arm-elf: Added arm9-a, v9_fps and all_v9_archs
to MULTILIB_MATCHES.
* config/arm/t-multilib: Added v9_a_nosimd_variants and
v9_a_simd_variants to MULTILIB_MATCHES.
* doc/invoke.texi: Update docs.
gcc/testsuite/ChangeLog:
* gcc.target/arm/multilib.exp: Update test with armv9-a entries.
* lib/target-supports.exp (v9a): Add new armflag.
(__ARM_ARCH_9A__): Add new armdef.
My initial implementation of the method
ipa_param_body_adjustments::remap_with_debug_expressions was based on
the assumption that if it was asked to remap an expression (as opposed
to a simple SSA_NAME), the expression would not contain an SSA_NAME
operand which is to be debug-reset. While that is true for when
called from ipa_param_body_adjustments::prepare_debug_expressions, it
turns out it is not true when invoked from remap_gimple_stmt in
tree-inline.c. This patch adds a simple logic to handle such cases
and simply map the entire value to NULL_TREE in those cases.
gcc/ChangeLog:
2021-11-08 Martin Jambor <mjambor@suse.cz>
PR ipa/103132
* ipa-param-manipulation.c (replace_with_mapped_expr): Early
return with error_mark_mode when part of expression is mapped to
NULL.
(ipa_param_body_adjustments::remap_with_debug_expressions): Set
mapped value to NULL if walk_tree returns error_mark_mode.
gcc/testsuite/ChangeLog:
2021-11-08 Martin Jambor <mjambor@suse.cz>
PR ipa/103132
* gcc.dg/ipa/pr103132.c: New test.
gcc/ada/
* sem_ch4.adb (Analyze_Membership_Op) <Find_Interpretation>: Handle
both overloaded and non-overloaded cases.
<Try_One_Interp>: Do a reversed call to Covers if the outcome of the
call to Has_Compatible_Type is false.
Simplify implementation after change to Find_Interpretation.
(Analyze_User_Defined_Binary_Op): Be prepared for previous errors.
(Find_Comparison_Types) <Try_One_Interp>: Do a reversed call to
Covers if the outcome of the call to Has_Compatible_Type is false.
(Find_Equality_Types) <Try_One_Interp>: Likewise.
* sem_type.adb (Has_Compatible_Type): Remove the reversed calls to
Covers. Add explicit return on all paths.
gcc/ada/
* sprint.adb (Sprint_Node_Actual) <N_Allocator>: Also print the
Procedure_To_Call field if it is present.
<N_Extended_Return_Statement>: Also print the Storage_Pool and
Procedure_To_Call fields if they are present.
<N_Free_Statement>: Likewise.
<N_Simple_Return_Statement>: Likewise.
gcc/ada/
* strub.adb, strub.ads: New files.
* exp_attr.adb (Access_Cases): Copy strub mode to subprogram type.
* exp_disp.adb (Expand_Dispatching_Call): Likewise.
* freeze.adb (Check_Inherited_Conditions): Check that strub modes
match overridden subprograms and interfaces.
(Freeze_All): Renaming declarations too.
* sem_attr.adb (Resolve_Attribute): Reject 'Access to
strub-annotated data object.
* sem_ch3.adb (Derive_Subprogram): Copy strub mode to
inherited subprogram.
* sem_prag.adb (Analyze_Pragma): Propagate Strub Machine_Attribute
from access-to-subprogram to subprogram type when required,
but not from access-to-data to data type. Mark the entity that
got the pragma as having a gigi rep item.
* sem_res.adb (Resolve): Reject implicit conversions that
would change strub modes.
(Resolve_Type_Conversions): Reject checked conversions
between incompatible strub modes.
* doc/gnat_rm/security_hardening_features.rst: Update.
* gnat_rm.texi: Regenerate.
* libgnat/a-except.ads (Raise_Exception): Revert strub-callable
annotation in public subprogram.
* libgnat/s-arit128.ads (Multiply_With_Ovflo_Check128): Likewise.
* libgnat/s-arit64.ads (Multiply_With_Ovflo_Check64): Likewise.
* libgnat/s-secsta.ads (SS_Allocate): Likewise.
(SS_Mark, SS_Release): Likewise.
* gcc-interface/Make-lang.in (GNAT_ADA_OBJS): Add ada/strub.o.
gcc/ada/
* Makefile.rtl (ARM and Aarch64 VxWorks): Use atomic variants of
runtime units.
* libgnat/a-strunb__shared.ads: Mention AARCH64 and ARM as
supported.
* libgnat/s-atocou.ads: Likewise.
gcc/ada/
* vxworks7-cert-rtp-link.spec: Replace the definition of
__wrs_rtp_base with the base_link spec.
* vxworks7-cert-rtp-base-link.spec: Add base_link spec with
__wrs_rtp_base definition for all architectures.
* vxworks7-cert-rtp-base-link__ppc64.spec: Add base_link spec
with __wrs_rtp_base definition for ppc64.
* vxworks7-cert-rtp-base-link__x86.spec: Add base_link spec with
__wrs_rtp_base definition for x86.
* vxworks7-cert-rtp-base-link__x86_64.spec: Add base_link spec
with __wrs_rtp_base definition for x86_64.
gcc/ada/
* exp_ch8.adb (Build_Body_For_Renaming): Remove unnecessary
calls to Sloc; set Handled_Statement_Sequence when building
subprogram body; whitespace cleanup.
gcc/ada/
* libgnat/a-strunb.adb (Deallocate): Rename Reference_Copy to
Old, to make the code similar to other routines in this package.
(Realloc_For_Chunk): Use a temporary, deallocate the previous
string using a null-allowing copy of the string reference.
gcc/ada/
* sem_ch13.adb (Freeze_Entity_Checks): Analyze the expression of
a pragma Predicate associated with an aspect at the freeze point
of the type, to ensure that references to globals get saved when
the aspect occurs within a generic body. Also, add
Aspect_Static_Predicate to the choices of the membership test of
the enclosing guard.
gcc/ada/
* exp_ch4.adb (Arr_Attr): Refine type of the parameter from Int
to Pos; refine name of the parameter from Num to Dim; fix
reference to "Expr" in comment.
gcc/ada/
* libgnat/s-regexp.adb (Compile.Check_Well_Formed_Patern): When
a "|" operator is encountered in a pattern, check that it is not
the last character of the pattern.
gcc/ada/
* checks.adb (Apply_Constraint_Check): Guard against calling
Choices when the first association in an array aggregate is a
N_Iterated_Component_Association node.
gcc/ada/
* sem_prag.adb (Check_Usage): Guard against calling Usage_Error
with illegal Item_Id. The intention to do this was already
described in the comment but not implemented.
The following patch converts the strlen pass from evrp to ranger,
leaving DOM as the last remaining user.
No additional cleanups have been done. For example, the strlen pass
still has uses of VR_ANTI_RANGE, and the sprintf still passes around
pairs of integers instead of using a proper range. Fixing this
could further improve these passes.
Basically the entire patch is just adjusting the calls to range_of_expr
to include context. The previous context of si->stmt was mostly
empty, so not really useful ;-).
With ranger we are now able to remove the range calculation from
before_dom_children entirely. Just working with the ranger on-demand
catches all the strlen and sprintf testcases with the exception of
builtin-sprintf-warn-22.c which is due to a limitation of the sprintf
code. I have XFAILed the test and documented what the problem is.
On a positive note, these changes found two possible sprintf overflow
bugs in the C++ and Fortran front-ends which I have fixed below.
Tested on x86-64 Linux.
gcc/ChangeLog:
* tree-ssa-strlen.c (compare_nonzero_chars): Pass statement
context to ranger.
(get_addr_stridx): Same.
(get_stridx): Same.
(get_range_strlen_dynamic): Same.
(handle_builtin_strlen): Same.
(handle_builtin_strchr): Same.
(handle_builtin_strcpy): Same.
(maybe_diag_stxncpy_trunc): Same.
(handle_builtin_stxncpy_strncat): Same.
(handle_builtin_memcpy): Same.
(handle_builtin_strcat): Same.
(handle_alloc_call): Same.
(handle_builtin_memset): Same.
(handle_builtin_string_cmp): Same.
(handle_pointer_plus): Same.
(count_nonzero_bytes_addr): Same.
(count_nonzero_bytes): Same.
(handle_store): Same.
(fold_strstr_to_strncmp): Same.
(handle_integral_assign): Same.
(check_and_optimize_stmt): Same.
(class strlen_dom_walker): Replace evrp with ranger.
(strlen_dom_walker::before_dom_children): Remove evrp.
(strlen_dom_walker::after_dom_children): Remove evrp.
* gimple-ssa-warn-access.cc (maybe_check_access_sizes):
Restrict sprintf output.
gcc/cp/ChangeLog:
* ptree.c (cxx_print_xnode): Add more space to pfx array.
gcc/fortran/ChangeLog:
* misc.c (gfc_dummy_typename): Make sure ts->kind is
non-negative.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/builtin-sprintf-warn-22.c: XFAIL.
According to
https://gcc.gnu.org/legacy-ml/gcc-patches/2008-03/msg01698.html, the
TLS support, including the __tls_lookup function, was added to VxWorks
in 6.6.
It certainly doesn't exist on our VxWorks 5 platform, but the fallback
code in eh_globals.cc using __gthread_key_create() etc. used to work
just fine.
libstdc++-v3/ChangeLog:
* config/os/vxworks/os_defines.h (_GLIBCXX_HAVE_TLS): Only
define for VxWorks >= 6.6.
The first issue is that the !gotoff_operand path of legitimize_pic_address
in the large PIC model does not make use of REG when it is available, which
breaks for thunks because new pseudo-registers can no longer be created.
And the second issue is that the system compiler (LLVM) generates @GOTOFF
in large model even for RTP, so we do the same.
gcc/
* config/i386/i386.c (legitimize_pic_address): Adjust comment and
use the REG argument on the CM_LARGE_PIC code path as well.
* config/i386/predicates.md (gotoff_operand): Do not treat VxWorks
specially with the large code models.
When using rangers private callback mechanism to provide context
to fold_stmt calls, we are only suppose to be using the cache in read
only mode, never calculate new values.
gcc/
PR tree-optimization/103122
* gimple-range.cc (gimple_ranger::range_of_expr): Request the cache
entry with "calulate new values" set to false.
gcc/testsuite/
* g++.dg/pr103122.C: New.
For nested functions we output call to builtin_dwarf_cfa which
initializes frame entry used only for debugging. This however
prevents us from detecting functions containing nested functions
as const/pure or analyze side effects in modref.
builtin_dwarf_cfa is not documented and I wonder if it should be turned to
internal function. But I think we could consider functions using it const even
if in theory one can do things like test the return address and see the
difference between different frame addreses.
While doing so I also noticed that special_buitin_state handles quite few
builtins that are not special cased by ipa-modref. They do not make
user visible loads/stores and thus I think they shoul dbe annotated by
".c" to make this explicit for both modref and PTA.
Finally I aded dwarf_cfa and similar return_address to list of simple
bulitins since it compiles to simple stack frame load (and we consider
simple other builtins doing so).
* builtins.c (is_simple_builtin): Add builitin_dwarf_cfa
and builtin_return_address.
(builtin_fnspec): Annotate builtin_return,
bulitin_eh_pointer, builtin_eh_filter, builtin_unwind_resume,
builtin_cxa_end_cleanup, builtin_eh_copy_values,
builtin_frame_address, builtin_apply_args,
builtin_asan_before_dynamic_init, builtin_asan_after_dynamic_init,
builtin_prefetch, builtin_dwarf_cfa, builtin_return_addrss
as ".c"
* ipa-pure-const.c (special_builtin_state): Add builtin_dwarf_cfa
and builtin_return_address.
moveS uncprop after modref and pure/const pass and adds a comment that
this pass should alwasy be last since it is only supposed to help PHI lowering.
The pass replaces constant by SSA names that are known to be constant at the
place which hardly helps other passes.
gcc/ChangeLog:
PR tree-optimization/103177
* passes.def: Move uncprop after pure/const and modref.
My recent patch to improve debug experience when there are removed
parameters (by ipa-sra or ipa-split) was not careful to unshare the
expressions that were then put into debug statements, which manifests
itself as PR 103099. This patch adds unsharing them using
unshare_expr_without_location which is a bit more careful with stripping
locations than what we were doing manually and so also fixes PR 103107.
gcc/ChangeLog:
2021-11-08 Martin Jambor <mjambor@suse.cz>
PR ipa/103099
PR ipa/103107
* tree-inline.c (remap_gimple_stmt): Unshare the expression without
location before invoking remap_with_debug_expressions on it.
* ipa-param-manipulation.c
(ipa_param_body_adjustments::prepare_debug_expressions): Likewise.
gcc/testsuite/ChangeLog:
2021-11-08 Martin Jambor <mjambor@suse.cz>
PR ipa/103099
PR ipa/103107
* g++.dg/ipa/pr103099.C: New test.
* gcc.dg/ipa/pr103107.c: Likewise.
The vsx_splat_v4si_di pattern uses a Power8 and a Power9 instruction.
The final condition of TARGET_DIRECT_MODE_64BIT implicitly requires Power8.
The "we" constraint requires Power9, but also requires 64 bit. Because
the DImode pattern already requires 64 bit mode, this isn't horrible,
but it would be best to remove all uses of "we" constraint. The
mtvsrws instruction itself does not require 64 bit mode.
This patch reverts the previous change to fix the breakage.
gcc/ChangeLog:
* config/rs6000/vsx.md (vsx_splat_v4si_di): Revert "wa"
constraint to "we".
The sbitmap bitmap_{set,clear}_bit changes trigger spurious
uninit value use reportings from valgrind since we now
read the old value before setting/clearing a bit so
verify_loop_structures optimization to not clear the sbitmap is reported.
Fixed by using a temporary BB flag which should also be more
efficient in terms of cache re-use.
2021-11-08 Richard Biener <rguenther@suse.de>
* cfgloop.c (verify_loop_structure): Use a temporary BB flag
instead of an sbitmap to cache irreducible state.
The problem here is an ordering issue with a path that starts
with 19->3:
<bb 3> [local count: 916928331]:
# value_20 = PHI <value_17(19), value_7(D)(17)>
# n_27 = PHI <n_16(19), 1(17)>
n_16 = n_27 + 4;
value_17 = value_20 / 10000;
if (value_20 > 42949672959999)
goto <bb 19>; [89.00%]
else
goto <bb 4>; [11.00%]
The problem here is that both value_17 and value_20 are in the set of
imports we must pre-calculate. The value_17 name occurs first in the
bitmap, so we try to resolve it first, which causes us to recursively
solve the value_20 range. We do so correctly and put them both in the
cache. However, when we try to solve value_20 from the bitmap, we
ignore that it already has a cached entry and try to resolve the PHI
with the wrong value of value_17:
# value_20 = PHI <value_17(19), value_7(D)(17)>
The right thing to do is to avoid recalculating definitions already
solved.
Regstrapped and checked for # threads before and after on x86-64 Linux.
gcc/ChangeLog:
PR tree-optimization/103120
* gimple-range-path.cc (path_range_query::range_defined_in_block):
Bail if there's a cache entry.
gcc/testsuite/ChangeLog:
* gcc.dg/pr103120.c: New test.
There are a few leftover places where we use the old rs6000_builtins_decl
array, but we need to use rs6000_builtins_decl_x instead when the new
builtins infrastructure is in play.
2021-11-08 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
* config/rs6000/rs6000.c (rs6000_builtin_reciprocal): Use
rs6000_builtin_decls_x when appropriate.
(add_condition_to_bb): Likewise.
(rs6000_atomic_assign_expand_fenv): Likewise.