introduce overridable clear_cache emitter

This patch introduces maybe_emit_call_builtin___clear_cache for the
builtin expander machinery and the trampoline initializers to use to
clear the instruction cache, removing a source of inconsistencies and
subtle errors in low-level machinery.

I've adjusted all trampoline_init implementations that used to issue
explicit calls to __clear_cache or similar to use this new primitive.


Specifically on vxworks targets, we needed to drop the __clear_cache
symbol in libgcc, for reasons related to linking that I didn't need
to understand, and we wanted to call cacheTextUpdate directly, despite
the different calling conventions: the second argument is a length
rather than the end address.
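
To illustrate the calling-convention difference, here is a minimal
standalone sketch.  It is not part of the patch: mock_cacheTextUpdate
and clear_cache_range are hypothetical names, and the mock only
records its arguments rather than touching any cache.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for VxWorks' cacheTextUpdate (void *, size_t).
   It only records what it was called with.  */
void *last_addr;
size_t last_len;

int
mock_cacheTextUpdate (void *addr, size_t len)
{
  last_addr = addr;
  last_len = len;
  return 0;
}

/* __clear_cache-style entry point: takes BEGIN and END, and converts
   the pair into the (address, length) form the OS routine expects.  */
void
clear_cache_range (char *begin, char *end)
{
  assert (end >= begin);
  mock_cacheTextUpdate (begin, (size_t) (end - begin));
}
```

The only real work is the END - BEGIN subtraction, which is what the
vxworks override has to emit as RTL before issuing the call.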

So I introduced a target hook to enable target OS-level overriding of
builtin __clear_cache call emission, retaining nearly (*) the same
logic to govern the decision on whether to emit a call (or nothing, or
a machine-dependent insn) but enabling a call to a target
system-defined function with different calling conventions to be
issued, without having to modify .md files of the various
architectures supported by the target system to introduce or modify
clear_cache insns.
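
The decision logic can be modeled roughly as below.  This is a
simplified standalone sketch under stated assumptions, not GCC code:
struct target_model, its fields, and the emit functions are all
hypothetical, and the model ignores the case where insn expansion
fails and falls back to the call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a target; none of these names are GCC's.  */
typedef void (*emit_call_fn) (const char *begin, const char *end);

struct target_model
{
  bool have_clear_cache_insn;  /* Models a clear_cache insn in the .md.  */
  bool libgcc_clears_cache;    /* Models CLEAR_INSN_CACHE being defined.  */
  emit_call_fn emit_call;      /* Models the overridable hook.  */
};

enum emitted { EMIT_NOTHING, EMIT_INSN, EMIT_CALL };

int default_calls, os_calls;
size_t os_len;

/* Models the default hook: call with (begin, end).  */
void
default_emit_call (const char *begin, const char *end)
{
  (void) begin;
  (void) end;
  default_calls++;
}

/* Models an OS override: call with (begin, length).  */
void
os_emit_call (const char *begin, const char *end)
{
  os_calls++;
  os_len = (size_t) (end - begin);
}

/* Prefer the insn; otherwise call through the hook, but only if the
   target's library routine would actually do something.  */
enum emitted
maybe_emit_clear_cache (const struct target_model *t,
                        const char *begin, const char *end)
{
  if (t->have_clear_cache_insn)
    return EMIT_INSN;
  if (!t->libgcc_clears_cache)
    return EMIT_NOTHING;
  assert (t->emit_call != NULL);
  t->emit_call (begin, end);
  return EMIT_CALL;
}
```

The point of the indirection is that the choice between insn, call,
and nothing stays in one place, while only the call itself is
replaceable per target OS.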

(*) I write "nearly" mainly because, when not optimizing, we'd issue a
call regardless, but since the call may now be overridden, I added it
to the set of builtins that are not directly turned into calls when
not optimizing, following the normal expansion path instead.  It
wouldn't be hard to skip the emission of cache-clearing insns when not
optimizing, but it didn't seem very important, especially for the new
uses from trampoline init.

Another difference that might be relevant is that now we expand
the begin and end arguments unconditionally.  This might make a
difference if they have side effects.  That's pretty much impossible
at expand time, but I thought I'd mention it.


I have NOT modified targets that did not issue cache-clearing calls in
trampoline init to use the new clear_cache-calling infrastructure,
even if it would expand to nothing.  I considered doing so, to have
__builtin___clear_cache and trampoline init call cacheTextUpdate on
all vxworks targets, but decided not to: on targets that don't do any
cache clearing, cacheTextUpdate ought to be a no-op.  That said,
rs6000 seems to use icbi and dcbf instructions in the function called
to initialize a trampoline, but AFAICT not in the __clear_cache
builtin.  Hopefully target maintainers will have a look and take
advantage of this new piece of infrastructure to remove such
(apparent?) inconsistencies.  Not rs6000 and others that call
asm-coded trampoline setup instructions, for sure, but they might wish
to introduce a CLEAR_INSN_CACHE macro or a clear_cache expander if
they don't have one.


for  gcc/ChangeLog

	* builtins.c (default_emit_call_builtin___clear_cache): New.
	(maybe_emit_call_builtin___clear_cache): New.
	(expand_builtin___clear_cache): Split into the above.
	(expand_builtin): Do not issue clear_cache call any more.
	* builtins.h (maybe_emit_call_builtin___clear_cache): Declare.
	* config/aarch64/aarch64.c (aarch64_trampoline_init): Use
	maybe_emit_call_builtin___clear_cache.
	* config/arc/arc.c (arc_trampoline_init): Likewise.
	* config/arm/arm.c (arm_trampoline_init): Likewise.
	* config/c6x/c6x.c (c6x_initialize_trampoline): Likewise.
	* config/csky/csky.c (csky_trampoline_init): Likewise.
	* config/m68k/linux.h (FINALIZE_TRAMPOLINE): Likewise.
	* config/tilegx/tilegx.c (tilegx_trampoline_init): Likewise.
	* config/tilepro/tilepro.c (tilepro_trampoline_init): Ditto.
	* config/vxworks.c: Include rtl.h, memmodel.h, and optabs.h.
	(vxworks_emit_call_builtin___clear_cache): New.
	* config/vxworks.h (CLEAR_INSN_CACHE): Drop.
	(TARGET_EMIT_CALL_BUILTIN___CLEAR_CACHE): Define.
	* target.def (trampoline_init): In the documentation, refer to
	maybe_emit_call_builtin___clear_cache.
	(emit_call_builtin___clear_cache): New.
	* doc/tm.texi.in: Add new hook point.
	(CLEAR_CACHE_INSN): Remove duplicate 'both'.
	* doc/tm.texi: Rebuilt.
	* targhooks.h (default_emit_call_builtin___clear_cache):
	Declare.
	* tree.h (BUILTIN_ASM_NAME_PTR): New.

for  libgcc/ChangeLog

	* config/t-vxworks (LIB2ADD): Drop.
	* config/t-vxworks7 (LIB2ADD): Likewise.
	* config/vxcache.c: Remove.
Alexandre Oliva 2020-12-02 22:10:32 -03:00 committed by Alexandre Oliva
parent 93d883c773
commit c05ece92c6
20 changed files with 159 additions and 103 deletions

@ -7770,26 +7770,63 @@ expand_builtin_copysign (tree exp, rtx target, rtx subtarget)
return expand_copysign (op0, op1, target);
}
/* Expand a call to __builtin___clear_cache. */
/* Emit a call to __builtin___clear_cache. */
static rtx
expand_builtin___clear_cache (tree exp)
void
default_emit_call_builtin___clear_cache (rtx begin, rtx end)
{
if (!targetm.code_for_clear_cache)
rtx callee = gen_rtx_SYMBOL_REF (Pmode,
BUILTIN_ASM_NAME_PTR
(BUILT_IN_CLEAR_CACHE));
emit_library_call (callee,
LCT_NORMAL, VOIDmode,
begin, ptr_mode,
end, ptr_mode);
}
/* Emit a call to __builtin___clear_cache, unless the target specifies
it as do-nothing. This function can be used by trampoline
finalizers to duplicate the effects of expanding a call to the
clear_cache builtin. */
void
maybe_emit_call_builtin___clear_cache (rtx begin, rtx end)
{
if (GET_MODE (begin) != ptr_mode || GET_MODE (end) != ptr_mode)
{
#ifdef CLEAR_INSN_CACHE
/* There is no "clear_cache" insn, and __clear_cache() in libgcc
does something. Just do the default expansion to a call to
__clear_cache(). */
return NULL_RTX;
#else
error ("both arguments to %<__builtin___clear_cache%> must be pointers");
return;
}
if (targetm.have_clear_cache ())
{
/* We have a "clear_cache" insn, and it will handle everything. */
class expand_operand ops[2];
create_address_operand (&ops[0], begin);
create_address_operand (&ops[1], end);
if (maybe_expand_insn (targetm.code_for_clear_cache, 2, ops))
return;
}
else
{
#ifndef CLEAR_INSN_CACHE
/* There is no "clear_cache" insn, and __clear_cache() in libgcc
does nothing. There is no need to call it. Do nothing. */
return const0_rtx;
return;
#endif /* CLEAR_INSN_CACHE */
}
/* We have a "clear_cache" insn, and it will handle everything. */
targetm.calls.emit_call_builtin___clear_cache (begin, end);
}
/* Expand a call to __builtin___clear_cache. */
static void
expand_builtin___clear_cache (tree exp)
{
tree begin, end;
rtx begin_rtx, end_rtx;
@ -7799,25 +7836,16 @@ expand_builtin___clear_cache (tree exp)
if (!validate_arglist (exp, POINTER_TYPE, POINTER_TYPE, VOID_TYPE))
{
error ("both arguments to %<__builtin___clear_cache%> must be pointers");
return const0_rtx;
return;
}
if (targetm.have_clear_cache ())
{
class expand_operand ops[2];
begin = CALL_EXPR_ARG (exp, 0);
begin_rtx = expand_expr (begin, NULL_RTX, Pmode, EXPAND_NORMAL);
begin = CALL_EXPR_ARG (exp, 0);
begin_rtx = expand_expr (begin, NULL_RTX, Pmode, EXPAND_NORMAL);
end = CALL_EXPR_ARG (exp, 1);
end_rtx = expand_expr (end, NULL_RTX, Pmode, EXPAND_NORMAL);
end = CALL_EXPR_ARG (exp, 1);
end_rtx = expand_expr (end, NULL_RTX, Pmode, EXPAND_NORMAL);
create_address_operand (&ops[0], begin_rtx);
create_address_operand (&ops[1], end_rtx);
if (maybe_expand_insn (targetm.code_for_clear_cache, 2, ops))
return const0_rtx;
}
return const0_rtx;
maybe_emit_call_builtin___clear_cache (begin_rtx, end_rtx);
}
/* Given a trampoline address, make sure it satisfies TRAMPOLINE_ALIGNMENT. */
@ -9507,6 +9535,7 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
&& fcode != BUILT_IN_EXECLE
&& fcode != BUILT_IN_EXECVP
&& fcode != BUILT_IN_EXECVE
&& fcode != BUILT_IN_CLEAR_CACHE
&& !ALLOCA_FUNCTION_CODE_P (fcode)
&& fcode != BUILT_IN_FREE)
return expand_call (exp, target, ignore);
@ -9696,10 +9725,8 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
return expand_builtin_next_arg ();
case BUILT_IN_CLEAR_CACHE:
target = expand_builtin___clear_cache (exp);
if (target)
return target;
break;
expand_builtin___clear_cache (exp);
return const0_rtx;
case BUILT_IN_CLASSIFY_TYPE:
return expand_builtin_classify_type (exp);

@ -128,6 +128,7 @@ extern tree fold_call_expr (location_t, tree, bool);
extern tree fold_builtin_call_array (location_t, tree, tree, int, tree *);
extern bool validate_gimple_arglist (const gcall *, ...);
extern rtx default_expand_builtin (tree, rtx, rtx, machine_mode, int);
extern void maybe_emit_call_builtin___clear_cache (rtx, rtx);
extern bool fold_builtin_next_arg (tree, bool);
extern tree do_mpc_arg2 (tree, tree, tree, int, int (*)(mpc_ptr, mpc_srcptr, mpc_srcptr, mpc_rnd_t));
extern tree fold_call_stmt (gcall *, bool);

@ -11037,10 +11037,10 @@ aarch64_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
/* XXX We should really define a "clear_cache" pattern and use
gen_clear_cache(). */
a_tramp = XEXP (m_tramp, 0);
emit_library_call (gen_rtx_SYMBOL_REF (Pmode, "__clear_cache"),
LCT_NORMAL, VOIDmode, a_tramp, ptr_mode,
plus_constant (ptr_mode, a_tramp, TRAMPOLINE_SIZE),
ptr_mode);
maybe_emit_call_builtin___clear_cache (a_tramp,
plus_constant (ptr_mode,
a_tramp,
TRAMPOLINE_SIZE));
}
static unsigned char

@ -4418,10 +4418,10 @@ arc_initialize_trampoline (rtx tramp, tree fndecl, rtx cxt)
GEN_INT (TRAMPOLINE_SIZE), BLOCK_OP_NORMAL);
emit_move_insn (adjust_address (tramp, SImode, 8), fnaddr);
emit_move_insn (adjust_address (tramp, SImode, 12), cxt);
emit_library_call (gen_rtx_SYMBOL_REF (Pmode, "__clear_cache"),
LCT_NORMAL, VOIDmode, XEXP (tramp, 0), Pmode,
plus_constant (Pmode, XEXP (tramp, 0), TRAMPOLINE_SIZE),
Pmode);
maybe_emit_call_builtin___clear_cache (XEXP (tramp, 0),
plus_constant (Pmode,
XEXP (tramp, 0),
TRAMPOLINE_SIZE));
}
/* Add the given function declaration to emit code in JLI section. */

@ -4170,9 +4170,10 @@ arm_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
}
a_tramp = XEXP (m_tramp, 0);
emit_library_call (gen_rtx_SYMBOL_REF (Pmode, "__clear_cache"),
LCT_NORMAL, VOIDmode, a_tramp, Pmode,
plus_constant (Pmode, a_tramp, TRAMPOLINE_SIZE), Pmode);
maybe_emit_call_builtin___clear_cache (a_tramp,
plus_constant (ptr_mode,
a_tramp,
TRAMPOLINE_SIZE));
}
/* Thumb trampolines should be entered in thumb mode, so set

@ -725,9 +725,10 @@ c6x_initialize_trampoline (rtx tramp, tree fndecl, rtx cxt)
}
#ifdef CLEAR_INSN_CACHE
tramp = XEXP (tramp, 0);
emit_library_call (gen_rtx_SYMBOL_REF (Pmode, "__gnu_clear_cache"),
LCT_NORMAL, VOIDmode, tramp, Pmode,
plus_constant (Pmode, tramp, TRAMPOLINE_SIZE), Pmode);
maybe_emit_call_builtin___clear_cache (tramp,
plus_constant (Pmode,
tramp,
TRAMPOLINE_SIZE));
#endif
}

@ -5917,9 +5917,10 @@ csky_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
emit_move_insn (mem, fnaddr);
a_tramp = XEXP (m_tramp, 0);
emit_library_call (gen_rtx_SYMBOL_REF (Pmode, "__clear_cache"),
LCT_NORMAL, VOIDmode, a_tramp, Pmode,
plus_constant (Pmode, a_tramp, TRAMPOLINE_SIZE), Pmode);
maybe_emit_call_builtin___clear_cache (a_tramp,
plus_constant (Pmode,
a_tramp,
TRAMPOLINE_SIZE));
}

@ -194,10 +194,10 @@ along with GCC; see the file COPYING3. If not see
#undef FINALIZE_TRAMPOLINE
#define FINALIZE_TRAMPOLINE(TRAMP) \
emit_library_call (gen_rtx_SYMBOL_REF (Pmode, "__clear_cache"), \
LCT_NORMAL, VOIDmode, TRAMP, Pmode, \
plus_constant (Pmode, TRAMP, TRAMPOLINE_SIZE), \
Pmode);
maybe_emit_call_builtin___clear_cache ((TRAMP), \
plus_constant (Pmode, \
(TRAMP), \
TRAMPOLINE_SIZE))
/* Clear the instruction cache from `beg' to `end'. This makes an
inline system call to SYS_cacheflush. The arguments are as

@ -5049,9 +5049,7 @@ tilegx_trampoline_init (rtx m_tramp, tree fndecl, rtx static_chain)
end_addr = force_reg (Pmode, plus_constant (Pmode, XEXP (m_tramp, 0),
TRAMPOLINE_SIZE));
emit_library_call (gen_rtx_SYMBOL_REF (Pmode, "__clear_cache"),
LCT_NORMAL, VOIDmode, begin_addr, Pmode,
end_addr, Pmode);
maybe_emit_call_builtin___clear_cache (begin_addr, end_addr);
}

@ -4458,9 +4458,7 @@ tilepro_trampoline_init (rtx m_tramp, tree fndecl, rtx static_chain)
end_addr = force_reg (Pmode, plus_constant (Pmode, XEXP (m_tramp, 0),
TRAMPOLINE_SIZE));
emit_library_call (gen_rtx_SYMBOL_REF (Pmode, "__clear_cache"),
LCT_NORMAL, VOIDmode, begin_addr, Pmode,
end_addr, Pmode);
maybe_emit_call_builtin___clear_cache (begin_addr, end_addr);
}

@ -27,6 +27,9 @@ along with GCC; see the file COPYING3. If not see
#include "diagnostic-core.h"
#include "output.h"
#include "fold-const.h"
#include "rtl.h"
#include "memmodel.h"
#include "optabs.h"
#if !HAVE_INITFINI_ARRAY_SUPPORT
/* Like default_named_section_asm_out_constructor, except that even
@ -169,4 +172,25 @@ vxworks_override_options (void)
if (!global_options_set.x_dwarf_version)
dwarf_version = VXWORKS_DWARF_VERSION_DEFAULT;
}
/* We don't want to use library symbol __clear_cache on SR0640. Avoid
it and issue a direct call to cacheTextUpdate. It takes a size_t
length rather than the END address, so we have to compute it. */
void
vxworks_emit_call_builtin___clear_cache (rtx begin, rtx end)
{
/* STATUS cacheTextUpdate (void *, size_t); */
rtx callee = gen_rtx_SYMBOL_REF (Pmode, "cacheTextUpdate");
enum machine_mode size_mode = TYPE_MODE (sizetype);
rtx len = simplify_gen_binary (MINUS, size_mode, end, begin);
emit_library_call (callee,
LCT_NORMAL, VOIDmode,
begin, ptr_mode,
len, size_mode);
}

@ -282,10 +282,13 @@ extern void vxworks_asm_out_destructor (rtx symbol, int priority);
/* The diab linker does not handle .gnu_attribute sections. */
#undef HAVE_AS_GNU_ATTRIBUTE
/* We provide our own version of __clear_cache in libgcc, using a separate C
file to facilitate #inclusion of VxWorks header files. */
#undef CLEAR_INSN_CACHE
#define CLEAR_INSN_CACHE 1
/* We call vxworks's cacheTextUpdate instead of CLEAR_INSN_CACHE if
needed. We don't want to force a call on targets that don't define
cache-clearing insns nor CLEAR_INSN_CACHE. */
#undef TARGET_EMIT_CALL_BUILTIN___CLEAR_CACHE
#define TARGET_EMIT_CALL_BUILTIN___CLEAR_CACHE \
vxworks_emit_call_builtin___clear_cache
extern void vxworks_emit_call_builtin___clear_cache (rtx begin, rtx end);
/* Default dwarf control values, for non-gdb debuggers that come with
VxWorks. */

@ -5457,11 +5457,24 @@ Note that the block move need only cover the constant parts of the
trampoline. If the target isolates the variable parts of the trampoline
to the end, not all @code{TRAMPOLINE_SIZE} bytes need be copied.
If the target requires any other actions, such as flushing caches or
If the target requires any other actions, such as flushing caches
(possibly calling function maybe_emit_call_builtin___clear_cache) or
enabling stack execution, these actions should be performed after
initializing the trampoline proper.
@end deftypefn
@deftypefn {Target Hook} void TARGET_EMIT_CALL_BUILTIN___CLEAR_CACHE (rtx @var{begin}, rtx @var{end})
On targets that do not define a @code{clear_cache} insn expander,
but that define the @code{CLEAR_CACHE_INSN} macro,
maybe_emit_call_builtin___clear_cache relies on this target hook
to clear an address range in the instruction cache.
The default implementation calls the @code{__clear_cache} builtin,
taking the assembler name from the builtin declaration. Overriding
definitions may call alternate functions, with alternate calling
conventions, or emit alternate RTX to perform the job.
@end deftypefn
@deftypefn {Target Hook} rtx TARGET_TRAMPOLINE_ADJUST_ADDRESS (rtx @var{addr})
This hook should perform any machine-specific adjustment in
the address of the trampoline. Its argument contains the address of the
@ -5490,7 +5503,7 @@ the following macro.
If defined, expands to a C expression clearing the @emph{instruction
cache} in the specified interval. The definition of this macro would
typically be a series of @code{asm} statements. Both @var{beg} and
@var{end} are both pointer expressions.
@var{end} are pointer expressions.
@end defmac
To use a standard subroutine, define the following macro. In addition,

@ -3877,6 +3877,8 @@ is used for aligning trampolines.
@hook TARGET_TRAMPOLINE_INIT
@hook TARGET_EMIT_CALL_BUILTIN___CLEAR_CACHE
@hook TARGET_TRAMPOLINE_ADJUST_ADDRESS
Implementing trampolines is difficult on many machines because they have
@ -3897,7 +3899,7 @@ the following macro.
If defined, expands to a C expression clearing the @emph{instruction
cache} in the specified interval. The definition of this macro would
typically be a series of @code{asm} statements. Both @var{beg} and
@var{end} are both pointer expressions.
@var{end} are pointer expressions.
@end defmac
To use a standard subroutine, define the following macro. In addition,

@ -5166,12 +5166,28 @@ Note that the block move need only cover the constant parts of the\n\
trampoline. If the target isolates the variable parts of the trampoline\n\
to the end, not all @code{TRAMPOLINE_SIZE} bytes need be copied.\n\
\n\
If the target requires any other actions, such as flushing caches or\n\
If the target requires any other actions, such as flushing caches\n\
(possibly calling function maybe_emit_call_builtin___clear_cache) or\n\
enabling stack execution, these actions should be performed after\n\
initializing the trampoline proper.",
void, (rtx m_tramp, tree fndecl, rtx static_chain),
default_trampoline_init)
/* Emit a call to a function to clear the instruction cache. */
DEFHOOK
(emit_call_builtin___clear_cache,
"On targets that do not define a @code{clear_cache} insn expander,\n\
but that define the @code{CLEAR_CACHE_INSN} macro,\n\
maybe_emit_call_builtin___clear_cache relies on this target hook\n\
to clear an address range in the instruction cache.\n\
\n\
The default implementation calls the @code{__clear_cache} builtin,\n\
taking the assembler name from the builtin declaration. Overriding\n\
definitions may call alternate functions, with alternate calling\n\
conventions, or emit alternate RTX to perform the job.",
void, (rtx begin, rtx end),
default_emit_call_builtin___clear_cache)
/* Adjust the address of the trampoline in a target-specific way. */
DEFHOOK
(trampoline_adjust_address,

@ -166,6 +166,7 @@ extern bool default_function_value_regno_p (const unsigned int);
extern rtx default_internal_arg_pointer (void);
extern rtx default_static_chain (const_tree, bool);
extern void default_trampoline_init (rtx, tree, rtx);
extern void default_emit_call_builtin___clear_cache (rtx, rtx);
extern poly_int64 default_return_pops_args (tree, tree, poly_int64);
extern reg_class_t default_ira_change_pseudo_allocno_class (int, reg_class_t,
reg_class_t);

@ -5600,6 +5600,13 @@ is_lang_specific (const_tree t)
#define BUILTIN_VALID_P(FNCODE) \
(IN_RANGE ((int)FNCODE, ((int)BUILT_IN_NONE) + 1, ((int) END_BUILTINS) - 1))
/* Obtain a pointer to the identifier string holding the asm name for
BUILTIN, a BUILT_IN code. This is handy if the target
mangles/overrides the function name that implements the
builtin. */
#define BUILTIN_ASM_NAME_PTR(BUILTIN) \
(IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (builtin_decl_explicit (BUILTIN))))
/* Return the tree node for an explicit standard builtin function or NULL. */
static inline tree
builtin_decl_explicit (enum built_in_function fncode)

@ -4,7 +4,6 @@ LIBGCC2_DEBUG_CFLAGS =
# We provide our own implementation for __clear_cache, using a
# VxWorks specific entry point.
LIB2FUNCS_EXCLUDE += _clear_cache
LIB2ADD += $(srcdir)/config/vxcache.c
# This ensures that the correct target headers are used; some VxWorks
# system headers have names that collide with GCC's internal (host)

@ -4,7 +4,6 @@ LIBGCC2_DEBUG_CFLAGS =
# We provide our own implementation for __clear_cache, using a
# VxWorks specific entry point.
LIB2FUNCS_EXCLUDE += _clear_cache
LIB2ADD += $(srcdir)/config/vxcache.c
# This ensures that the correct target headers are used; some VxWorks
# system headers have names that collide with GCC's internal (host)

@ -1,35 +0,0 @@
/* Copyright (C) 2018-2020 Free Software Foundation, Inc.
Contributed by Alexandre Oliva <oliva@adacore.com>
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 3, or (at your option) any later
version.
GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.
Under Section 7 of GPL version 3, you are granted additional
permissions described in the GCC Runtime Library Exception, version
3.1, as published by the Free Software Foundation.
You should have received a copy of the GNU General Public License and
a copy of the GCC Runtime Library Exception along with this program;
see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
<http://www.gnu.org/licenses/>. */
/* Instruction cache invalidation routine using VxWorks' cacheLib. */
#include <vxWorks.h>
#include <cacheLib.h>
void
__clear_cache (char *beg __attribute__((__unused__)),
char *end __attribute__((__unused__)))
{
cacheTextUpdate (beg, end - beg);
}