This enables types __int128 et al for move, add, subtract, and logical
operations. At least shift, rotate, multiply, divide, and modulus are broken
so we can expect some test failures.
This is required now because libgomp no longer builds without __int128.
An additional patch will be required to unbreak the libgfortran build.
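As a rough illustration of the operation classes described above as working (move, add, subtract, logical), the following C sketch exercises them on unsigned __int128. It assumes a compiler that provides __int128 (GCC or Clang on a 64-bit target). The helper names are made up for this example and are not part of the patch.

```c
#include <stdint.h>

typedef unsigned __int128 u128;

/* Hypothetical helpers, one per operation class named above.  */
u128 u128_add (u128 a, u128 b) { return a + b; }
u128 u128_sub (u128 a, u128 b) { return a - b; }
u128 u128_xor (u128 a, u128 b) { return a ^ b; }

/* Return nonzero if adding 1 to UINT64_MAX carries into the upper
   64 bits, i.e. the addition really is wider than 64 bits.  */
int
u128_add_carries (void)
{
  u128 x = u128_add ((u128) UINT64_MAX, 1);
  return x > (u128) UINT64_MAX;
}
```

The sketch deliberately avoids shift, rotate, multiply, divide, and modulus, which the description lists as still broken.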
gcc/ChangeLog:
* config/gcn/gcn.c (gcn_scalar_mode_supported_p): New function.
(TARGET_SCALAR_MODE_SUPPORTED_P): New define.
gcc/ChangeLog:
2020-07-24 David Edelsohn <dje.gcc@gmail.com>
Clement Chigot <clement.chigot@atos.net>
* config.gcc (powerpc-ibm-aix7.1): Use t-aix64 and biarch64 for
cpu_is_64bit.
* config/rs6000/aix71.h (ASM_SPEC): Remove aix64 option.
(ASM_SPEC32): New.
(ASM_SPEC64): New.
(ASM_CPU_SPEC): Remove vsx and altivec options.
(CPP_SPEC_COMMON): Rename from CPP_SPEC.
(CPP_SPEC32): New.
(CPP_SPEC64): New.
(CPLUSPLUS_CPP_SPEC): Rename to CPLUSPLUS_CPP_SPEC_COMMON.
(TARGET_DEFAULT): Use 64 bit mask if BIARCH.
(LIB_SPEC_COMMON): Rename from LIB_SPEC.
(LIB_SPEC32): New.
(LIB_SPEC64): New.
(LINK_SPEC_COMMON): Rename from LINK_SPEC.
(LINK_SPEC32): New.
(LINK_SPEC64): New.
(STARTFILE_SPEC): Add 64 bit version of crtcxa and crtdbase.
(ASM_SPEC): Define 32 and 64 bit alternatives using DEFAULT_ARCH64_P.
(CPP_SPEC): Same.
(CPLUSPLUS_CPP_SPEC): Same.
(LIB_SPEC): Same.
(LINK_SPEC): Same.
(SUBTARGET_EXTRA_SPECS): Add new 32/64 specs.
* config/rs6000/aix72.h (TARGET_DEFAULT): Use 64 bit mask if BIARCH.
* config/rs6000/defaultaix64.h: Delete.
The only way to enable or disable Power10 insns (ISA 3.1 insns) should
be via the -mcpu= switch. This patch disables the -mpower10 options the
same way the -mdirect-move switch is neutered already. That is not an
ideal way, but it works, it is not the first, and doing it properly is
more work, and will happen later.
2020-07-24 Segher Boessenkool <segher@kernel.crashing.org>
* config/rs6000/rs6000.opt: Delete -mpower10.
gcc/testsuite/
* gcc.target/powerpc/pr95907.c: New.
There's no reason anyone would want to use the "patchable function"
feature for MMIX and also no reason to exclude those tests. For MMIX,
the NOP equivalent is SWYM ("swymming" is a healthy exercise).
Text-wise, making the tests pass by adjusting the regexp is shorter,
and SWYM seems unlikely both to appear as a mnemonic for other targets
*and* to be emitted in uppercase.
gcc/testsuite:
* c-c++-common/patchable_function_entry-decl.c,
c-c++-common/patchable_function_entry-default.c,
c-c++-common/patchable_function_entry-definition.c: Adjust for mmix.
This test case, extracted from PR 95645, was failing because the
alignment of a local long long variable got lowered from 8 bytes to
4 bytes in the adjust alignment pass, which triggered an assert failure.
This test case passes now because the PR 95237 fix only allows lowering
of the alignment of local variables in the front end. As a result, the
alignment of a local long long variable no longer gets lowered in the
adjust alignment pass.
gcc/testsuite/ChangeLog:
PR target/96192
* c-c++-common/pr96192-1.c: New test.
Offload tests that scan dump files may run multiple times, once per
offload target, but the test result messages do not mention the
offload target, so we may seem to have repeated results. Fixed by
modifying the test name so that it contains the offload target name.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
for gcc/testsuite/ChangeLog
* lib/scanoffload.exp (scoff-testname, scoff-adjust): New.
(scoff): Call them.
Rework intelmic-mkoffload into the new aux and dump file naming
semantics. Obey -save-temps.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
for gcc/ChangeLog
* config/i386/intelmic-mkoffload.c
(generate_target_descr_file): Use dumppfx for save_temps
files. Pass -dumpbase et al down to the compiler.
(generate_target_offloadend_file): Likewise.
(generate_host_descr_file): Likewise.
(prepare_target_image): Likewise. Move out_obj_filename
setting...
(main): ... here. Detect -dumpbase, set dumppfx too.
The initial bug report was that compiling (-c) with -dumpbase ""
-dumpbase-ext .<ext> crashes the driver.
The verification of -dumpbase-ext against -dumpbase doesn't cover the
case in which -dumpbase activates backward-compatibility mode.
I added a test for that, and for -dumpbase-ext without -dumpbase,
trying to make it work in a sensible way, as if applied to the default
-dumpbase for each file. It turned out that this made for too much
complexity in dealing with suffixes derived from input filenames, so I
gave that up and returned to discarding -dumpbase-ext as documented,
ending up with a change identical to that in the original bug report.
I also thought I caught an off-by-one error in the initial
verification, that caused dumpbase_ext to be discarded if it was
identical to the specified dumpbase, but that turned out to be
intentional as well, so I put in comments and a test to reflect it.
Finally, an earlier version of the newly-added tests used "$var.ext"
in an expected output list, which showed me the handling of string
expansion was incorrect. Reworked the expr into an eval to make that
work, and, absent any reliance on post-eval adjustments to so-expanded
output names, I arranged for the adjustments to be skipped after eval.
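The check described above can be sketched in C roughly as follows. This is a hypothetical helper, not the code in gcc.c: it assumes the extension is kept only when the dumpbase ends with it and is strictly longer, so an extension identical to the dumpbase, an empty dumpbase, or a missing dumpbase all lead to the extension being discarded.

```c
#include <stddef.h>
#include <string.h>

/* Return EXT if it is usable as an extension of BASE, else NULL.
   Hypothetical sketch: EXT is kept only when BASE ends with EXT and
   BASE is strictly longer than EXT.  */
const char *
usable_dumpbase_ext (const char *base, const char *ext)
{
  if (!base || !ext)
    return NULL;

  size_t blen = strlen (base);
  size_t elen = strlen (ext);

  /* Covers both an empty BASE and a BASE identical to EXT.  */
  if (blen <= elen)
    return NULL;

  if (strcmp (base + blen - elen, ext) != 0)
    return NULL;

  return ext;
}
```

With this sketch, ("foo.c", ".c") keeps the extension, while (".c", ".c") and ("", ".c") both discard it.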
Co-Authored-By: "Zhanghaijian (A)" <z.zhanghaijian@huawei.com>
for gcc/ChangeLog
PR driver/96230
* gcc.c (process_command): Adjust and document conditions to
reset dumpbase_ext.
for gcc/testsuite/ChangeLog
PR driver/96230
* gcc.misc-tests/outputs.exp: Add tests with -dumpbase-ext,
with identical -dumpbase, with -dumpbase "", and without any
-dumpbase.
(outest): Fix "" expansion in expected outputs, skip
adjustments.
The testglue object file gets interpreted as another input file,
changing the dump and aux output names in GCC unless it is protected
by -Wl, like board file-named extra inputs.
Refactor the code that modifies the board settings so that it can be
used to modify regular variables as well, and do so.
for gcc/testsuite/ChangeLog
PR testsuite/95720
* lib/gcc-defs.exp (gcc_adjust_linker_flags_list): Split out of...
(gcc_adjust_linker_flags): ... this. Protect gluefile and
wrap_flags.
* gcc.misc-tests/outputs.exp: Use gcc_adjust_linker_flags_list.
The switch between FMT_E and FMT_F is based on the absolute value.
Set r = 0 for rounding toward zero and r = 1 otherwise.
If (exp_d - m) == 1 there is no rounding needed.
libgfortran/ChangeLog:
PR fortran/93567
* io/write_float.def (determine_en_precision): Fix switch between
FMT_E and FMT_F.
gcc/testsuite/ChangeLog:
PR fortran/93567
* gfortran.dg/round_3.f08: Add test cases.
The fix is obvious (I have added a comment). The tests are probably
overkill, but they do not hurt.
libgfortran/ChangeLog:
PR fortran/93592
* io/write_float.def (build_float_string): Do not reset
nbefore for FMT_F and FMT_EN.
gcc/testsuite/ChangeLog:
PR fortran/93592
* gfortran.dg/fmt_en.f90: Adjust test.
* gfortran.dg/fmt_en_rd.f90: New test.
* gfortran.dg/fmt_en_rn.f90: New test.
* gfortran.dg/fmt_en_ru.f90: New test.
* gfortran.dg/fmt_en_rz.f90: New test.
We correctly reject this testcase since r11-434, i.e. since the fix for
PR c++/57943.
gcc/testsuite/ChangeLog:
PR c++/81339
* g++.dg/cpp0x/decltype78.C: New test.
..., so that we don't leak this into '*.exp' files running later.
This is relevant after commit efc16503ca10bc0e934e0bace5777500e4dc757a "handle
dumpbase in offloading, adjust testsuite" -- I was confused why in a
(simplified) testing sequence as follows:
default 'libgomp.c/c.exp'
default 'libgomp.oacc-c/c.exp'
'-m32' 'libgomp.c/c.exp'
'-m32' 'libgomp.oacc-c/c.exp'
..., the "'-m32' 'libgomp.c/c.exp'" variant would not execute any offloading
dump scanning. The reason is that the "default 'libgomp.oacc-c/c.exp'" variant
ends with 'offload_target=disable' set, so that's what the "'-m32'
'libgomp.c/c.exp'" variant would then see, in particular
'gcc/testsuite/lib/scanoffload.exp:scoff'.
libgomp/
* testsuite/libgomp.oacc-c++/c++.exp: Unset 'offload_target' after
use.
* testsuite/libgomp.oacc-c/c.exp: Likewise.
* testsuite/libgomp.oacc-fortran/fortran.exp: Likewise.
_ITM_beginTransaction is a 'returns_twice' function that saves x30
on the stack as part of gtm_jmpbuf (that is passed down to
GTM_begin_transaction), but the saved x30 is also used for return.
The return path should be protected so we don't leave an
ldp x29, x30, [sp]
ret
gadget in the code, so x30 is signed on function entry. This
exposes the signed address in the gtm_jmpbuf too. The jmpbuf does
not need a signed address since GTM_longjmp uses
ldp x29, x30, [x1]
br x30
and with BTI there is a BTI j at the _ITM_beginTransaction call site
where this jump returns. Using PAC does not hurt: the gtm_jmpbuf is
internal to libitm and its layout is only used by sjlj.S so the
signed address does not escape. Saving signed x30 into gtm_jmpbuf
provides a bit of extra protection, but more importantly it allows
adding the PAC-RET support without changing the existing code much.
In theory bti and pac-ret protection can be added unconditionally
since the instructions are in the nop space; in practice they
can cause trouble if some tooling does not understand the gnu
property note (e.g. old binutils) or some unwinder or debugger
does not understand the new dwarf op code used for pac-ret (e.g.
old gdb). So the code is written to only support branch-protection
according to the code generation options.
libitm/ChangeLog:
* config/aarch64/sjlj.S: Add conditional pac-ret protection.
This note is not used anywhere currently but it is supposed to mark
objects if the return address is protected with PAC on the stack.
Since lse.S only has leaf functions the return address is never
saved on the stack so we can add the note.
The note is only added if pac-ret is enabled because it can cause
problems with old linkers and we don't have checks for that. This
can be changed later to be unconditional, for now it is consistent
with how gcc generates the notes.
libgcc/ChangeLog:
* config/aarch64/lse.S: Add PAC property note.
Since gcc.target/i386/memcpy-pr95886.c requires 64-bit register, restrict
it to !ia32.
PR middle-end/95886
* gcc.target/i386/memcpy-pr95886.c: Restrict test to !ia32.
AIX-style libraries contain both 32 and 64 bit shared objects.
This patch follows the addition of FAT library support in other gcc
libraries (libgcc, libstdc++, etc).
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/242957
The call to gomp_detach_pointer in gomp_unmap_vars_internal does not
need to force finalization, and doing so may mask mismatched pointer
attachments/detachments. This patch removes the forcing.
2020-07-16 Julian Brown <julian@codesourcery.com>
Thomas Schwinge <thomas@codesourcery.com>
libgomp/
* target.c (gomp_unmap_vars_internal): Remove unnecessary forcing of
finalization for detach operation.
* testsuite/libgomp.oacc-c-c++-common/structured-detach-underflow.c:
New test.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2020-07-23 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR rtl-optimization/96298
* simplify-rtx.c (simplify_binary_operation_1) [XOR]: Xor doesn't
distribute over xor, so (a^b)^(c^b) is not the same as (a^c)^b.
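A small C sketch of the identity behind this entry: in (a^b)^(c^b) the two b terms cancel, leaving a^c, whereas the "distributed" form (a^c)^b keeps one b and differs whenever b is nonzero. The function names are illustrative only.

```c
#include <stdint.h>

/* The expression as written in the source program.  */
uint32_t
xor_original (uint32_t a, uint32_t b, uint32_t c)
{
  return (a ^ b) ^ (c ^ b);
}

/* Valid simplification: the two B terms cancel.  */
uint32_t
xor_simplified (uint32_t a, uint32_t b, uint32_t c)
{
  (void) b;
  return a ^ c;
}

/* The invalid form named in the entry: one B term is left over.  */
uint32_t
xor_wrong (uint32_t a, uint32_t b, uint32_t c)
{
  return (a ^ c) ^ b;
}
```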
Currently this script doesn't set the indentation style for the standard
library headers under libstdc++/ because they lack a file extension.
But they do have a modeline, so the file type is still set appropriately
by Vim. So by inspecting &filetype, we can also detect these standard
library headers as C-like files.
contrib/ChangeLog:
* vimrc (SetStyle): Also inspect &filetype to determine whether
a file is C-like.
This commit adds CUDA_Execute and CUDA_Global to the list of allowed
pragmas. It also implements basic validation of said pragmas.
gcc/ada/
* aspects.ads: Declare CUDA_Global as aspect.
* einfo.ads: Use Flag118 for the Is_CUDA_Kernel flag.
(Set_Is_CUDA_Kernel): New function.
(Is_CUDA_Kernel): New function.
* einfo.adb (Set_Is_CUDA_Kernel): New function.
(Is_CUDA_Kernel): New function.
* par-prag.adb (Prag): Ignore Pragma_CUDA_Execute and
Pragma_CUDA_Global.
* rtsfind.ads: Define CUDA.Driver_Types.Stream_T and
CUDA.Vector_Types.Dim3 entities.
* rtsfind.adb: Define CUDA_Descendant subtype.
(Get_Unit_Name): Handle CUDA_Descendant packages.
* sem_prag.ads: Mark CUDA_Global as aspect-specifying pragma.
* sem_prag.adb (Analyze_Pragma): Validate Pragma_CUDA_Execute and
Pragma_CUDA_Global.
* snames.ads-tmpl: Define Name_CUDA_Execute and Name_CUDA_Global.
Access values should never designate unaliased components.
This new feature is documented in AI12-0027-1.
gcc/ada/
* sem_ch13.ads (Same_Representation): Renamed as
Has_Compatible_Representation because now the order of the arguments
is taken into account; its formals are also renamed as Target_Type
and Operand_Type.
* sem_ch13.adb (Same_Representation): Renamed and moved to place the
routine in alphabetic order.
* sem_attr.adb (Prefix_With_Safe_Accessibility_Level): New subprogram.
(Resolve_Attribute): Check that the prefix of attribute Access
does not have a value conversion of an array type.
* sem_res.adb (Resolve_Actuals): Remove restrictive check on view
conversions which required matching value of Has_Aliased_Components of
formals and actuals.
* exp_ch4.adb (Handle_Changed_Representation): Update call to
Same_Representation.
(Expand_N_Type_Conversion): Update call to Same_Representation.
* exp_ch5.adb (Change_Of_Representation): Update call to
Same_Representation.
* exp_ch6.adb (Add_Call_By_Copy_Code): Update call to
Same_Representation.
(Expand_Actuals): Update call to Same_Representation.
(Expand_Call_Helper): Update call to Same_Representation.
Add the capability to use the Write_* procedures in an environment where
you want to write debugging info but still use them to write to other
files (such as C source files).
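The idea can be sketched in C as a small stack of file descriptors. This is only an analogy to the Ada procedures, under the assumption that pushing saves the current destination without changing it and popping restores it. The names set_output and current_output, the stack depth, and the starting descriptor are all made up for the example.

```c
#include <stdbool.h>

#define FD_STACK_MAX 3          /* Arbitrary small depth for the sketch.  */

static int fd_stack[FD_STACK_MAX];
static int fd_stack_idx = 0;    /* Number of saved descriptors.  */
static int current_fd = 1;      /* Start on standard output.  */

/* Save the current destination; leave it unchanged.  */
bool
push_output (void)
{
  if (fd_stack_idx == FD_STACK_MAX)
    return false;
  fd_stack[fd_stack_idx++] = current_fd;
  return true;
}

/* Restore the most recently saved destination.  */
bool
pop_output (void)
{
  if (fd_stack_idx == 0)
    return false;
  current_fd = fd_stack[--fd_stack_idx];
  return true;
}

/* Hypothetical helpers to redirect and query the destination.  */
void set_output (int fd) { current_fd = fd; }
int current_output (void) { return current_fd; }
```

A caller would push, redirect to the debugging descriptor, write, and then pop to resume writing to the original file.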
gcc/ada/
* output.ads (Push_Output, Pop_Output): New procedures.
* output.adb (FD_Array, FD_Stack, FD_Stack_Idx): New type and vars.
(Push_Output, Pop_Output): New procedures.
This patch is to rename the existing function adjust_vectorization_cost
to rs6000_adjust_vect_cost_per_stmt, to avoid some confusion.
gcc/ChangeLog:
* config/rs6000/rs6000.c (adjust_vectorization_cost): Renamed to ...
(rs6000_adjust_vect_cost_per_stmt): ... here.
(rs6000_add_stmt_cost): Rename adjust_vectorization_cost to
rs6000_adjust_vect_cost_per_stmt.
This patch is to handle vector with length internal functions
IFN_LEN_LOAD and IFN_LEN_STORE in IVOPTS.
gcc/ChangeLog:
* tree-ssa-loop-ivopts.c (get_mem_type_for_internal_fn): Handle
IFN_LEN_LOAD and IFN_LEN_STORE.
(get_alias_ptr_type_for_ptr_address): Likewise.
gcc/fortran/ChangeLog:
* gfortran.texi (Standards): Update URL; state that OpenMP 4.5
is supported and 5.0 is partially supported.
* intrinsic.texi (OpenMP Modules): Refer also to OpenMP 5.0;
(OMP_LIB): Add missing derived type and new named constants.
- Most KASAN functions don't need any porting in the back end,
  except for asan stack protection.
- However, the kernel passes a shadow offset when asan stack protection
  is enabled, so everything in KASAN can work if the shadow offset is given.
- Verified with x86 and risc-v.
- Verified with RISC-V linux kernel.
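For context on why a shadow offset is all that is needed, here is a minimal C sketch of the usual address-to-shadow mapping, assuming the common scale of one shadow byte per 8 bytes of memory. The function name and the offset values in the usage are illustrative and not taken from any target.

```c
#include <stdint.h>

/* Assumed scale: one shadow byte describes 8 bytes of memory.  */
#define SHADOW_SCALE_SHIFT 3

/* Map ADDR to its shadow byte address for a given SHADOW_OFFSET.
   Hypothetical helper for illustration only.  */
uint64_t
mem_to_shadow (uint64_t addr, uint64_t shadow_offset)
{
  return (addr >> SHADOW_SCALE_SHIFT) + shadow_offset;
}
```

Since the offset is the only target-dependent input here, a value supplied on the command line is enough for the mapping to work even when the back end defines no default.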
gcc/ChangeLog:
PR target/96260
* asan.c (asan_shadow_offset_set_p): New.
* asan.h (asan_shadow_offset_set_p): Ditto.
* toplev.c (process_options): Allow -fsanitize=kernel-address
even if TARGET_ASAN_SHADOW_OFFSET is not implemented; only check when
asan stack protection is enabled.
gcc/testsuite/ChangeLog:
PR target/96260
* gcc.target/riscv/pr91441.c: Update warning message.
* gcc.target/riscv/pr96260.c: New.
Another missed attribute-visibility-requirement, causing a failure for
e.g. mmix-knuth-mmixware. Committed as obvious.
gcc/testsuite:
* c-c++-common/builtin-has-attribute-4.c: Require visibility.
LWG recently decided it should be ill-formed to instantiate std::future
and std::shared_future for types that can't be returned from a function.
This adds static assertions to enforce it (std::future already failed,
but this makes the error more understandable).
LWG 3466 extends that to std::promise. The actual constraint is that
t.~T() is well-formed for the primary template, but rejecting arrays and
functions as done for futures matches that condition.
libstdc++-v3/ChangeLog:
* include/std/future (future, shared_future, promise): Add
static assertions to the primary template to reject array and
function types.
* testsuite/30_threads/future/requirements/lwg3458.cc: New test.
* testsuite/30_threads/promise/requirements/lwg3466.cc: New test.
* testsuite/30_threads/shared_future/requirements/lwg3458.cc: New test.
PR96236 shows a problem where we don't store our 512-bit accumulators
correctly in little-endian mode. The patch below detects when we're doing a
little-endian memory access and stores to the correct memory locations.
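A rough C sketch of the kind of fix described, under the assumption that a 512-bit accumulator is stored as four 16-byte vectors and that little-endian mode reverses the order of those vectors in memory. The function and macro names are hypothetical; this is not the code in rs6000-call.c.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define VEC_BYTES 16
#define ACC_VECS 4              /* 4 x 128 bits = 512 bits.  */

/* Copy the four vectors in SRC to DST.  Vector I lands in slot I for
   big-endian and in slot ACC_VECS - 1 - I for little-endian
   (assumed ordering).  */
void
store_acc (uint8_t *dst, const uint8_t *src, bool little_endian)
{
  for (unsigned i = 0; i < ACC_VECS; i++)
    {
      unsigned slot = little_endian ? ACC_VECS - 1 - i : i;
      memcpy (dst + slot * VEC_BYTES, src + i * VEC_BYTES, VEC_BYTES);
    }
}
```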
2020-07-22 Peter Bergner <bergner@linux.ibm.com>
gcc/
PR target/96236
* config/rs6000/rs6000-call.c (rs6000_gimple_fold_mma_builtin): Handle
little-endian memory ordering.
gcc/testsuite/
PR target/96236
* gcc.target/powerpc/mma-double-test.c: Update storing results for
correct little-endian ordering.
* gcc.target/powerpc/mma-single-test.c: Likewise.
We don't need to add CONST_DECLs to a template decl's decl list. Also made the
code flow a bit clearer.
gcc/cp/
* class.c (maybe_add_class_template_decl_list): Don't add CONST_DECLs.
I discovered the dump machinery would get confused by filenames containing '-'.
Fixed thusly.
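One way to avoid that confusion, sketched in C under the assumption that the option text has the form "opt1-opt2=filename": locate the '=' first and only treat '-' as a separator before it. The helper below is hypothetical and much simpler than parse_dump_option.

```c
#include <stddef.h>
#include <string.h>

/* Split SPEC of the assumed form "opt1-opt2=filename".  Store the
   number of '-'-separated options in *N_OPTS and return the filename
   part, or NULL if there is no '='.  Dashes after '=' belong to the
   filename and are never treated as separators.  */
const char *
split_dump_spec (const char *spec, size_t *n_opts)
{
  const char *eq = strchr (spec, '=');
  const char *end = eq ? eq : spec + strlen (spec);
  size_t count = 0;

  for (const char *p = spec; p < end;)
    {
      const char *dash = memchr (p, '-', (size_t) (end - p));
      const char *tok_end = dash ? dash : end;
      if (tok_end > p)
        count++;
      p = dash ? dash + 1 : end;
    }

  *n_opts = count;
  return eq ? eq + 1 : NULL;
}
```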
gcc/
* dumpfile.c (parse_dump_option): Deal with filenames
containing '-'.
I had to debug structural_comptypes, and its complex if conditions and
tail calling of same_type_p made that hard. I'd hope we can turn the
equivalent of return boolean_fn () ? true : false; into a tail call of
the boolean. We also were not dealing with TYPEOF_TYPE.
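The two shapes being compared look like this in a C sketch. The predicate is a stand-in invented for the example; whether a given compiler turns the first form into a tail call is exactly the open question, so nothing here asserts it.

```c
#include <stdbool.h>

/* Stand-in for a boolean helper such as a type comparison.  */
bool
is_even (int x)
{
  return x % 2 == 0;
}

/* Explicit form: easy to set a breakpoint on each outcome.  */
bool
check_wrapped (int x)
{
  return is_even (x) ? true : false;
}

/* Direct form: a plain tail call of the boolean function.  */
bool
check_direct (int x)
{
  return is_even (x);
}
```

Both functions return the same value for every input; they differ only in how convenient they are to step through in a debugger.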
gcc/cp/
* typeck.c (structural_comptypes): [DECLTYPE_TYPE] break
apart complex if.
[UNDERLYING_TYPE]: Use an if.
[TYPEOF_TYPE]: New.
Here are some more places where we can declare variables at the
assignment point, rather than use C89. Also, let's name our variables
by what they contain -- the register allocator is perfectly able to
track liveness for us.
gcc/cp/
* decl.c (decls_match): Move variables into scopes
they're needed in.
(duplicate_decls): Use STRIP_TEMPLATE.
(build_typename_type): Move var decls to their assignments.
(begin_function_body): Likewise.
* decl2.c (get_guard): Likewise.
(mark_used): Use true for truthiness.
* error.c (dump_aggr_type): Hold the decl in a var called
'decl', not 'name'.
I noticed the default capture mode and the discriminator both used
ints. That seems excessive. This shrinks them to 8 bits and 16 bits
respectively. I suppose the discriminator could use the remaining 24
bits of an int allocation unit, if we're worried about more than 64K
lambdas per function. I know, users are strange :) On a 64 bit system
this saves 64 bits, because we also had 32 bits of padding added.
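The effect can be reproduced with a simplified C sketch. The struct layouts below are hypothetical stand-ins, not the real tree_lambda_expr: one pointer and one 32-bit field precede the two members being shrunk. On a typical LP64 ABI the first struct pads out to 24 bytes and the second fits in 16, but the exact sizes are ABI-dependent.

```c
#include <stddef.h>

/* Before: two full ints after a 32-bit field force tail padding on
   targets with 8-byte pointers.  */
struct lambda_before
{
  void *scope;
  unsigned locus;
  int default_capture_mode;
  int discriminator;
};

/* After: 8 + 16 bits share one 32-bit allocation unit.  */
struct lambda_after
{
  void *scope;
  unsigned locus;
  unsigned default_capture_mode : 8;
  unsigned discriminator : 16;
};

/* Bytes saved by the narrower layout on the current ABI.  */
size_t
lambda_bytes_saved (void)
{
  return sizeof (struct lambda_before) - sizeof (struct lambda_after);
}
```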
gcc/cp/
* cp-tree.h (struct tree_lambda_expr): Shrink
default_capture_mode & discriminator.