So far z16 was identified as arch14. After the machine has been
announced we can now add the real name.
gcc/ChangeLog:
* common/config/s390/s390-common.cc: Rename PF_ARCH14 to PF_Z16.
* config.gcc: Add z16 as march/mtune switch.
* config/s390/driver-native.cc (s390_host_detect_local_cpu):
Recognize z16 with -march=native.
* config/s390/s390-opts.h (enum processor_type): Rename
PROCESSOR_ARCH14 to PROCESSOR_3931_Z16.
* config/s390/s390.cc (PROCESSOR_ARCH14): Rename to ...
(PROCESSOR_3931_Z16): ... throughout the file.
(s390_processor processor_table): Add z16 as cpu string.
* config/s390/s390.h (enum processor_flags): Rename PF_ARCH14 to
PF_Z16.
(TARGET_CPU_ARCH14): Rename to ...
(TARGET_CPU_Z16): ... this.
(TARGET_CPU_ARCH14_P): Rename to ...
(TARGET_CPU_Z16_P): ... this.
(TARGET_ARCH14): Rename to ...
(TARGET_Z16): ... this.
(TARGET_ARCH14_P): Rename to ...
(TARGET_Z16_P): ... this.
* config/s390/s390.md (cpu_facility): Rename arch14 to z16 and
check TARGET_Z16 instead of TARGET_ARCH14.
* config/s390/s390.opt: Add z16 to processor_type.
* doc/invoke.texi: Document z16 and arch14.
The following testcase fails to assemble due to clgte %r6,0(%r1,%r10)
insn not being accepted by assembler.
My rough understanding is that in the RSY-b insn format the spot
in other formats used for index registers is used instead for M3 what
kind of comparison it is, so this patch follows what other similar
instructions use for constraint (i.e. one without index register).
2022-03-07 Jakub Jelinek <jakub@redhat.com>
PR target/104775
* config/s390/s390.md (*cmp_and_trap_unsigned_int<mode>): Use
S constraint instead of T in the last alternative.
* gcc.target/s390/pr104775.c: New test.
This patch changes the costs for a load on condition from 5 to 6 in
order to ensure that we only if-convert two and not three or more SETS like
if (cond)
{
a = b;
c = d;
e = f;
}
In the movqicc expander we emit a paradoxical subreg directly that
combine would otherwise try to create by using a non-optimal sequence
(which would be too expensive).
Also, fix two oversights in ifcvt testcases.
gcc/ChangeLog:
* config/s390/s390.cc (s390_rtx_costs): Increase costs for load
on condition.
* config/s390/s390.md: Use paradoxical subreg.
gcc/testsuite/ChangeLog:
* gcc.target/s390/ifcvt-two-insns-int.c: Fix array size.
* gcc.target/s390/ifcvt-two-insns-long.c: Dito.
For a peephole2 condition variable insn points to the first matched
insn. In order to refer to the second matched insn use
peep2_next_insn(1) instead.
gcc/ChangeLog:
* config/s390/s390.md (define_peephole2): Variable insn points
to the first matched insn. Use peep2_next_insn(1) to refer to
the second matched insn.
gcc/testsuite/ChangeLog:
* gcc.target/s390/20211119.c: New test.
Since a recent enhancement of -Waddress a couple of warnings are emitted
and turned into errors during bootstrap:
gcc/config/s390/s390.md:12087:25: error: the address of 'operands' will never be NULL [-Werror=address]
12087 | "TARGET_HTM && operands != NULL
build/gencondmd.c:59:12: note: 'operands' declared here
59 | extern rtx operands[];
| ^~~~~~~~
Fixed by removing those non-null checks.
gcc/ChangeLog:
* config/s390/s390.md ("*cc_to_int", "tabort", "*tabort_1",
"*tabort_1_plus"): Remove operands non-null check.
The patch gets rid of the unspec used for the vector permute double
immediate instruction and replaces it with generic rtx.
gcc/ChangeLog:
* config/s390/s390.md (UNSPEC_VEC_PERMI): Remove constant
definition.
* config/s390/vector.md (*vpdi1<mode>, *vpdi4<mode>): New pattern
definitions.
* config/s390/vx-builtins.md (*vec_permi<mode>): Emit generic rtx
instead of an unspec.
gcc/testsuite/ChangeLog:
* gcc.target/s390/zvector/vec-permi.c: Removed.
* gcc.target/s390/zvector/vec_permi.c: New test.
This patch gets rid of the unspecs we were using for the vector merge
instruction and replaces it with generic rtx.
gcc/ChangeLog:
* config/s390/s390-modes.def: Add more vector modes to support
concatenation of two vectors.
* config/s390/s390-protos.h (s390_expand_merge_perm_const): Add
prototype.
(s390_expand_merge): Likewise.
* config/s390/s390.c (s390_expand_merge_perm_const): New function.
(s390_expand_merge): New function.
* config/s390/s390.md (UNSPEC_VEC_MERGEH, UNSPEC_VEC_MERGEL):
Remove constant definitions.
* config/s390/vector.md (V_HW_2): Add mode iterators.
(VI_HW_4, V_HW_4): Rename VI_HW_4 to V_HW_4.
(vec_2x_nelts, vec_2x_wide): New mode attributes.
(*vmrhb, *vmrlb, *vmrhh, *vmrlh, *vmrhf, *vmrlf, *vmrhg, *vmrlg):
New pattern definitions.
(vec_widen_umult_lo_<mode>, vec_widen_umult_hi_<mode>)
(vec_widen_smult_lo_<mode>, vec_widen_smult_hi_<mode>)
(vec_unpacks_lo_v4sf, vec_unpacks_hi_v4sf, vec_unpacks_lo_v2df)
(vec_unpacks_hi_v2df): Adjust expanders to emit non-unspec RTX for
vec merge.
* config/s390/vx-builtins.md (V_HW_4): Remove mode iterator. Now
in vector.md.
(vec_mergeh<mode>, vec_mergel<mode>): Use s390_expand_merge to
emit vec merge pattern.
gcc/testsuite/ChangeLog:
* gcc.target/s390/vector/long-double-asm-in-out-hard-fp-reg.c:
Instead of vpdi with 0 and 5 vmrlg and vmrhg are used now.
* gcc.target/s390/vector/long-double-asm-inout-hard-fp-reg.c: Likewise.
* gcc.target/s390/zvector/vec-types.h: New test.
* gcc.target/s390/zvector/vec_merge.c: New test.
This helps with generating code for kernel hotpatches, which contain
individual functions and are loaded more than 2G away from vmlinux.
This should not create performance regressions for the normal use
cases, because for local functions ld replaces @PLT calls with direct
calls.
gcc/ChangeLog:
* config/s390/predicates.md (bras_sym_operand): Accept all
functions in 64-bit mode, use UNSPEC_PLT31.
(larl_operand): Use UNSPEC_PLT31.
* config/s390/s390.c (s390_loadrelative_operand_p): Likewise.
(legitimize_pic_address): Likewise.
(s390_emit_tls_call_insn): Mark __tls_get_offset as function,
use UNSPEC_PLT31.
(s390_delegitimize_address): Use UNSPEC_PLT31.
(s390_output_addr_const_extra): Likewise.
(print_operand): Add @PLT to TLS calls, handle %K.
(s390_function_profiler): Mark __fentry__/_mcount as function,
use %K, use UNSPEC_PLT31.
(s390_output_mi_thunk): Use only UNSPEC_GOT, use %K.
(s390_emit_call): Use UNSPEC_PLT31.
(s390_emit_tpf_eh_return): Mark __tpf_eh_return as function.
* config/s390/s390.md (UNSPEC_PLT31): Rename from UNSPEC_PLT.
(*movdi_64): Use %K.
(reload_base_64): Likewise.
(*sibcall_brc): Likewise.
(*sibcall_brcl): Likewise.
(*sibcall_value_brc): Likewise.
(*sibcall_value_brcl): Likewise.
(*bras): Likewise.
(*brasl): Likewise.
(*bras_r): Likewise.
(*brasl_r): Likewise.
(*bras_tls): Likewise.
(*brasl_tls): Likewise.
(main_base_64): Likewise.
(reload_base_64): Likewise.
(@split_stack_call<mode>): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/ext/visibility/noPLT.C: Skip on s390x.
* g++.target/s390/mi-thunk.C: New test.
* gcc.target/s390/nodatarel-1.c: Move foostatic to the new
tests.
* gcc.target/s390/pr80080-4.c: Allow @PLT suffix.
* gcc.target/s390/risbg-ll-3.c: Likewise.
* gcc.target/s390/call.h: Common code for the new tests.
* gcc.target/s390/call-z10-pic-nodatarel.c: New test.
* gcc.target/s390/call-z10-pic.c: New test.
* gcc.target/s390/call-z10.c: New test.
* gcc.target/s390/call-z9-pic-nodatarel.c: New test.
* gcc.target/s390/call-z9-pic.c: New test.
* gcc.target/s390/call-z9.c: New test.
* gcc.target/s390/mfentry-m64-pic.c: New test.
* gcc.target/s390/tls.h: Common code for the new TLS tests.
* gcc.target/s390/tls-pic.c: New test.
* gcc.target/s390/tls.c: New test.
Since commit dd1ef00c45 ("Fix bug in the define_subst handling that
made match_scratch unusable for multi-alternative patterns.") the
workaround for that bug in *ashrdi3_31<setcc><cconly> is not only no
longer necessary, but actually breaks the build.
Get rid of it by using only one alternative in (match_scratch). It
will be replicated as many times as needed in order to match the
pattern with which (define_subst) is used.
gcc/ChangeLog:
* config/s390/s390.md(*ashrdi3_31<setcc><cconly>): Use a single
constraint.
* config/s390/subst.md(cconly_subst): Use a single constraint
in (match_scratch).
gcc/testsuite/ChangeLog:
* gcc.target/s390/ashr.c: New test.
The probe pattern uses Pmode but the middle-end wants to emit a
word_mode probe check. This - as usual - breaks on Z with -m31
-mzarch were word_mode doesn't match Pmode.
gcc/ChangeLog:
* config/s390/s390.md ("@probe_stack2<mode>"): Change mode
iterator to W.
gcc/testsuite/ChangeLog:
* gcc.target/s390/stack-clash-4.c: New test.
Currently GCC loads large immediates into GPRs from the literal pool,
which is not as efficient as loading two halves with llihf and oilf.
gcc/ChangeLog:
2020-11-30 Ilya Leoshkevich <iii@linux.ibm.com>
* config/s390/s390-protos.h (s390_const_int_pool_entry_p): New
function.
* config/s390/s390.c (s390_const_int_pool_entry_p): New
function.
* config/s390/s390.md: Add define_peephole2 that produces llihf
and oilf.
gcc/testsuite/ChangeLog:
2020-11-30 Ilya Leoshkevich <iii@linux.ibm.com>
* gcc.target/s390/load-imm64-1.c: New test.
* gcc.target/s390/load-imm64-2.c: New test.
This patch puts absolute jump tables into a relocatable read-only section
if they are on ELF target and relocation is supported.
gcc/ChangeLog:
* final.c (final_scan_insn_1): Set jump table relocatable as the
second argument of targetm.asm_out.function_rodata_section.
* output.h (default_function_rodata_section,
default_no_function_rodata_section): Add the second argument to the
declarations.
* target.def (function_rodata_section): Change the doc and add
the second argument.
* doc/tm.texi: Regenerate.
* varasm.c (jumptable_relocatable): Implement.
(default_function_rodata_section): Add the second argument
and the support for relocatable read only sections.
(default_no_function_rodata_section): Add the second argument.
(function_mergeable_rodata_prefix): Set the second argument to false.
* config/mips/mips.c (mips_function_rodata_section): Add the second
arugment and set it to false.
* config/s390/s390.c (targetm.asm_out.function_rodata_section): Set
the second argument to false.
* config/s390/s390.md: Likewise.
On z14+, there are instructions for working with 128-bit floats (long
doubles) in vector registers. It's beneficial to use them instead of
instructions that operate on floating point register pairs, because it
allows to store 4 times more data in registers at a time, relieving
register pressure. The raw performance of the new instructions is
almost the same as that of the new ones.
Implement by storing TFmode values in vector registers on z14+. Since
not all operations are available with the new instructions, keep the
old ones available using the new FPRX2 mode, and convert between it and
TFmode when necessary (this is called "forwarder" expanders below).
Change the existing TFmode expanders to call either new- or old-style
ones depending on whether we are on z14+ or older machines
("dispatcher" expanders).
gcc/ChangeLog:
2020-11-03 Ilya Leoshkevich <iii@linux.ibm.com>
* config/s390/s390-modes.def (FPRX2): New mode.
* config/s390/s390-protos.h (s390_fma_allowed_p): New function.
* config/s390/s390.c (s390_fma_allowed_p): Likewise.
(s390_build_signbit_mask): Support 128-bit masks.
(print_operand): Support printing the second word of a TFmode
operand as vector register.
(constant_modes): Add FPRX2mode.
(s390_class_max_nregs): Return 1 for TFmode on z14+.
(s390_is_fpr128): New function.
(s390_is_vr128): Likewise.
(s390_can_change_mode_class): Use s390_is_fpr128 and
s390_is_vr128 in order to determine whether mode refers to a FPR
pair or to a VR.
(s390_emit_compare): Force TFmode operands into registers on
z14+.
* config/s390/s390.h (HAVE_TF): New macro.
(EXPAND_MOVTF): New macro.
(EXPAND_TF): Likewise.
* config/s390/s390.md (PFPO_OP_TYPE_FPRX2): PFPO_OP_TYPE_TF
alias.
(ALL): Add FPRX2.
(FP_ALL): Add FPRX2 for z14+, restrict TFmode to z13-.
(FP): Likewise.
(FP_ANYTF): New mode iterator.
(BFP): Add FPRX2 for z14+, restrict TFmode to z13-.
(TD_TF): Likewise.
(xde): Add FPRX2.
(nBFP): Likewise.
(nDFP): Likewise.
(DSF): Likewise.
(DFDI): Likewise.
(SFSI): Likewise.
(DF): Likewise.
(SF): Likewise.
(fT0): Likewise.
(bt): Likewise.
(_d): Likewise.
(HALF_TMODE): Likewise.
(tf_fpr): New mode_attr.
(type): New mode_attr.
(*cmp<mode>_ccz_0): Use type instead of mode with fsimp.
(*cmp<mode>_ccs_0_fastmath): Likewise.
(*cmptf_ccs): New pattern for wfcxb.
(*cmptf_ccsfps): New pattern for wfkxb.
(mov<mode>): Rename to mov<mode><tf_fpr>.
(signbit<mode>2): Rename to signbit<mode>2<tf_fpr>.
(isinf<mode>2): Renamed to isinf<mode>2<tf_fpr>.
(*TDC_insn_<mode>): Use type instead of mode with fsimp.
(fixuns_trunc<FP:mode><GPR:mode>2): Rename to
fixuns_trunc<FP:mode><GPR:mode>2<FP:tf_fpr>.
(fix_trunctf<mode>2): Rename to fix_trunctf<mode>2_fpr.
(floatdi<mode>2): Rename to floatdi<mode>2<tf_fpr>, use type
instead of mode with itof.
(floatsi<mode>2): Rename to floatsi<mode>2<tf_fpr>, use type
instead of mode with itof.
(*floatuns<GPR:mode><FP:mode>2): Use type instead of mode for
itof.
(floatuns<GPR:mode><FP:mode>2): Rename to
floatuns<GPR:mode><FP:mode>2<tf_fpr>.
(trunctf<mode>2): Rename to trunctf<mode>2_fpr, use type instead
of mode with fsimp.
(extend<DSF:mode><BFP:mode>2): Rename to
extend<DSF:mode><BFP:mode>2<BFP:tf_fpr>.
(<FPINT:fpint_name><BFP:mode>2): Rename to
<FPINT:fpint_name><BFP:mode>2<BFP:tf_fpr>, use type instead of
mode with fsimp.
(rint<BFP:mode>2): Rename to rint<BFP:mode>2<BFP:tf_fpr>, use
type instead of mode with fsimp.
(<FPINT:fpint_name><DFP:mode>2): Use type instead of mode for
fsimp.
(rint<DFP:mode>2): Likewise.
(trunc<BFP:mode><DFP_ALL:mode>2): Rename to
trunc<BFP:mode><DFP_ALL:mode>2<BFP:tf_fpr>.
(trunc<DFP_ALL:mode><BFP:mode>2): Rename to
trunc<DFP_ALL:mode><BFP:mode>2<BFP:tf_fpr>.
(extend<BFP:mode><DFP_ALL:mode>2): Rename to
extend<BFP:mode><DFP_ALL:mode>2<BFP:tf_fpr>.
(extend<DFP_ALL:mode><BFP:mode>2): Rename to
extend<DFP_ALL:mode><BFP:mode>2<BFP:tf_fpr>.
(add<mode>3): Rename to add<mode>3<tf_fpr>, use type instead of
mode with fsimp.
(*add<mode>3_cc): Use type instead of mode with fsimp.
(*add<mode>3_cconly): Likewise.
(sub<mode>3): Rename to sub<mode>3<tf_fpr>, use type instead of
mode with fsimp.
(*sub<mode>3_cc): Use type instead of mode with fsimp.
(*sub<mode>3_cconly): Likewise.
(mul<mode>3): Rename to mul<mode>3<tf_fpr>, use type instead of
mode with fsimp.
(fma<mode>4): Restrict using s390_fma_allowed_p.
(fms<mode>4): Restrict using s390_fma_allowed_p.
(div<mode>3): Rename to div<mode>3<tf_fpr>, use type instead of
mode with fdiv.
(neg<mode>2): Rename to neg<mode>2<tf_fpr>.
(*neg<mode>2_cc): Use type instead of mode with fsimp.
(*neg<mode>2_cconly): Likewise.
(*neg<mode>2_nocc): Likewise.
(*neg<mode>2): Likeiwse.
(abs<mode>2): Rename to abs<mode>2<tf_fpr>, use type instead of
mode with fdiv.
(*abs<mode>2_cc): Use type instead of mode with fsimp.
(*abs<mode>2_cconly): Likewise.
(*abs<mode>2_nocc): Likewise.
(*abs<mode>2): Likewise.
(*negabs<mode>2_cc): Likewise.
(*negabs<mode>2_cconly): Likewise.
(*negabs<mode>2_nocc): Likewise.
(*negabs<mode>2): Likewise.
(sqrt<mode>2): Rename to sqrt<mode>2<tf_fpr>, use type instead
of mode with fsqrt.
(cbranch<mode>4): Use FP_ANYTF instead of FP.
(copysign<mode>3): Rename to copysign<mode>3<tf_fpr>, use type
instead of mode with fsimp.
* config/s390/s390.opt (flag_vx_long_double_fma): New
undocumented option.
* config/s390/vector.md (V_HW): Add TF for z14+.
(V_HW2): Likewise.
(VFT): Likewise.
(VF_HW): Likewise.
(V_128): Likewise.
(tf_vr): New mode_attr.
(tointvec): Add TF.
(mov<mode>): Rename to mov<mode><tf_vr>.
(movetf): New dispatcher.
(*vec_tf_to_v1tf): Rename to *vec_tf_to_v1tf_fpr, restrict to
z13-.
(*vec_tf_to_v1tf_vr): New pattern for z14+.
(*fprx2_to_tf): Likewise.
(*mov_tf_to_fprx2_0): Likewise.
(*mov_tf_to_fprx2_1): Likewise.
(add<mode>3): Rename to add<mode>3<tf_vr>.
(addtf3): New dispatcher.
(sub<mode>3): Rename to sub<mode>3<tf_vr>.
(subtf3): New dispatcher.
(mul<mode>3): Rename to mul<mode>3<tf_vr>.
(multf3): New dispatcher.
(div<mode>3): Rename to div<mode>3<tf_vr>.
(divtf3): New dispatcher.
(sqrt<mode>2): Rename to sqrt<mode>2<tf_vr>.
(sqrttf2): New dispatcher.
(fma<mode>4): Restrict using s390_fma_allowed_p.
(fms<mode>4): Likewise.
(neg_fma<mode>4): Likewise.
(neg_fms<mode>4): Likewise.
(neg<mode>2): Rename to neg<mode>2<tf_vr>.
(negtf2): New dispatcher.
(abs<mode>2): Rename to abs<mode>2<tf_vr>.
(abstf2): New dispatcher.
(float<mode>tf2_vr): New forwarder.
(float<mode>tf2): New dispatcher.
(floatuns<mode>tf2_vr): New forwarder.
(floatuns<mode>tf2): New dispatcher.
(fix_trunctf<mode>2_vr): New forwarder.
(fix_trunctf<mode>2): New dispatcher.
(fixuns_trunctf<mode>2_vr): New forwarder.
(fixuns_trunctf<mode>2): New dispatcher.
(<FPINT:fpint_name><VF_HW:mode>2<VF_HW:tf_vr>): New pattern.
(<FPINT:fpint_name>tf2): New forwarder.
(rint<mode>2<tf_vr>): New pattern.
(rinttf2): New forwarder.
(*trunctfdf2_vr): New pattern.
(trunctfdf2_vr): New forwarder.
(trunctfdf2): New dispatcher.
(trunctfsf2_vr): New forwarder.
(trunctfsf2): New dispatcher.
(extenddftf2_vr): New pattern.
(extenddftf2): New dispatcher.
(extendsftf2_vr): New forwarder.
(extendsftf2): New dispatcher.
(signbittf2_vr): New forwarder.
(signbittf2): New dispatchers.
(isinftf2_vr): New forwarder.
(isinftf2): New dispatcher.
* config/s390/vx-builtins.md (*vftci<mode>_cconly): Use VF_HW
instead of VECF_HW, add missing constraint, add vw support.
(vftci<mode>_intcconly): Use VF_HW instead of VECF_HW.
(*vftci<mode>): Rename to vftci<mode>, use VF_HW instead of
VECF_HW, and vw support.
(vftci<mode>_intcc): Use VF_HW instead of VECF_HW.
This patch enables a peephole2 optimization which transforms a load of
constant zero into a temporary register which is then finally used to
compare against a floating-point register of interest into a single load
and test instruction. However, the optimization is only applied if both
registers are dead afterwards and if we test for (in)equality only.
This is relaxed in case of fast math.
This is a follow up to PR88856.
gcc/ChangeLog:
* config/s390/s390.md ("*cmp<mode>_ccs_0", "*cmp<mode>_ccz_0",
"*cmp<mode>_ccs_0_fastmath"): Basically change "*cmp<mode>_ccs_0" into
"*cmp<mode>_ccz_0" and for fast math add "*cmp<mode>_ccs_0_fastmath".
gcc/testsuite/ChangeLog:
* gcc.target/s390/load-and-test-fp-1.c: Change test to include all
possible combinations of dead/live registers and comparisons (equality,
relational).
* gcc.target/s390/load-and-test-fp-2.c: Same as load-and-test-fp-1.c
but for fast math.
* gcc.target/s390/load-and-test-fp.h: New test included by
load-and-test-fp-{1,2}.c.
In s390_expand_insv the movstrict patterns are always generated with a
CC clobber although only movstricthi actually needs one. The patch
invokes the expanders instead of constructing the pattern by hand.
Bootstrapped and regression tested on s390x.
gcc/ChangeLog:
PR target/96127
* config/s390/s390.c (s390_expand_insv): Invoke the movstrict
expanders to generate the pattern.
* config/s390/s390.md ("*movstricthi", "*movstrictqi"): Remove the
'*' to have callable expanders.
gcc/testsuite/ChangeLog:
PR target/96127
* gcc.target/s390/pr96127.c: New test.
Probes emitted by the common code routines still use a store. Define
the "probe_stack" pattern to use a compare instead.
gcc/ChangeLog:
2020-05-14 Andreas Krebbel <krebbel@linux.ibm.com>
* config/s390/s390.c (s390_emit_stack_probe): Call the probe_stack
expander.
* config/s390/s390.md ("@probe_stack2<mode>", "probe_stack"): New
expanders.
gcc/testsuite/ChangeLog:
2020-05-14 Andreas Krebbel <krebbel@linux.ibm.com>
* gcc.target/s390/stack-clash-2.c: New test.
When compiling with -mbackchain -fstack-clash-protection currently no
probes are emitted. This patch adjusts the "allocate_stack" expander
to call anti_adjust_stack_and_probe_stack_clash when needed. In order
to do this I had to export that function from explow.c.
Ok for mainline?
gcc/ChangeLog:
2020-05-14 Andreas Krebbel <krebbel@linux.ibm.com>
* config/s390/s390.md ("allocate_stack"): Call
anti_adjust_stack_and_probe_stack_clash when stack clash
protection is enabled.
* explow.c (anti_adjust_stack_and_probe_stack_clash): Remove
prototype. Remove static.
* explow.h (anti_adjust_stack_and_probe_stack_clash): Add
prototype.
gcc/testsuite/ChangeLog:
2020-05-14 Andreas Krebbel <krebbel@linux.ibm.com>
* gcc.target/s390/stack-clash-3.c: New test.
gcc/ChangeLog:
2020-03-06 Andreas Krebbel <krebbel@linux.ibm.com>
* config/s390/s390.md ("tabort"): Get rid of two consecutive
blanks in format string.
The following testcase started to ICE when .POPCOUNT matching has been added
to match.pd; we had __builtin_popcount*, but nothing would use the
popcounthi2 expander before.
The problem is that the popcounthi2_z196 expander doesn't emit valid RTL:
error: unrecognizable insn:
(insn 138 137 139 27 (set (reg:SI 190)
(ashift:SI (reg:HI 95 [ _105 ])
(const_int 8 [0x8]))) -1
(nil))
during RTL pass: vregs
The following patch is an attempt to fix that, furthermore I've tried to
slightly simplify it as well, it makes no sense to me to perform
(x + (x << 8)) >> 8 when we need to either zero extend or mask the result
at the end in order to avoid bits from above HImode to affect it, when we
can do
(x + (x >> 8)) & 0xff (or zero extension).
2020-02-03 Jakub Jelinek <jakub@redhat.com>
PR target/93533
* config/s390/s390.md (popcounthi2_z196): Fix up expander to emit
valid RTL to sum up the lowest and second lowest bytes of the popcnt
result.
* gcc.c-torture/compile/pr93533.c: New test.
* gcc.target/s390/pr93533.c: New test.
The RTXs used to express an overflow condition check in add/sub/mul are
too complex for if conversion. However, there is code in
noce_emit_store_flag which generates a simple CC compare as the base
for using a conditional load. All we have to do is to provide a
pattern to store the truth value of a CC compare into a GPR.
Done with the attached patch.
2019-11-07 Andreas Krebbel <krebbel@linux.ibm.com>
* config/s390/s390.md ("*cstorecc<mode>_z13"): New insn_and_split
pattern.
gcc/testsuite/ChangeLog:
2019-11-07 Andreas Krebbel <krebbel@linux.ibm.com>
* gcc.target/s390/addsub-signed-overflow-1.c: Expect lochi
instructions to be used.
* gcc.target/s390/addsub-signed-overflow-2.c: Likewise.
* gcc.target/s390/mul-signed-overflow-1.c: Likewise.
* gcc.target/s390/mul-signed-overflow-2.c: Likewise.
* gcc.target/s390/vector/vec-scalar-cmp-1.c: Check for 32 and 64
bit variant of lochi. Swap the values for the lochi's.
* gcc.target/s390/zvector/vec-cmp-1.c: Likewise.
From-SVN: r277922
dg-torture.exp=inf-compare-1.c is failing, because (qNaN > +Inf)
comparison is compiled to CDB instruction, which does not signal an
invalid operation exception. KDB should have been used instead.
This patch introduces a new CCmode and a new pattern in order to
generate signaling instructions in this and similar cases.
gcc/ChangeLog:
2019-10-11 Ilya Leoshkevich <iii@linux.ibm.com>
PR target/77918
* config/s390/2827.md: Add new opcodes.
* config/s390/2964.md: Likewise.
* config/s390/3906.md: Likewise.
* config/s390/8561.md: Likewise.
* config/s390/s390-builtins.def (s390_vfchesb): Use
the new vec_cmpgev4sf_quiet_nocc.
(s390_vfchedb): Use the new vec_cmpgev2df_quiet_nocc.
(s390_vfchsb): Use the new vec_cmpgtv4sf_quiet_nocc.
(s390_vfchdb): Use the new vec_cmpgtv2df_quiet_nocc.
(vec_cmplev4sf): Use the new vec_cmplev4sf_quiet_nocc.
(vec_cmplev2df): Use the new vec_cmplev2df_quiet_nocc.
(vec_cmpltv4sf): Use the new vec_cmpltv4sf_quiet_nocc.
(vec_cmpltv2df): Use the new vec_cmpltv2df_quiet_nocc.
* config/s390/s390-modes.def (CCSFPS): New mode.
* config/s390/s390.c (s390_match_ccmode_set): Support CCSFPS.
(s390_select_ccmode): Return CCSFPS for LT, LE, GT, GE and LTGT.
(s390_branch_condition_mask): Reuse CCS for CCSFPS.
(s390_expand_vec_compare): Use non-signaling patterns where
necessary.
(s390_reverse_condition): Support CCSFPS.
* config/s390/s390.md (*cmp<mode>_ccsfps): New pattern.
* config/s390/vector.md: (VFCMP_HW_OP): Remove.
(asm_fcmp_op): Likewise.
(*smaxv2df3_vx): Use pattern for quiet comparison.
(*sminv2df3_vx): Likewise.
(*vec_cmp<VFCMP_HW_OP:code><mode>_nocc): Remove.
(*vec_cmpeq<mode>_quiet_nocc): New pattern.
(vec_cmpgt<mode>_quiet_nocc): Likewise.
(vec_cmplt<mode>_quiet_nocc): New expander.
(vec_cmpge<mode>_quiet_nocc): New pattern.
(vec_cmple<mode>_quiet_nocc): New expander.
(*vec_cmpeq<mode>_signaling_nocc): New pattern.
(*vec_cmpgt<mode>_signaling_nocc): Likewise.
(*vec_cmpgt<mode>_signaling_finite_nocc): Likewise.
(*vec_cmpge<mode>_signaling_nocc): Likewise.
(*vec_cmpge<mode>_signaling_finite_nocc): Likewise.
(vec_cmpungt<mode>): New expander.
(vec_cmpunge<mode>): Likewise.
(vec_cmpuneq<mode>): Use quiet patterns.
(vec_cmpltgt<mode>): Allow only on z14+.
(vec_cmpordered<mode>): Use quiet patterns.
(vec_cmpunordered<mode>): Likewise.
(VEC_CMP_EXPAND): Add ungt and unge.
gcc/testsuite/ChangeLog:
2019-10-11 Ilya Leoshkevich <iii@linux.ibm.com>
* gcc.target/s390/vector/vec-scalar-cmp-1.c: Adjust
expectations.
From-SVN: r276871
So far z15 was identified as arch13. After the machine has been
announced we can now add the real name.
gcc/ChangeLog:
2019-10-10 Andreas Krebbel <krebbel@linux.ibm.com>
* common/config/s390/s390-common.c (PF_ARCH13): Rename to...
(PF_Z15): ... this.
* config.gcc: Add z15 as option for --with-arch and --with-tune
configure switches.
* config/s390/s390-c.c (s390_resolve_overloaded_builtin): Add
error reporting for unsupported builtins.
* config/s390/s390-opts.h (enum processor_type): Rename
PROCESSOR_8561_ARCH13 to PROCESSOR_8561_Z15.
* config/s390/8561.md: Rename arch13 to z15 throughout the file.
* config/s390/driver-native.c (s390_host_detect_local_cpu):
Likewise.
* config/s390/s390-builtins.def: Likewise.
* config/s390/s390.c (processor_table): Add z15 as option and keep arch13 as alternative.
(s390_expand_builtin): Add missing check for unsupported builtins.
(s390_canonicalize_comparison): Rename TARGET_ARCH13 to TARGET_Z15.
(s390_rtx_costs): Likewise.
(s390_get_sched_attrmask): Rename arch13 to z15.
(s390_get_unit_mask): Likewise.
(s390_is_fpd): Likewise.
(s390_is_fxd): Likewise.
* config/s390/s390.h (enum processor_flags): Likewise.
* config/s390/s390.md: Likewise.
* config/s390/vector.md: Likewise.
* config/s390/vx-builtins.md: Likewise.
* config/s390/s390.opt: Add z15 to processor_type value.
From-SVN: r276792
For the call to __morestack we use a special ABI in the S/390 back-end
which requires us to emit a parameter block to the .rodata section.
It contains the label whereto __morestack needs to return. The
parameter block needs to be explicit in RTL since we also need to take
the address of it loaded into r1 in order to pass its address to
__morestack. In order to express correctly what __morestack does its
RTX also contained the return label. Hence we had the return label to
occur twice in the insn stream. This is problematic when it comes to
redirecting edges. The correlation between these two occurrences of
the label cannot be expressed so when doing a redirect only the label
in the jump RTX gets modified while the parameter block label stays as
is.
The patch avoids having two instancs of the label by merging the
parameter block generation and the __morestack call RTX into one. By
doing this I could also get rid of the unspec which was required for
the parameter block generation so far.
gcc/ChangeLog:
2019-10-10 Andreas Krebbel <krebbel@linux.ibm.com>
PR target/91035
* config/s390/s390-protos.h (s390_output_split_stack_data): Add
prototype.
* config/s390/s390.md (UNSPECV_SPLIT_STACK_DATA): Remove.
("split_stack_data", "split_stack_call")
("split_stack_call_<mode>", "split_stack_cond_call")
("split_stack_cond_call_<mode>"): Remove.
("@split_stack_call<mode>", "@split_stack_cond_call<mode>"): New
insn definition.
* config/s390/s390.c (s390_output_split_stack_data): New function.
(s390_expand_split_stack_prologue): Use the merged expander.
From-SVN: r276790
This patch implements the addv, subv, and mulv patterns for signed
integers.
gcc/ChangeLog:
2019-07-24 Andreas Krebbel <krebbel@linux.ibm.com>
* config/s390/predicates.md (addv_const_operand): New predicate.
* config/s390/s390-modes.def (CCO): New condition code mode.
* config/s390/s390.c (s390_match_ccmode_set): Handle E_CCOmode.
(s390_branch_condition_mask): Likewise.
* config/s390/s390.md ("addv<mode>4", "subv<mode>4")
("mulv<mode>4"): New expanders.
("*addv<mode>3_ccoverflow", "*addv<mode>3_ccoverflow_const")
("*subv<mode>3_ccoverflow", "*mulv<mode>3_ccoverflow"): New
pattern definitions.
gcc/testsuite/ChangeLog:
2019-07-24 Andreas Krebbel <krebbel@linux.ibm.com>
* gcc.target/s390/addsub-signed-overflow-1.c: New test.
* gcc.target/s390/addsub-signed-overflow-2.c: New test.
* gcc.target/s390/mul-signed-overflow-1.c: New test.
* gcc.target/s390/mul-signed-overflow-2.c: New test.
From-SVN: r273759
Add s390_valid_shift_count to determine the validity of a
shift-count operand. This is used to replace increasingly
complex substitutions that should have allowed address-style
shift-count handling, an and mask as well as no-op subregs
on the operand.
gcc/ChangeLog:
2019-07-08 Robin Dapp <rdapp@linux.ibm.com>
* config/s390/constraints.md: Add new jsc constraint.
* config/s390/predicates.md: New predicates.
* config/s390/s390-protos.h (s390_valid_shift_count): New function.
* config/s390/s390.c (s390_valid_shift_count): New function.
(print_shift_count_operand): Use s390_valid_shift_count.
(print_operand): Likewise.
* config/s390/s390.md: Use new predicate.
* config/s390/subst.md: Remove addr_style_op and masked_op substs.
* config/s390/vector.md: Use new predicate.
2019-07-08 Robin Dapp <rdapp@linux.ibm.com>
* gcc.target/s390/combine-rotate-modulo.c: New test.
* gcc.target/s390/combine-shift-rotate-add-mod.c: New test.
* gcc.target/s390/vector/combine-shift-vec.c: New test.
From-SVN: r273236
This patch is part of a series that fixes ambiguous attribute
uses in .md files, i.e. cases in which attributes didn't use
<ITER:ATTR> to specify an iterator, and in which <ATTR> could
have different values depending on the iterator chosen.
The vx-builtins.md part changes the choice of <mode> from the
implicit <VFCMP:mode> to an explicit <VF_HW:mode> (i.e. from the
mode of the comparison result to the mode of the operands being
compared). That seemed like the intended behaviour given later
patterns like vec_cmpeq<mode>_cc.
The use of BFP in the s390.md LNDFR pattern looks like a typo,
since the operand to (abs ...) has to have the same mode as the result.
The only effect before this series was to create some extra variants
that would never match, making it harmless apart from very minor code
bloat.
2019-07-06 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/s390/s390.md (*negabs<FP:mode>2_nocc): Use FP for
operand 1.
* config/s390/vx-builtins.md (*vec_cmp<insn_cmp><mode>_cconly):
Make the choice of <mode> explicit, giving...
(*vec_cmp<insn_cmp><VF_HW:mode>_cconly): ...this.
From-SVN: r273162
gcc/ChangeLog:
2019-04-02 Andreas Krebbel <krebbel@linux.ibm.com>
* config/s390/s390-builtin-types.def: New builtin function type
definitions.
* config/s390/s390-builtins.def (s390_vec_search_string_cc)
(s390_vec_search_string_until_zero_cc): New overloaded builtins.
(s390_vstrsb, s390_vstrsh, s390_vstrsf, s390_vstrszb)
(s390_vstrszh, s390_vstrszf): New low-level builtins.
* config/s390/s390.md (UNSPEC_VEC_VSTRS, UNSPEC_VEC_VSTRSCC): New
constant definitions.
* config/s390/vecintrin.h (vec_search_string_cc)
(vec_search_string_until_zero_cc): New builtin name definitions.
* config/s390/vx-builtins.md ("vstrs<mode>", "vstrsz<mode>"): New
expanders.
("vec_vstrs<mode>"): New insn definition.
gcc/testsuite/ChangeLog:
2019-04-02 Andreas Krebbel <krebbel@linux.ibm.com>
* gcc.target/s390/zvector/vec-search-string-cc-1.c: New test.
* gcc.target/s390/zvector/vec-search-string-cc-compile.c: New test.
* gcc.target/s390/zvector/vec-search-string-until-zero-cc-1.c: New test.
* gcc.target/s390/zvector/vec-search-string-until-zero-cc-compile.c: New test.
From-SVN: r270090
gcc/ChangeLog:
2019-04-02 Andreas Krebbel <krebbel@linux.ibm.com>
* config/s390/s390-builtin-types.def: Add new builtin function type.
* config/s390/s390-builtins.def: Add overloaded builtin
s390_vec_reve and low-level builtins for s390_vler and s390_vster.
* config/s390/s390.md (UNSPEC_VEC_ELTSWAP): New constant definition.
* config/s390/vecintrin.h (vec_reve): New builtin name definition.
* config/s390/vx-builtins.md (V_HW_HSD): New mode iterator.
("eltswap<mode>"): New expander.
("*eltswapv16qi", "*eltswap<mode>", "*eltswap<mode>_emu"): New
insn definitions.
gcc/testsuite/ChangeLog:
2019-04-02 Andreas Krebbel <krebbel@linux.ibm.com>
* gcc.target/s390/zvector/vec-reve-load-byte-z14.c: New test.
* gcc.target/s390/zvector/vec-reve-load-byte.c: New test.
* gcc.target/s390/zvector/vec-reve-load-halfword-z14.c: New test.
* gcc.target/s390/zvector/vec-reve-load-halfword.c: New test.
* gcc.target/s390/zvector/vec-reve-store-byte-z14.c: New test.
* gcc.target/s390/zvector/vec-reve-store-byte.c: New test.
From-SVN: r270085
gcc/ChangeLog:
2019-04-02 Andreas Krebbel <krebbel@linux.ibm.com>
* config/s390/s390.md ("xde"): Extend mode attribute to vector
types.
* config/s390/vector.md (VX_VEC_CONV_BFP, VX_VEC_CONV_INT): New
mode iterators.
("floatv2div2df2", "floatunsv2div2df2", "fix_truncv2dfv2di2")
("fixuns_truncv2dfv2di2"): Enhance with mode iterator to also
support 32 bit fp-int conversions. Rename to ...
("float<VX_VEC_CONV_INT:mode><VX_VEC_CONV_BFP:mode>2")
("floatuns<VX_VEC_CONV_INT:mode><VX_VEC_CONV_BFP:mode>2")
("fix_trunc<VX_VEC_CONV_BFP:mode><VX_VEC_CONV_INT:mode>2")
("fixuns_trunc<VX_VEC_CONV_BFP:mode><VX_VEC_CONV_INT:mode>2"):
... to these.
gcc/testsuite/ChangeLog:
2019-04-02 Andreas Krebbel <krebbel@linux.ibm.com>
* gcc.target/s390/arch13/fp-signedint-convert-1.c: New test.
* gcc.target/s390/arch13/fp-unsignedint-convert-1.c: New test.
From-SVN: r270081
Compared to the load on condition instructions we already have the new
select instruction allows to have a THEN and and ELSE source operand -
but only for register to register loads.
gcc/ChangeLog:
2019-04-02 Andreas Krebbel <krebbel@linux.ibm.com>
* config/s390/s390.c (s390_rtx_costs): Do not add extra costs for
if-then-else constructs if we can use the select instruction.
* config/s390/s390.md ("*mov<mode>cc"): Add the new instructions.
gcc/testsuite/ChangeLog:
2019-04-02 Andreas Krebbel <krebbel@linux.ibm.com>
* gcc.target/s390/arch13/sel-1.c: New test.
From-SVN: r270080
variant.
The new arch13 popcount instruction counts bits in the entire 64 bit
register instead of just in 8 bit portions.
gcc/ChangeLog:
2019-04-02 Andreas Krebbel <krebbel@linux.ibm.com>
* config/s390/s390.md ("*popcountdi_arch13_cc")
("*popcountdi_arch13_cconly", "*popcountdi_arch13"): New insn
definition.
("*popcount<mode>", "popcountdi2", "popcountsi2", "popcounthi2"):
Append _z196 to make it ...
("*popcount<mode>_z196", "popcountdi2_z196", "popcountsi2_z196")
("popcounthi2_z196"): ... this.
("popcountdi2_z196"): Remove TARGET_64BIT from the insn condition.
("popcountdi2", "popcountsi2", "popcounthi2"): New expanders.
gcc/testsuite/ChangeLog:
2019-04-02 Andreas Krebbel <krebbel@linux.ibm.com>
* gcc.target/s390/arch13/popcount-1.c: New test.
From-SVN: r270079
Make use of the new bit operation instructions when generating code
for the arch13 level.
gcc/ChangeLog:
2019-04-02 Andreas Krebbel <krebbel@linux.ibm.com>
* config/s390/s390.c (s390_canonicalize_comparison): Convert
certain compares for arch13 in order to make use of the condition
code result produced by the new instructions.
(s390_rtx_costs): Adjust the costs for nnrk, nngrk, nork, nogrk,
nxrk, and nxgrk instruction patterns.
* config/s390/s390.md (ANDOR, bitops_name, inv_bitops_name)
(inv_no): Add new code iterator together with some attributes.
("*andc_split_<mode>"): Disable splitter for arch13.
("*<ANDOR:bitops_name>c<GPR:mode>_cc")
("*<ANDOR:bitops_name>c<GPR:mode>_cconly")
("*<ANDOR:bitops_name>c<GPR:mode>")
("*n<ANDOR:inv_bitops_name><GPR:mode>_cc")
("*n<ANDOR:inv_bitops_name><mode>_cconly")
("*n<ANDOR:inv_bitops_name><mode>", "*nxor<GPR:mode>_cc")
("*nxor<mode>_cconly", "*nxor<mode>"): New insn definitions.
gcc/testsuite/ChangeLog:
2019-04-02 Andreas Krebbel <krebbel@linux.ibm.com>
* gcc.target/s390/arch13/bitops-1.c: New test.
* gcc.target/s390/arch13/bitops-2.c: New test.
* gcc.target/s390/md/andc-splitter-1.c: Add -march=z14 build
option and adjust line numbers.
* gcc.target/s390/md/andc-splitter-2.c: Likewise.
From-SVN: r270078
This patch enables the command line options and provides the proper
macros for checking.
gcc/ChangeLog:
2019-04-02 Andreas Krebbel <krebbel@linux.ibm.com>
* common/config/s390/s390-common.c (processor_flags_table): New
entry for arch13.
* config.gcc: Support arch13 with the --with-arch= configure flag.
* config/s390/driver-native.c (s390_host_detect_local_cpu):
* config/s390/s390-opts.h (enum processor_type): Add PROCESSOR_ARCH13.
* config/s390/s390.c (s390_get_sched_attrmask)
(s390_get_unit_mask): Add PROCESSOR_ARCH13.
* config/s390/s390.h (enum processor_flags): Add PF_VXE2 and PF_ARCH13.
* config/s390/s390.md (TARGET_CPU_ARCH13, TARGET_CPU_ARCH13_P)
(TARGET_CPU_VXE2, TARGET_CPU_VXE2_P, TARGET_ARCH13)
(TARGET_ARCH13_P, TARGET_VXE2, TARGET_VXE2_P): New macro
definitions.
* config/s390/s390.opt: Support arch13 as processor type in
command line options.
gcc/testsuite/ChangeLog:
2019-04-02 Andreas Krebbel <krebbel@linux.ibm.com>
* gcc.target/s390/s390.exp: Run tests in arch13 subdir.
* lib/target-supports.exp (check_effective_target_s390_vxe2): New
runtime check for the vxe2 hardware feature on IBM Z.
From-SVN: r270077