Commit Graph

299 Commits

Author SHA1 Message Date
H.J. Lu
97ed31ae00 x86: Optimize EVEX vector load/store instructions
When there is no write mask, we can encode lower 16 128-bit/256-bit
EVEX vector register load and store instructions as VEX vector register
load and store instructions with -O1.

gas/

	PR gas/24348
	* config/tc-i386.c (optimize_encoding): Encode 128-bit and
	256-bit EVEX vector register load/store instructions as VEX
	vector register load/store instructions for -O1.
	* doc/c-i386.texi: Update -O1 documentation.
	* testsuite/gas/i386/i386.exp: Run PR gas/24348 tests.
	* testsuite/gas/i386/optimize-1.s: Add tests for EVEX vector
	load/store instructions.
	* testsuite/gas/i386/optimize-2.s: Likewise.
	* testsuite/gas/i386/optimize-3.s: Likewise.
	* testsuite/gas/i386/optimize-5.s: Likewise.
	* testsuite/gas/i386/x86-64-optimize-2.s: Likewise.
	* testsuite/gas/i386/x86-64-optimize-3.s: Likewise.
	* testsuite/gas/i386/x86-64-optimize-4.s: Likewise.
	* testsuite/gas/i386/x86-64-optimize-5.s: Likewise.
	* testsuite/gas/i386/x86-64-optimize-6.s: Likewise.
	* testsuite/gas/i386/optimize-1.d: Updated.
	* testsuite/gas/i386/optimize-2.d: Likewise.
	* testsuite/gas/i386/optimize-3.d: Likewise.
	* testsuite/gas/i386/optimize-4.d: Likewise.
	* testsuite/gas/i386/optimize-5.d: Likewise.
	* testsuite/gas/i386/x86-64-optimize-2.d: Likewise.
	* testsuite/gas/i386/x86-64-optimize-3.d: Likewise.
	* testsuite/gas/i386/x86-64-optimize-4.d: Likewise.
	* testsuite/gas/i386/x86-64-optimize-5.d: Likewise.
	* testsuite/gas/i386/x86-64-optimize-6.d: Likewise.
	* testsuite/gas/i386/optimize-7.d: New file.
	* testsuite/gas/i386/optimize-7.s: Likewise.
	* testsuite/gas/i386/x86-64-optimize-8.d: Likewise.
	* testsuite/gas/i386/x86-64-optimize-8.s: Likewise.

opcodes/

	PR gas/24348
	* i386-opc.tbl: Add Optimize to vmovdqa32, vmovdqa64, vmovdqu8,
	vmovdqu16, vmovdqu32 and vmovdqu64.
	* i386-tbl.h: Regenerated.
2019-03-18 08:58:19 +08:00
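
The condition described in this commit message can be sketched as a small predicate. This is a hedged Python illustration only, not the logic of optimize_encoding in tc-i386.c: the function name, its parameters, and the integer optimization level are assumptions made for the example, and details such as displacement sizing are ignored. The mnemonic list is the one named in the opcodes/ entry; the VEX counterparts are vmovdqa and vmovdqu.

```python
from typing import Optional, Sequence

# EVEX mnemonics listed in the opcodes/ ChangeLog entry, mapped to the
# VEX mnemonic that performs the same unmasked load/store.
EVEX_TO_VEX = {
    "vmovdqa32": "vmovdqa",
    "vmovdqa64": "vmovdqa",
    "vmovdqu8": "vmovdqu",
    "vmovdqu16": "vmovdqu",
    "vmovdqu32": "vmovdqu",
    "vmovdqu64": "vmovdqu",
}


def vex_replacement(mnemonic: str, vector_bits: int,
                    reg_numbers: Sequence[int], has_write_mask: bool,
                    opt_level: int) -> Optional[str]:
    """Return the VEX mnemonic to use instead, or None to keep EVEX.

    Hypothetical helper: all parameter names are assumptions.
    """
    if opt_level < 1:                      # optimization not requested
        return None
    if has_write_mask:                     # masking exists only in EVEX
        return None
    if vector_bits not in (128, 256):      # VEX has no 512-bit form
        return None
    if any(n > 15 for n in reg_numbers):   # VEX reaches only registers 0-15
        return None
    return EVEX_TO_VEX.get(mnemonic)
```

For instance, under these assumptions an unmasked 128-bit vmovdqa32 between xmm1 and xmm2 would qualify, while the same instruction using xmm16 would not.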
Alan Modra
827041555a Update year range in copyright notice of binutils files 2019-01-01 22:06:53 +10:30
Jan Beulich
b50c9f3166 x86: adjust {,E}VEX.W handling for PEXTR* / PINSR*
PEXTR{B,W} and PINSR{B,W}, just like for AVX512BW, are WIG, no matter
that the SDM uses a nonstandard description of that fact.

PEXTRD, even with EVEX.W set, ignores that bit outside of 64-bit mode,
just like its AVX counterpart.
2018-11-06 11:43:55 +01:00
Jan Beulich
931d03b75a x86: adjust {,E}VEX.W handling outside of 64-bit mode
Many VEX-/EVEX-encoded instructions accessing GPRs become WIG outside of
64-bit mode. The respective templates should specify neither VexWIG nor
VexW0, but instead the setting of the bit should be determined from
- REX.W in 64-bit mode,
- the setting established through -mvexwig= / -mevexwig= otherwise.
This implies that the evex-wig2 testcase needs to go away, as being
wrong altogether.

A few test additions desirable here will only happen in later patches,
as the disassembler needs adjustments first.

Once again SSE2AVX templates are left alone, as it is unclear what
the behavior there should be.
2018-11-06 11:42:54 +01:00
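
The rule stated in this commit message, for GPR-accessing instructions whose W bit is ignored outside of 64-bit mode, can be sketched as follows. This is an illustrative Python model, not the assembler's code; the function and parameter names are hypothetical, and wig_option stands for the value given through -mvexwig= or -mevexwig=.

```python
def w_bit_for_gpr_insn(in_64bit_mode: bool, rex_w: bool,
                       wig_option: int) -> int:
    """Pick the VEX.W/EVEX.W bit for a GPR-accessing instruction.

    Hypothetical helper modelling the two cases in the commit message.
    """
    if wig_option not in (0, 1):
        raise ValueError("wig_option must be 0 or 1")
    if in_64bit_mode:
        # In 64-bit mode the bit is meaningful and follows REX.W.
        return 1 if rex_w else 0
    # Outside 64-bit mode the bit is ignored by the CPU, so the
    # command-line setting decides what gets emitted.
    return wig_option
```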
Jan Beulich
fd71a3756e x86: fix various non-LIG templates
Quite a few templates were marked LIG while really the insns aren't.
Introduce descriptive shorthands once again, instead of continuing to
use the less legible original forms.
2018-11-06 11:42:08 +01:00
Jan Beulich
563c7eef61 x86: allow {store} to select alternative {,}PEXTRW encoding
The 0F C5 encoding is indeed a load type one (just that memory operands
are not permitted), while the 0F 3A 15 encoding is obviously a store.
Allow the pseudo prefixes to be used to select between them.

Also move (without any change) the secondary AVX512BW templates next to
the primary one.
2018-11-06 11:40:25 +01:00
Jan Beulich
0aaca1d90a x86: add more VexWIG
Commits 6865c0435a ("x86: Support VEX/EVEX WIG encoding") and 6fa52824c3
("x86: Replace VexW=3 with VexWIG") omitted quite a few templates, oddly
enough in some cases despite testcases getting added (which then were
recorded with wrong expected output).

Also adjust VPMAXUB's attributes in the AVX512BW case to match ordering
of that of neighboring templates.

For the moment SSE2AVX templates are left alone, as it isn't clear
whether they were intentionally left untouched by the original commits
(the descriptions don't say either way).

In this context I question the decision in commit 0375113302 ("x86: Add
-mvexwig=[0|1] option to assembler") to move the logic to determine the
value of the W bit ahead of the decision whether to use 2-byte VEX:
While I can see this as one possible interpretation of -mvexwig=, the
other alternative (setting the value of the bit only if it actually
exists in the encoding) looks as reasonable to me, and perhaps even more
in line with us generally trying to pick the shortest encoding.
2018-11-06 11:39:42 +01:00
Jan Beulich
bbae6b11eb x86: XOP VPHADD* / VPHSUB* are VEX.W0
Also avoid introducing further uses of VexW=1, by introducing and using
VexW0 on this occasion. Move the marker past all #define-s.
2018-11-06 11:38:47 +01:00
Jan Beulich
673fe0f0a7 x86: fold Size{16,32,64} template attributes
Only one of them can be set at a time, which means they can be expressed
by a single 2-bit field instead of three 1-bit ones.
2018-10-10 08:41:52 +02:00
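
The folding argument can be illustrated with a short sketch: three mutually exclusive flags have only four legal states, which fit in two bits. The constant names and their numeric values below are assumptions chosen for the example; only the 2-bit width comes from the commit message.

```python
# Hypothetical encoding of the folded field (values are assumptions).
SIZE_NONE, SIZE16, SIZE32, SIZE64 = 0, 1, 2, 3


def fold_size(size16: bool, size32: bool, size64: bool) -> int:
    """Fold three mutually exclusive 1-bit flags into one 2-bit value."""
    flags = [size16, size32, size64]
    if sum(flags) > 1:
        raise ValueError("at most one of Size16/Size32/Size64 may be set")
    for value, flag in enumerate(flags, start=SIZE16):
        if flag:
            return value
    return SIZE_NONE
```

Every result is below 4, so it fits in a 2-bit field, saving one bit per template compared with three separate flags.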
H.J. Lu
a4e78aa5fe x86: Add Intel ENCLV to assembler and disassembler
gas/

	* testsuite/gas/i386/se1.s: Add enclv.
	* testsuite/gas/i386/x86-64-se1.s: Likewise.
	* testsuite/gas/i386/se1.d: Updated.
	* testsuite/gas/i386/x86-64-se1.d: Likewise.

opcodes/

	* i386-dis.c (rm_table): Add enclv.
	* i386-opc.tbl: Add enclv.
	* i386-tbl.h: Regenerated.
2018-10-05 11:56:42 -07:00
H.J. Lu
04e2a1829e x86: Set EVex=2 on EVEX.128 only vmovd and vmovq
EVEX "VMOVD xmm1, r32/m32", "VMOVD r32/m32, xmm2", "VMOVQ xmm1, r64/m64",
"VMOVQ r64/m64, xmm2", "VMOVQ xmm1, xmm2/m64" and "VMOVQ xmm1/m64, xmm2"
can only be encoded with EVEX.128.  Set EVex=2 on EVEX.128 only vmovd and
vmovq.

gas/

	PR gas/23670
	* testsuite/gas/i386/evex-lig-2.d: New file.
	* testsuite/gas/i386/evex-lig-2.s: Likewise.
	* testsuite/gas/i386/x86-64-evex-lig-2.d: Likewise.
	* testsuite/gas/i386/x86-64-evex-lig-2.s: Likewise.
	* testsuite/gas/i386/i386.exp: Run evex-lig-2 and
	x86-64-evex-lig-2.

opcodes/

	PR gas/23670
	* i386-dis-evex.h (evex_table): Use EVEX_LEN_0F6E_P_2,
	EVEX_LEN_0F7E_P_1, EVEX_LEN_0F7E_P_2 and EVEX_LEN_0FD6_P_2.
	(EVEX_LEN_0F6E_P_2): New EVEX_LEN_TABLE entry.
	(EVEX_LEN_0F7E_P_1): Likewise.
	(EVEX_LEN_0F7E_P_2): Likewise.
	(EVEX_LEN_0FD6_P_2): Likewise.
	* i386-dis.c (USE_EVEX_LEN_TABLE): New.
	(EVEX_LEN_TABLE): Likewise.
	(EVEX_LEN_0F6E_P_2): New enum.
	(EVEX_LEN_0F7E_P_1): Likewise.
	(EVEX_LEN_0F7E_P_2): Likewise.
	(EVEX_LEN_0FD6_P_2): Likewise.
	(evex_len_table): New.
	(get_valid_dis386): Handle USE_EVEX_LEN_TABLE.
	* i386-opc.tbl: Set EVex=2 on EVEX.128 only vmovd and vmovq.
	* i386-tbl.h: Regenerated.
2018-09-17 09:33:35 -07:00
H.J. Lu
d5f787c2bc x86: Set Vex=1 on VEX.128 only vmovd and vmovq
AVX "VMOVD xmm1, r32/m32", "VMOVD r32/m32, xmm2", "VMOVQ xmm1, r64/m64"
and "VMOVQ r64/m64, xmm2" can only be encoded with VEX.128.  Set Vex=1
on VEX.128 only vmovd and vmovq.

gas/

	PR gas/23665
	* testsuite/gas/i386/avx-scalar.s: Remove vmovq and vmovd tests.
	* testsuite/gas/i386/x86-64-avx-scalar.s: Likewise.
	* testsuite/gas/i386/avx-scalar-intel.d: Updated.
	* testsuite/gas/i386/avx-scalar.d: Likewise.
	* testsuite/gas/i386/x86-64-avx-scalar-intel.d: Likewise.
	* testsuite/gas/i386/x86-64-avx-scalar.d: Likewise.
	* testsuite/gas/i386/i386.exp: Run avx-scalar-2 and
	x86-64-avx-scalar-2.
	* testsuite/gas/i386/avx-scalar-2.d: New file.
	* testsuite/gas/i386/avx-scalar-2.s: Likewise.
	* testsuite/gas/i386/x86-64-avx-scalar-2.d: Likewise.
	* testsuite/gas/i386/x86-64-avx-scalar-2.s: Likewise.

opcodes/

	PR gas/23665
	* i386-dis.c (vex_len_table): Update VEX_LEN_0F6E_P_2 and
	VEX_LEN_0F7E_P_2 entries.
	* i386-opc.tbl: Set Vex=1 on VEX.128 only vmovd and vmovq.
	* i386-tbl.h: Regenerated.
2018-09-17 09:31:17 -07:00
H.J. Lu
db4cc66567 x86: Set VexW=3 on AVX vrsqrtss
AVX vrsqrtss is a VEX WIG instruction.

	* i386-opc.tbl: Set VexW=3 on AVX vrsqrtss.
	* i386-tbl.h: Regenerated.
2018-09-15 17:10:17 -07:00
H.J. Lu
3c3741435f x86: Set Vex=1 on VEX.128 only vmovq
AVX "VMOVQ xmm1, xmm2/m64" and "VMOVQ xmm1/m64, xmm2" can only be
encoded with VEX.128.  Set Vex=1 on VEX.128 only vmovq and update
assembler tests.

gas/

	PR gas/23665
	* testsuite/gas/i386/avx-scalar-intel.d: Updated.
	* testsuite/gas/i386/avx-scalar.d: Likewise.
	* testsuite/gas/i386/x86-64-avx-scalar-intel.d: Likewise.
	* testsuite/gas/i386/x86-64-avx-scalar.d: Likewise.

opcodes/

	PR gas/23665
	* i386-dis.c (vex_len_table): Update VEX_LEN_0F7E_P_1 and
	VEX_LEN_0FD6_P_2 entries.
	* i386-opc.tbl: Set Vex=1 on VEX.128 only vmovq.
	* i386-tbl.h: Regenerated.
2018-09-15 14:50:40 -07:00
H.J. Lu
6865c0435a x86: Support VEX/EVEX WIG encoding
Add VEXWIG, defined as 3, to indicate that the VEX.W/EVEX.W bit is
ignored by such VEX/EVEX instructions, aka WIG instructions.  Set
VexW=3 on VEX/EVEX WIG instructions.  Update assembler to check
VEXWIG when setting the VEX.W bit.

gas/

	PR gas/23642
	* config/tc-i386.c (build_vex_prefix): Check VEXWIG when setting
	the VEX.W bit.
	(build_evex_prefix): Check VEXWIG when setting the EVEX.W bit.

opcodes/

	PR gas/23642
	* i386-opc.h (VEXWIG): New.
	* i386-opc.tbl: Set VexW=3 on VEX/EVEX WIG instructions.
	* i386-tbl.h: Regenerated.
2018-09-14 12:20:10 -07:00
Jan Beulich
556059dd13 x86: fold CRC32 templates
Just like other insns having byte and word forms, these can also make
use of the W modifier, which at the same time allows simplifying some
other code a little bit.
2018-09-14 11:21:15 +02:00
H.J. Lu
5be12fc1ad x86: Remove VexW=1 from WIG VEX movq and vmovq
Put back changes lost in commit 41d1ab6a6d.
2018-09-13 07:38:45 -07:00
H.J. Lu
41d1ab6a6d i386: Update VexW field for VEX instructions
1. Mark VEX.W0 VEX instructions with VexW=1.
2. Mark VEX.W1 VEX instructions with VexW=2.
3. Remove VexW=1 from WIG VEX instructions.

	* i386-opc.tbl: Add VexW=1 to VEX.W0 VEX movd, cvtsi2ss, cvtsi2sd,
	pextrd, pinsrd, vcvtsi2sd, vcvtsi2ss, vmovd, vpextrd and vpinsrd.
	Add VexW=2 to VEX.W1 VEX movd, movq, pextrq, pinsrq, vmovd, vmovq,
	vpextrq and vpinsrq.  Remove VexW=1 from WIG VEX movq and vmovq.
	* i386-tbl.h: Regenerated.
2018-09-13 06:21:19 -07:00
Jan Beulich
57f6375ec1 x86: drop bogus IgnoreSize from a few further insns 2018-09-13 11:26:06 +02:00
Jan Beulich
2589a7e59b x86: drop bogus IgnoreSize from AVX512_4* insns 2018-09-13 11:25:30 +02:00
Jan Beulich
a760eb41aa x86: drop bogus IgnoreSize from AVX512DQ insns 2018-09-13 11:24:53 +02:00
Jan Beulich
e90426589d x86: drop bogus IgnoreSize from AVX512BW insns 2018-09-13 11:24:23 +02:00
Jan Beulich
9caa306f80 x86: drop bogus IgnoreSize from AVX512VL insns 2018-09-13 11:23:50 +02:00
Jan Beulich
fb6ce599e0 x86: drop bogus IgnoreSize from AVX512ER insns 2018-09-13 11:23:17 +02:00
Jan Beulich
6a8da88669 x86: drop bogus IgnoreSize from AVX512F insns 2018-09-13 11:22:49 +02:00
Jan Beulich
c7f279191f x86: drop bogus IgnoreSize from SHA insns 2018-09-13 11:22:03 +02:00
Jan Beulich
0f407ee9f4 x86: drop bogus IgnoreSize from XOP and SSE4a insns 2018-09-13 11:21:36 +02:00
Jan Beulich
2fbbbee5e7 x86: drop bogus IgnoreSize from AVX2 insns 2018-09-13 11:19:21 +02:00
Jan Beulich
2b02b9a2ab x86: drop bogus IgnoreSize from AVX insns 2018-09-13 11:18:52 +02:00
Jan Beulich
963c68aa4a x86: drop bogus IgnoreSize from GFNI insns 2018-09-13 11:16:49 +02:00
Jan Beulich
64e025c3a1 x86: drop bogus IgnoreSize from PCLMUL/VPCLMUL insns 2018-09-13 11:16:19 +02:00
Jan Beulich
47603f888d x86: drop bogus IgnoreSize from AES/VAES insns 2018-09-13 11:15:38 +02:00
Jan Beulich
0001cfd00c x86: drop bogus IgnoreSize from SSE4.2 insns 2018-09-13 11:15:01 +02:00
Jan Beulich
be4b452e28 x86: drop bogus IgnoreSize from SSE4.1 insns 2018-09-13 11:14:32 +02:00
Jan Beulich
d09a13943b x86: drop bogus IgnoreSize from SSSE3 insns 2018-09-13 11:13:46 +02:00
Jan Beulich
07599e13ac x86: drop bogus IgnoreSize from SSE3 insns 2018-09-13 11:12:23 +02:00
Jan Beulich
1ee3e48715 x86: drop bogus IgnoreSize from SSE2 insns 2018-09-13 11:11:55 +02:00
Jan Beulich
a5f580e51a x86: drop bogus IgnoreSize from SSE insns 2018-09-13 11:11:26 +02:00
Jan Beulich
49d5d12d0e x86: drop unnecessary {,No}Rex64 2018-09-13 11:08:37 +02:00
Jan Beulich
f5eb1d70fb x86: also allow D on 3-operand insns
For now this is just for VMOVS{D,S}.
2018-09-13 11:07:55 +02:00
Jan Beulich
dbbc8b7e62 x86: use D attribute also for SIMD templates
Various moves come in load and store forms, and just like on the GPR
and FPU sides there should ideally be only one pattern. In some cases this
is not feasible because the opcodes are too different, but quite a few
cases follow a similar standard scheme. Introduce Opcode_SIMD_FloatD and
Opcode_SIMD_IntD, generalize handling in operand_size_match() (reverse
operand handling there simply needs to match "straight" operand one),
and fix a long standing, but so far only latent bug with when to zap
found_reverse_match.

Also once again drop IgnoreSize where pointlessly applied to templates
touched anyway as well as *word when redundant with Reg*.
2018-09-13 11:07:07 +02:00
H.J. Lu
d871f3f483 x86: Add CpuCMOV and CpuFXSR
There are separate CPUID feature bits for fxsave/fxrstor and cmovCC
instructions.  This patch adds CpuCMOV and CpuFXSR to replace Cpu686
on corresponding instructions.

gas/

	* config/tc-i386.c (cpu_arch): Add .cmov and .fxsr.
	(cpu_noarch): Add nocmov and nofxsr.
	* doc/c-i386.texi: Document cmov and fxsr.

opcodes/

	* i386-gen.c (cpu_flag_init): Add CpuCMOV and CpuFXSR to
	CPU_I686_FLAGS.  Add CPU_CMOV_FLAGS, CPU_FXSR_FLAGS,
	CPU_ANY_CMOV_FLAGS and CPU_ANY_FXSR_FLAGS.
	(cpu_flags): Add CpuCMOV and CpuFXSR.
	* i386-opc.tbl: Replace Cpu686 with CpuFXSR on fxsave, fxsave64,
	fxrstor and fxrstor64.  Replace Cpu686 with CpuCMOV on cmovCC.
	* i386-init.h: Regenerated.
	* i386-tbl.h: Likewise.
2018-08-11 14:37:32 -07:00
Jan Beulich
e968fc9b63 x86: fold RegEip/RegRip and RegEiz/RegRiz
This allows simplifying the code in a number of places.
2018-08-06 08:34:36 +02:00
Jan Beulich
dbf8be89ed x86: drop NoRex64 from {,v}pmov{s,z}x*
They're pointless with IgnoreSize also specified, and even more so when
no Qword operand exists.
2018-08-03 09:30:58 +02:00
Jan Beulich
c48dadc9a8 x86: drop "mem" operand type attribute
No template specifies this bit, so there's no point recording it in the
templates. Use a flags[] bit instead.
2018-08-03 09:30:02 +02:00
Jan Beulich
1424ad8677 x86: also optimize KXOR{D,Q} and KANDN{D,Q}
These can be converted to 2-byte VEX encoding when both source registers
are the same, by using KXORW / KANDNW as replacement.
2018-07-31 10:58:05 +02:00
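
The replacement is safe because XOR of a register with itself, and AND-NOT of a register with itself, both yield zero at any width, so the word form produces the same all-zero mask. A hedged Python sketch of the rewrite follows; the function name and the string-based operand representation are assumptions for illustration and do not reflect the assembler's data structures.

```python
# Wider mask operations and the word form that can replace them when
# both sources are the same register.
NARROW_FORM = {
    "kxord": "kxorw",
    "kxorq": "kxorw",
    "kandnd": "kandnw",
    "kandnq": "kandnw",
}


def optimize_mask_op(mnemonic: str, src1: str, src2: str) -> str:
    """Return the mnemonic to encode (hypothetical helper)."""
    if src1 == src2:
        # Result is zero regardless of operand width, so the shorter
        # word-sized encoding gives the same outcome.
        return NARROW_FORM.get(mnemonic, mnemonic)
    return mnemonic
```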
Jan Beulich
ae2387feae x86: fold various AVX512 templates with so far differing Masking attributes
There's no insn allowing ZEROING_MASKING alone. Re-purpose its value for
handling the not uncommon case of insns allowing either form of masking
with register operands, but only merging masking with a memory operand.
2018-07-31 10:57:09 +02:00
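
One way to picture the re-purposed attribute value is as a four-state field checked against the requested masking form. The sketch below is an assumption-laden Python model: the constant names and numeric values are hypothetical (the commit message only says that the value formerly meaning "zeroing only" is reused), and the check is not the assembler's actual code.

```python
# Hypothetical attribute values for illustration.
NO_MASKING = 0
MERGING_MASKING = 1
REG_BOTH_MEM_MERGING = 2   # assumed name for the re-purposed value
BOTH_MASKING = 3


def masking_permitted(attr: int, wants_zeroing: bool,
                      has_memory_operand: bool) -> bool:
    """Check whether the requested masking form fits the template."""
    if attr == NO_MASKING:
        return False
    if not wants_zeroing:
        return True            # every masking-capable template allows merging
    if attr == BOTH_MASKING:
        return True
    if attr == REG_BOTH_MEM_MERGING:
        # Zeroing is accepted only when all operands are registers.
        return not has_memory_operand
    return False               # MERGING_MASKING: zeroing is rejected
```

With a single value covering the "registers: either form, memory: merging only" case, two templates that previously differed only in their Masking attribute can be expressed as one.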
Jan Beulich
6ff00b5e12 x86/Intel: correct permitted operand sizes for AVX512 scatter/gather
AVX gather insns correctly allow the element size to be specified rather
than the full vector size. Make AVX512 ones match.
2018-07-31 10:55:17 +02:00
Jan Beulich
e951d5ca3d x86: drop CpuVREX
It is fully redundant with CpuAVX512F.
2018-07-31 10:52:37 +02:00
H.J. Lu
4a1b91eabb x86: Expand Broadcast to 3 bits
Expand Broadcast to 3 bits so that the number of bytes to broadcast
can be computed as 1 << (Broadcast - 1).  Use it to simplify x86
assembler.

gas/

	* config/tc-i386.c (Broadcast_Operation): Add bytes.
	(build_evex_prefix): Use i.broadcast->bytes.
	(match_broadcast_size): New function.
	(check_VecOperands): Use the broadcast field to compute the
	number of bytes to broadcast directly.  Set i.broadcast->bytes.
	Use match_broadcast_size.

opcodes/

	* i386-gen.c (adjust_broadcast_modifier): New function.
	(process_i386_opcode_modifier): Add an argument for operands.
	Adjust the Broadcast value based on operands.
	(output_i386_opcode): Pass operand_types to
	process_i386_opcode_modifier.
	(process_i386_opcodes): Pass NULL as operands to
	process_i386_opcode_modifier.
	* i386-opc.h (BYTE_BROADCAST): New.
	(WORD_BROADCAST): Likewise.
	(DWORD_BROADCAST): Likewise.
	(QWORD_BROADCAST): Likewise.
	(i386_opcode_modifier): Expand broadcast to 3 bits.
	* i386-tbl.h: Regenerated.
2018-07-25 15:28:24 -07:00
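
The formula in this commit message, 1 << (Broadcast - 1), can be checked with a small sketch. The numeric values assigned to the four constants below are assumptions, chosen so that the formula yields the element size each name suggests; the function name is hypothetical.

```python
# Assumed values: consistent with the formula, not taken from i386-opc.h.
BYTE_BROADCAST = 1
WORD_BROADCAST = 2
DWORD_BROADCAST = 3
QWORD_BROADCAST = 4


def broadcast_bytes(broadcast: int) -> int:
    """Number of bytes to broadcast, per 1 << (Broadcast - 1)."""
    if not BYTE_BROADCAST <= broadcast <= QWORD_BROADCAST:
        raise ValueError("not a broadcast value")
    return 1 << (broadcast - 1)
```

Under these assumed values the largest constant is 4, which needs three bits to store, matching the widened field.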